An intriguing and neglected direction for control proposal research concerns endogenous control—i.e., self-control.
Agree. To frame this in paradigm-language: most of the discussion on this forum, both arguments about AI/AGI dangers and plans that consider possible solutions, uses paradigm A:
Paradigm A: We treat the AGI as a spherical econ with an unknown and opaque internal structure, which was set up to maximise a reward function/reward signal.
But there is also
Paradigm B: We treat the AGI as a computer program with an internal motivation and structure that we can fully control, because we are the ones writing it.
This second paradigm leads to AGI safety research like my Creating AGI Safety Interlocks or the work by Cohen et al. here.
Most ‘mainstream’ ML researchers, and definitely most robotics researchers, are working under paradigm B. This explains some of the disconnect between this forum and mainstream theoretical and applied AI/AI safety research.
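To make the paradigm-B stance concrete, here is a minimal Python sketch. This is my own illustration, not code from the Safety Interlocks paper or from Cohen et al.; all names in it (`Action`, `utility`, `interlock`, `choose`) are hypothetical. The point it shows: under paradigm B the agent's motivation is an ordinary function we wrote, and a designer-imposed interlock is part of the same program, so its veto holds by construction rather than by arguing with an opaque reward maximiser.

```python
# Paradigm-B sketch: motivation and safety interlock are both code we wrote
# and can inspect, rather than properties of an opaque learned system.

from dataclasses import dataclass
from typing import Callable, Iterable

@dataclass
class Action:
    name: str
    predicted_utility: float   # produced by a world model we also wrote
    disables_off_switch: bool  # a property the planner can inspect directly

def utility(action: Action) -> float:
    """The agent's motivation: explicit, readable, and editable by us."""
    return action.predicted_utility

def interlock(action: Action) -> bool:
    """Hypothetical safety interlock: veto any plan that disables the off switch."""
    return not action.disables_off_switch

def choose(actions: Iterable[Action],
           utility: Callable[[Action], float],
           interlock: Callable[[Action], bool]) -> Action:
    """Pick the highest-utility action among those the interlock permits."""
    permitted = [a for a in actions if interlock(a)]
    return max(permitted, key=utility)

if __name__ == "__main__":
    options = [
        Action("make paperclips", 10.0, disables_off_switch=False),
        Action("weld the off switch shut, then make paperclips", 12.0,
               disables_off_switch=True),
    ]
    # Prints "make paperclips": the higher-utility plan is vetoed because
    # the selection rule itself is code we control.
    print(choose(options, utility, interlock).name)
```

Under paradigm A this kind of move is unavailable: the selection rule lives inside the opaque maximiser, so the off-switch problem has to be attacked from the outside instead.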