but their position relies on physical facts about the world, along with a narrow definition of the correction-of-value event. To get around that, we’d need to define the operator properly.
I feel like this is a very generic problem with safety.
If you can’t specify to an algorithm what part of the real world you’re talking about (at least to a reasonably good approximation), then it is very hard to make progress.
Perhaps the simplest way to specify “who the operator is” is to define the abstract notion of creation-of-subagents, and tell the AI that the “operator” is the agent that created it.
The abstract notion of creation-of-subagents seems quite robust once you have a system that can represent and reason about that notion; it certainly seems like if you created an agent which, from the get-go, understood and cared about “the agent that created it”, you would rule out entire classes of paperclipper-like AIs.
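For concreteness, here is a minimal sketch of how “the operator is the agent that created it” could be cashed out, assuming we already had a system that represents creation-of-subagents as an explicit event (all class and method names below are hypothetical, not anyone’s actual proposal): the operator is resolved by creation provenance, not by a physical description of the world.

```python
# Hypothetical sketch: identify "the operator" by creation provenance.
# Agent, spawn_subagent, operator, root_operator are illustrative names.

from __future__ import annotations
from typing import Optional


class Agent:
    def __init__(self, name: str, creator: Optional["Agent"] = None):
        self.name = name
        # Record the creating agent at construction time; this is the
        # "agent that created it" that the AI is told to treat as its operator.
        self.creator = creator

    def spawn_subagent(self, name: str) -> "Agent":
        # Creation-of-subagents as an explicit, abstract event: the new
        # agent's provenance is fixed the moment it is built.
        return Agent(name, creator=self)

    def operator(self) -> "Agent":
        # Literal reading of the proposal: the operator is simply the creator.
        return self if self.creator is None else self.creator

    def root_operator(self) -> "Agent":
        # Alternative reading: follow the creation chain back to the original
        # creator, so subagents inherit the same operator as their parent.
        agent = self
        while agent.creator is not None:
            agent = agent.creator
        return agent


human = Agent("human")
ai = human.spawn_subagent("ai")
helper = ai.spawn_subagent("helper")

print(ai.operator().name)           # human
print(helper.operator().name)       # ai
print(helper.root_operator().name)  # human
```

One design question the sketch makes visible is whether a subagent’s operator should be its immediate creator or the original creator at the root of the chain; any real formalization of creation-of-subagents would have to settle that.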
“a physical event” seems much easier to define than “an operator” or “a subagent”.