steven0461's comment notwithstanding, I can take a guess at what the robot actually wants. I think it wants to take the action that will minimize the number of blue cells existing in the world, according to the robot’s current model of the world. That rule for choosing actions probably doesn’t correspond to any coherent utility function over the real world, but that’s not really a surprise.
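To make that guess concrete, here is a rough sketch of the decision rule I have in mind. Everything in it is my own assumption for illustration: the grid-of-color-names model, the `predict` function, and the "zap a cell" actions are hypothetical stand-ins, not anything specified about the actual robot.

```python
from typing import Callable, Iterable, List, Optional, Tuple

# Hypothetical representation: the robot's world model is a grid of color names.
Grid = List[List[str]]
# Hypothetical action: None means "do nothing", (row, col) means "zap that cell".
Action = Optional[Tuple[int, int]]


def count_blue(grid: Grid) -> int:
    """Number of cells the model labels as blue."""
    return sum(cell == "blue" for row in grid for cell in row)


def predict(model: Grid, action: Action) -> Grid:
    """Assumed dynamics: zapping a cell turns it gray in the model's prediction."""
    new = [row[:] for row in model]  # copy so the current model is not mutated
    if action is not None:
        r, c = action
        new[r][c] = "gray"
    return new


def choose_action(
    model: Grid,
    actions: Iterable[Action],
    predict_fn: Callable[[Grid, Action], Grid],
) -> Action:
    """Pick the action whose predicted outcome, under the current model,
    contains the fewest blue cells."""
    return min(actions, key=lambda a: count_blue(predict_fn(model, a)))


model = [["blue", "red"], ["green", "blue"]]
actions = [None, (0, 1), (1, 1)]
best = choose_action(model, actions, predict)
```

Note that the score is computed entirely inside `model` and `predict`. Nothing in the rule refers to the real world, which is why swapping in a better model changes what is being minimized rather than just how well it is minimized.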
The interesting question that you probably meant to ask is whether the robot’s utility function over its model of the world can be converted to a utility function over the real world. But the robot won’t agree to any such upgrade, so the question is kinda moot.
That might sound hopeless for CEV, but fortunately humans aren’t consequentialists with a fixed model of the world. Instead they seem to be motivated by pleasure and pain, which you can’t disprove out of existence by coming up with a better model. So maybe there’s hope in that direction.