you’re training it to understand the whole distribution of human performance on this task, and then selecting a policy conditional on good performance
Yeah, that makes sense to me.
it’s trying to predict good navigations of the obstacle course, and it models that as a process that picks actions based on their modeled impact on the real world, and in order to do that modeling, it actually runs the computations.
I can see why it would run a simulation of what would happen if a robot walked an obstacle course. I don’t see why it would actually walk the robot through it if not asked.
So, this is an argument about generalization properties. Which means it’s kind of the opposite of the thing you asked for :P
That is, it’s not about this AI doing its intended job even when you don’t turn it on. It’s about the AI doing something other than its intended job when you do turn it on.
That is… the claim is that you might put the AI in a new situation and have it behave badly (e.g. the robot punching through walls to complete the obstacle course faster, if you put it in a new environment where it’s able to punch through walls) in a way that looks like goal-directed behavior, even if you tried not to give it any goals, or were just trying to have it mimic humans.