With two differences: CEV tries to correct any mistakes in the initial formulation of the wish (aiming for an attractor), and it doesn’t force the designers to specify details like whether making bacteria is ok or not-ok.
It’s the difference between painting a painting of a specific scene, and making an auto-focus camera.
I do currently think it is possible to create a powerful cross-domain optimizer that is not a person and will not create persons, unbox itself, look at our universe, tile the universe with anything, or make AI that doesn’t comply with these constraints. But I approach this line of thought with extreme caution, and really only to accelerate whatever it takes to get to CEV, because AI can’t safely make changes to the real world without some knowledge of human volition, even if it wants to.
What if I missed something that’s on the scale of the nonperson predicate? My AI works, creatively paints the apple, but somehow its solution is morally awful. Even staying within pure math could be bad for unforeseen reasons.
See my reply to Vladimir Nesov.
Do you count CEV to be in the same category?