I should have been clearer: the point isn’t that you get correct values, the point is that you get out of the swath of null or meaningless values and into the merely wrong. While the values gained would be wrong, they would be significantly correlated with ours; it’s the sort of AI that would produce drugged-out brains in vats, or something else that’s not what we want, but closer than paperclips. One measure you could use for human effectiveness: given all possible actions ordered by utility, what percentile do the actions we actually took fall in?
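To make that measure concrete, here is a minimal sketch (assuming a finite action set with known utilities, which is of course the hard part; the function name and toy numbers are purely illustrative):

```python
import numpy as np

def action_percentile(utilities, chosen_index):
    """Percentile rank of the chosen action among all possible actions,
    ordered by utility. 100 means the chosen action was optimal;
    values near 50 are what random choice would give on average."""
    utilities = np.asarray(utilities, dtype=float)
    chosen_u = utilities[chosen_index]
    # Fraction of actions whose utility is at or below the chosen action's.
    return 100.0 * np.mean(utilities <= chosen_u)

# Toy example: five possible actions, the agent picked index 3.
print(action_percentile([0.1, 0.4, 0.2, 0.7, 0.9], chosen_index=3))  # -> 80.0
```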
Once we get into this region, it becomes clear that the next task is to fine-tune our model of the bounds on human rationality, or figure out how to get an AI to do it for us.
I disagree. I think that if we put a complexity upper bound on human rationality, and assume noisy rationality, then we will get values that are “meaningless” from your perspective.
I’m trying to think of ways we could test this....