The one thing that worries me is this: if you ask that function to consult people's own utility functions, to extract some form of extrapolation, then you no longer have the problem at the meta level, but you still have it at the level of what the AGI thinks people are (say, because it scrutinized their short-term desires).
That’s a good point. One could imagine a method of getting utility functions from human values that, maybe due to improper specification, returned some parts from short-term desires and some other parts from long-term desires, maybe even inconsistently. Though that still wouldn’t result in the AI acting like a human—it would do weirder things.