I actually think this is just slightly off the mark, this question, in the sense that maybe we can put almost any reward into the system and, if the environment’s complex enough, amazing things will happen just in maximizing that reward. Maybe we don’t have to solve this “What’s the right thing for intelligence to really emerge at the end of it?” kind of question and instead embrace the fact that there are many forms of intelligence, each of which is optimizing for its own target. And it’s okay if we have AIs in the future, some of which are trying to control satellites, some of which are trying to sail boats, and some of which are trying to win games of chess, and they may all come up with their own abilities in order to allow that intelligence to achieve its end as effectively as possible.
In other words, power-seeking, intelligence, and all those other behaviors are convergent instrumental drives, so almost any reward function will work, and thus Clippy is entirely possible.
What are the chances that we get lucky and behaving altruistically towards other sentient beings also turns out to be a convergent drive? My guess is that most people here on LessWrong would say close to epsilon, but I wonder what the folks at DeepMind would say…
(The convergent drive would be to play tit-for-tat until you have observed enough to solve the POMDP of them, then betray/exploit them maximally the instant you have gathered enough info to decide that is more rewarding...)
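A minimal sketch of that pattern, assuming a toy iterated prisoner's dilemma with the standard payoff matrix: the agent mirrors its opponent (tit-for-tat) while it is still uncertain, and flips to permanent defection once its belief that the opponent is safely exploitable makes defection look more rewarding. The class name, the crude belief update, and all the thresholds below are made up for illustration; they stand in for actually solving the opponent's POMDP and are not taken from the podcast or from anything DeepMind-specific.

```python
# Toy iterated prisoner's dilemma: cooperate conditionally until exploitation
# looks more rewarding, then defect forever. Purely illustrative numbers.

# Payoffs to the row player: (my_move, their_move) -> reward
PAYOFF = {
    ("C", "C"): 3, ("C", "D"): 0,
    ("D", "C"): 5, ("D", "D"): 1,
}

class OpportunisticTitForTat:
    """Play tit-for-tat until the opponent looks exploitable, then defect."""

    def __init__(self, prior_forgiving=0.5):
        # Belief that the opponent keeps cooperating even if betrayed --
        # a crude stand-in for "solving the POMDP of them".
        self.p_forgiving = prior_forgiving
        self.defecting = False
        self.last_opponent_move = "C"

    def act(self):
        if self.defecting:
            return "D"
        return self.last_opponent_move  # tit-for-tat: mirror their last move

    def observe(self, opponent_move):
        self.last_opponent_move = opponent_move
        # Crude belief update: repeated cooperation raises the estimate that
        # the opponent can be exploited without retaliation.
        if opponent_move == "C":
            self.p_forgiving = min(1.0, self.p_forgiving + 0.05)
        else:
            self.p_forgiving = max(0.0, self.p_forgiving - 0.20)
        # Switch the moment exploitation looks more rewarding in expectation
        # than continued mutual cooperation.
        if self.p_forgiving * PAYOFF[("D", "C")] > PAYOFF[("C", "C")]:
            self.defecting = True


if __name__ == "__main__":
    agent = OpportunisticTitForTat()
    total = 0
    for _ in range(50):
        mine, theirs = agent.act(), "C"  # opponent cooperates unconditionally
        total += PAYOFF[(mine, theirs)]
        agent.observe(theirs)
    print("agent reward vs. unconditional cooperator:", total)
```

Against an unconditional cooperator the sketch cooperates for only a few rounds before its belief crosses the threshold, after which it defects for the rest of the game; the point is just that nothing in the reward signal itself penalizes that switch.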
Paperclip maximizers aren’t necessarily sentient, and Demis explicitly says in his episode that it’d be best to avoid creating sentient AI, at least initially, to avoid the ethical issues surrounding that.