I have not been convinced, but am open to the idea, that a paperclip maximizer is the overwhelmingly likely outcome if we create a superhuman AI. At present, my thinking is that if some care is taken in the creation of a superhuman AI, an AI which partially shares human values is more likely than a paperclip maximizer. That is, the dichotomy “paperclip maximizer vs. Friendly AI” seems like a false dichotomy: I imagine that the sort of AI that people would actually build would be somewhere in the middle. Any recommended reading on this point would be appreciated.
I believed similarly until I read Steve Omohundro’s The Basic AI Drives. It convinced me that a paperclip maximizer is the overwhelmingly likely outcome of creating an AGI.
That paper makes a convincing case that the ‘generic’ AI (some distribution of AI motivations weighted by our likelihood of developing them) will most prefer outcomes that rank low in our preference ordering, i.e. the free energy and atoms needed to support life as we know it or would want it will get reallocated to something else. That means that an AI given arbitrary power (e.g. because of a very hard takeoff, or easy bargaining among AIs but not humans, or other reasons) would be lethal. However, the situation seems different and more sensitive to initial conditions when we consider AIs with limited power that must trade off chances of conquest against a risk of failure and retaliation. I’m working on a write-up of those issues.
Thanks Craig, I’ll check it out!