tl;dr: “Survival” inside an AGI does not require Friendliness, but only that it is able to create models of us that are good enough for us to accept as genuine copies.
I don’t think the AGI needs to care about our values in order to facilitate our transition. For the sake of argument, let’s assume an AGI that doesn’t care about human values—the Paperclip Maximizer will do.
Couldn’t this AGI, if it so chose, easily create something that we’d accept as a continuation of our identity? A digital copy of a human that is so convincing that this person (and the people that know him or her) could accept it as identical? Or a hyper-persuasive philosophy that tells people their non-copyable features (say consciousness) are nonessential?
I imagine that it could (alternative discussed below). Which leads to the question: Would it?
I think it would (alternative discussed below). Any AGI that wants self-preservation would want to minimize the risk of conflict by appearing helpful or at least non-threatening—at least until the cost of appearing so is greater than the cost of being repeatedly nuked. If it can convince people it is offering genuine immortality in upload form, its risk of being nuked is greatly reduced. It could delete the (probably imperfect) models after humans aren’t a threat anymore if—and only if—it is so sure it’ll never need specimens of the best work of Darwinian evolution again that it’d rather turn the comparatively tiny piece of computronium they exist in into paperclips. But how could it be sure?
So unless it is much, much better at nanotech than it is at modeling people, I do expect an Earth AGI would contain at least some vestiges of human identity (maybe even more of those than vestiges of oak or flatworm identity). Of course they would be irrelevant to almost the entire rest of the system, because they’re not good enough at making paperclips to matter.
This leaves the scenarios where my assumptions are wrong and the Paperclip Maximizer is somehow either unable to create a persuasive “transition” offer, or decides against making it. Such Paperclip Maximizer variants don’t seem superintelligent to me (more like Grey Goo), but I suppose they could be built. However, in this case, its lack of human values is only a problem because it also lacks both the ability to model humans and a credible deterrent. The former might be an easier problem than Friendliness, if we’re only talking about our survival (as superintelligent robots or whatever) inside that AGI, not about a goal of actually having a say in what it does.