Why should an uploaded superintelligence based on a human copy be inherently any safer than an artificial superintelligence? Just because humans are usually friendly doesn’t mean a human-derived AI would have to be friendly. This is especially true for a superintelligent human AI, which might bear little resemblance to its original human template. Even the friendliest human can be angry and abusive when they’re having a bad day.
Your idea that a WBE copy would more readily undergo supervised, safe, enhanced growth is basically an assumption. You would need to argue it in much more detail for it to merit deeper consideration.
Also, you cannot assume that an uploaded human superintelligence would be more constrained, as in “...after a best-effort psychiatric evaluation (for whatever good that might do) gives it Internet access”. This relates to the AI-box problem, where it is contended that a superintelligence could not be contained, no matter what. Personally I dispute this, but at least it’s not something to be taken for granted.
WBE safety could benefit from an existing body of knowledge about human behavior and capabilities, and the spaghetti code of the brain could plausibly impose a higher barrier to rapid self-improvement. And institutions exploiting the cheap copyability of brain emulations could greatly help in stabilizing benevolent motivations.
WBE is a tiny region of the space of AI designs that we can imagine as plausible possibilities, and we have less uncertainty about it than about “whatever non-WBE AI technology comes first.” Some architectures might be easier to make safe, and others harder, but if you are highly uncertain about non-WBE AI’s properties then you need wide confidence intervals.
WBE also has the nice property that it is relatively all-or-nothing. With de novo AI, designers will be tempted to trade off design safety for speed, but for WBE a design that works at all will be relatively close to the desired motivations (there will still be tradeoffs with emulation brain damage, but the effect seems less severe than for de novo AI). Attempts to reduce WBE risk might just involve preparing analysis and institutions to manage WBE upon its development, whereas AI safety would require control of the development process to avoid intrinsically unsafe designs.
This is a good summary.
At least we know what a friendly human being looks like.
And I wouldn’t stop at a psychiatric evaluation of the person to be uploaded. I’d work on evaluating whether the potential uploadee was good for the people they associate with.