I’m glad to see that you place a high priority on talking to good researchers, but I think that the main benefit of doing so (aside from increasing awareness of AI risk) will be to shift SingInst staff members’ beliefs in the direction of the Friendly AI problem being intractable.
Believing a problem intractable isn’t a step towards solving it. It might be correct to downgrade your confidence in a problem being solvable, but that isn’t in itself a useful thing if the goal remains worth pursuing. It mostly serves as an indication of epistemic rationality, if indeed the problem is less tractable than believed, or perhaps it could be a useful strategic consideration. Noticing that the current approach is worse than an alternative (e.g. open problems are harder to communicate than expected, but what’s the better alternative that makes it possible to use this piece of improved understanding?), or noticing a particular error in present beliefs, is much more useful.
Believing a problem intractable isn’t a step towards solving it. It might be correct to downgrade your confidence in a problem being solvable, but that isn’t in itself a useful thing if the goal remains worth pursuing.
I agree, but it may be appropriate to be more modest in aim (e.g. by pushing for neuromorphic AI with some built-in safety precautions even if achieving this outcome is much less valuable than creating a Friendly AI would be).
e.g. by pushing for neuromorphic AI with some built-in safety precautions even if achieving this outcome is much less valuable than creating a Friendly AI would be
I believe it won’t be “less valuable”, but instead would directly cause existential catastrophe, if successful. Feasibility of solving FAI doesn’t enter into this judgment.
I believe it won’t be “less valuable”, but instead would directly cause existential catastrophe, if successful.
I meant in expected value.
As Anna mentioned in one of her Google AGI talks, there’s the possibility of an AGI being willing to trade with humans to avoid a small probability of being destroyed by humans (though I concede that it’s not at all clear how one would create an enforceable agreement). Also, a neuromorphic AI might not be so far from a WBE. Do you think that whole brain emulation would directly cause existential catastrophe?
I believe it won’t be “less valuable”, but instead would directly cause existential catastrophe, if successful.
I meant in expected value.
Huh? I didn’t mean opportunity cost, but simply that successful neuromorphic AI destroys the world. Staging a global catastrophe does have lower expected value than protecting against global catastrophe (with whatever probabilities), but it also has lower expected value than watching TV.
Do you think that whole brain emulation would directly cause existential catastrophe?
Indirectly, but with influence that compresses the expected time-to-catastrophe after the tech starts working from decades or centuries down to years (decades if WBE tech comes early and only slow or few uploads can be supported initially). It’s not all lost at that point, since WBEs could do some FAI research, and would be in a better position to actually implement a FAI and think longer about it, but the ease of producing a UFAI would go way up (directly, through physically faster AGI research, or by experimenting with variations on human brains or with optimization processes built out of WBEs).
The main thing that distinguishes WBEs is that they are initially still human and still have the same values. All other tech breaks values, and giving it power makes humane values lose the world.
Huh? I didn’t mean opportunity cost, but simply that successful neuromorphic AI destroys the world. Staging a global catastrophe does have lower expected value than protecting against global catastrophe (with whatever probabilities), but it also has lower expected value than watching TV.
I was saying that it could be that with more information we would find that
0 < EU(Friendly AI research) < EU(Pushing for relatively safe neuromorphic AI) < EU(Successful construction of a Friendly AI).
even if there’s a high chance that relatively safe neuromorphic AI would cause global catastrophe and carry no positive benefits. This could be the case if Friendly AI research is sufficiently hard. I think that, given the current uncertainty about the difficulty of Friendly AI research, one would have to be extremely confident that relatively safe neuromorphic AI would cause global catastrophe in order to rule this possibility out.
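To make the comparison concrete, here is a purely illustrative calculation with made-up numbers (nothing in the discussion commits to these figures; p, q, and v below are hypothetical). Normalize existential catastrophe to value 0 and a successful Friendly AI to value 1, and suppose a “relatively safe” neuromorphic outcome is worth v = 0.1. If Friendly AI research succeeds with probability p = 0.01 (with failure ending in catastrophe by default), and pushing for relatively safe neuromorphic AI yields its outcome with probability q = 0.2 (catastrophe otherwise), then

EU(Friendly AI research) ≈ p × 1 = 0.01
EU(Pushing for relatively safe neuromorphic AI) ≈ q × v = 0.02
EU(Successful construction of a Friendly AI) = 1

which matches the ordering above even with an 80% chance of catastrophe on the neuromorphic route. The point is only that the comparison hinges on how hard Friendly AI research actually is, not that these particular numbers are defensible.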
Indirectly, but with influence that compresses the expected time-to-catastrophe after the tech starts working from decades or centuries down to years (decades if WBE tech comes early and only slow or few uploads can be supported initially). It’s not all lost at that point, since WBEs could do some FAI research, and would be in a better position to actually implement a FAI and think longer about it, but the ease of producing a UFAI would go way up (directly, through physically faster AGI research, or by experimenting with variations on human brains or with optimization processes built out of WBEs).
Agree with this
The main thing that distinguishes WBEs is that they are initially still human and still have the same values. All other tech breaks values, and giving it power makes humane values lose the world.
I think that I’d rather have an uploaded crow brain have its computational power and memory substantially increased and then go FOOM than have an arbitrary powerful optimization process do so; just because a neuromorphic AI wouldn’t have values that are precisely human doesn’t mean it would be totally devoid of value from our point of view.
I think that I’d rather have an uploaded crow brain have its computational power and memory substantially increased and then go FOOM than have an arbitrary powerful optimization process do so; just because a neuromorphic AI wouldn’t have values that are precisely human doesn’t mean it would be totally devoid of value from our point of view.
I expect it would; even a human whose brain was meddled with to make it more intelligent is probably a very bad idea, unless this modified human builds a modified-human-Friendly-AI (in which case some value drift would probably be worth the protection from existential risk) or, even better, a useful FAI theory is elicited from them Oracle AI-style. The crucial question here is the character of FOOMing: how much of the initial value is retained.