I’ve noticed a surprising conclusion about the moral value of the outcomes (1) existential disaster that terminates civilization, leaving no rational singleton behind (“Doom”), (2) Unfriendly AI (“UFAI”), and (3) Friendly AI (“FAI”). It now seems that although the most important factor in optimizing the value of the world (according to your personal formal preference) is increasing the probability of FAI (no surprise here), all else equal UFAI is much preferable to Doom. That is, if you have the option of trading Doom for UFAI, while forsaking only a negligible probability of FAI, you should take it.
The main argument (known as Rolf Nelson’s AI deterrence) can be modeled by counterfactual mugging: a UFAI will give up a (small) portion of the control over its world to the FAI’s preference (pay the $100) if there is a (correspondingly small) probability that the FAI could’ve been created, had the circumstances played out differently (which corresponds to the coin landing differently in counterfactual mugging), in exchange for the FAI (counterfactually) giving up a portion of control to the UFAI (the reward from Omega).
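To make the payoff structure concrete, here is a minimal sketch of counterfactual mugging with the usual illustrative numbers (a fair coin, a $100 cost, a $10,000 counterfactual reward; the values are arbitrary and not derived from the AI case). In the deterrence analogy, “paying” stands for the UFAI ceding a small slice of control to the FAI’s preference, and the “reward” for the FAI counterfactually doing the same.

```python
# Minimal sketch of counterfactual mugging (the numbers are arbitrary).
# Omega flips a fair coin. Heads: Omega asks you to pay $100.
# Tails: Omega pays you $10,000 iff it predicts you would have paid on heads.

P_HEADS = 0.5
COST = 100       # what you give up when asked (the "$100")
REWARD = 10_000  # what the counterfactual branch yields if you are a payer

def policy_expected_value(pays_when_asked: bool) -> float:
    """Expected value of a policy, evaluated before the coin is flipped."""
    heads_branch = -COST if pays_when_asked else 0.0
    tails_branch = REWARD if pays_when_asked else 0.0
    return P_HEADS * heads_branch + (1 - P_HEADS) * tails_branch

print(policy_expected_value(True))   # 4950.0: the paying policy wins ex ante
print(policy_expected_value(False))  # 0.0
```

The only point being borrowed from this toy case is that a policy of paying is better ex ante, which is the sense in which the UFAI “pays” and the counterfactual FAI “rewards”.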
As a result, having a UFAI in the world is better than having no AI (at any point in the future), because this UFAI can work as a counterfactual trading partner to a FAI that could’ve existed under other circumstances, which would make the FAI stronger (improve the value of the possible worlds). Of course, the negative effect of decreasing the probability of FAI is much stronger than the positive effect of increasing the probability of UFAI to the same extent, which means that if the choice is purely between UFAI and FAI, the balance is conclusively in FAI’s favor. That there are FAIs in the possible worlds also shows that the Doom outcome is not completely devoid of moral value.

More arguments and a related discussion here.
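As a toy illustration only (every utility and probability below is made up, and “utility” is a crude stand-in for value under your formal preference), an expected-value calculation with this shape reproduces the ordering: trading Doom probability for UFAI probability is worth it when the cost in FAI probability is negligible, while trading FAI for UFAI directly is a clear loss.

```python
# Toy numbers only: utilities of outcomes according to our preference.
# UFAI gets a small positive value from the counterfactual trade with the
# FAI that could have existed; Doom gets a smaller positive value for the
# related reason (FAIs elsewhere in the possible worlds).
U = {"FAI": 1.0, "UFAI": 0.05, "Doom": 0.01}

def ev(p_fai: float, p_ufai: float, p_doom: float) -> float:
    return p_fai * U["FAI"] + p_ufai * U["UFAI"] + p_doom * U["Doom"]

baseline = ev(0.10, 0.30, 0.60)
# Trade 0.30 of Doom probability for UFAI at a negligible (0.001) cost in P(FAI):
traded = ev(0.099, 0.601, 0.30)
print(baseline, traded)                  # traded > baseline: the trade is worth taking
# But trading FAI probability directly for UFAI is a clear loss:
print(ev(0.05, 0.35, 0.60) < baseline)   # True
```

The numbers only encode the qualitative assumptions above, namely U(FAI) ≫ U(UFAI) > U(Doom) > 0, so the conclusion is exactly as strong as those assumptions.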
It can mostly be ignored, but uFAI affects physically nearby aliens who might have developed a half-Friendly AI otherwise. (But if they could have, then they have counterfactual leverage in trading with your uFAI.) There’s no reason to suspect that those aliens had a much better shot than we did at creating FAI, though. Creating uFAI might also benefit the aliens for other reasons… that I won’t go into, so I will just say that it is easy to miss important factors when thinking about these things. Anyway, if the nanobots are swarming the Earth, then launching uFAI does indeed seem very reasonable for many reasons.
> That is, if you have the option of trading Doom for UFAI, while forsaking only a negligible probability of FAI, you should take it.
Fascinating! Do you still agree with what you wrote there? Are you still researching these issues, and do you plan on writing a progress report or an open problems post? Would you be willing to write a survey paper on decision-theoretic issues related to acausal trade?
My best guess about what’s preferable to what is still this way, but I’m significantly less certain of its truth (there are analogies that make the answer come out differently, and the level of rigor in the above comment is not much better than in those analogies). In any case, I don’t see how we can actually use these considerations. (I’m working in a direction that should ideally make questions like this clearer in the future.)
> In any case, I don’t see how we can actually use these considerations.
If you know how to build a uFAI (or a “probably somewhat reflective on its goal system but nowhere near provably Friendly” AI), build one and put it in an encrypted glass case. Ideally you would work out the AGI theory in your head, determine how long it would take to code the AGI after adjusting for the planning fallacy, and then be ready to start coding if doom is predictably going to occur. If doom isn’t predictable, then the safety tradeoffs are larger. This can easily go wrong, obviously.