Is this the correct interpretation of the first three sentences?
If the aliens are sufficiently less likely to present the ultimatum when they believe that we would not surrender upon being presented with the ultimatum, then we should not surrender.
That is, our decision procedures should not return “surrender” in the situation where having a decision procedure that returns “surrender” increases the counterfactual prior probability of being presented the ultimatum, even after we have been given the ultimatum.
This correct decision not to surrender when given the ultimatum (a decision which costs utility, since the captives are tortured), given sufficient certainty that “the aliens are less likely to present the ultimatum if they predict that we will not surrender upon being presented the ultimatum,” is analogous to the correct decision to pay the counterfactual mugger upon losing the bet (a decision which costs utility, since we pay money), given sufficient certainty that “the counterfactual mugger would have paid us had we won the bet, provided the mugger predicted that we would pay em upon losing the bet.”
That is, in the same way that we act in accordance with how we would have precommitted, paying the counterfactual mugger after losing the bet, since doing so would have maximized our counterfactual prior expected utility, we should now act in accordance with how we would have precommitted, not surrendering upon being presented the ultimatum, since doing so would have increased our counterfactual prior expected utility.
That is, the reflectively consistent algorithm to which a friendly AI would self-modify in advance of being presented with this situation is such that it would choose to let the captives be tortured in order to decrease the prior expectation of captives being tortured.
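The precommitment argument can be sketched numerically. The following is a minimal toy model with made-up utilities and probabilities (none of these numbers are from the original post): when the aliens are sufficiently less likely to present the ultimatum to a refuser, the refuse policy has higher prior expected utility, even though it does worse conditional on the ultimatum actually being presented.

```python
# Toy sketch of the prior-expected-utility comparison; all numbers are
# invented for illustration and are not from the original scenario.

U_TORTURE = -100.0    # utility if the captives are tortured (we refuse)
U_SURRENDER = -50.0   # utility of the concession made upon surrendering

# Hypothetical probabilities that the aliens present the ultimatum,
# depending on which policy they predict our decision procedure follows.
P_ULTIMATUM_IF_WE_SURRENDER = 0.9
P_ULTIMATUM_IF_WE_REFUSE = 0.05   # "sufficiently less likely"

def prior_expected_utility(policy_surrenders: bool) -> float:
    """Expected utility evaluated before knowing whether the ultimatum comes."""
    if policy_surrenders:
        p, outcome = P_ULTIMATUM_IF_WE_SURRENDER, U_SURRENDER
    else:
        p, outcome = P_ULTIMATUM_IF_WE_REFUSE, U_TORTURE
    # If no ultimatum is presented, nothing is lost under either policy.
    return p * outcome + (1 - p) * 0.0

print(prior_expected_utility(True))    # surrender policy: 0.9 * -50 = -45.0
print(prior_expected_utility(False))   # refuse policy: 0.05 * -100 = -5.0
```

With this (entirely hypothetical) gap between the two presentation probabilities, the refuse policy comes out ahead in prior expectation, which is why the reflectively consistent algorithm would bind itself to refusing in advance.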
-
If all of that is correct, would an FAI self-modify to such a reflectively consistent decision procedure only on the condition of expecting to encounter such situations, or unconditionally?