A problem with this framing that’s related to, but I think not the same as, the one pointed out by Joe_Collman:
“Play to your outs” suggests focusing on this one goal even if what you have to do to have a hope of meeting that goal has other very bad consequences. This makes sense in many games, where the worst win is better than the best not-win. It may not make sense in actual life.
If you reckon some drastic or expensive action makes a 2% improvement to our chances of not getting wiped out by a super-powerful unaligned AI, then that improvement might be worth a lot of pain. If you reckon the same action makes a 0.02% improvement to our chances, the tradeoff against other considerations may look quite different.
But if you frame things so that your only goal is “win if possible” at the game of not getting wiped out by a super-powerful unaligned AI, and if either of those is the best “out” available, then “play to your outs” says you should do it.
Joe points out that you might be wrong about it being the best “out”. I am pointing out that even if it is your best “out”, whether it’s a good idea may depend on how good an “out” it is. The “play to your outs” framing discourages thinking about that, and about the possibility that the “out” might not be good enough to be worth playing to.
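To make the arithmetic concrete, here is a toy sketch of the comparison. Everything in it is an assumption for illustration: the function name `net_value`, the idea that the value of survival and the cost of the action can be put on one scale, and the particular numbers. Only the 2% and 0.02% figures come from the example above.

```python
def net_value(delta_p: float, value_of_survival: float, cost: float) -> float:
    """Expected gain from an action: the improvement in survival probability
    times the value of surviving, minus the cost of the action.
    All quantities are in the same (arbitrary, hypothetical) units."""
    return delta_p * value_of_survival - cost


# Hypothetical numbers, chosen only to illustrate the comparison.
VALUE_OF_SURVIVAL = 1000.0  # arbitrary units
COST_OF_ACTION = 5.0        # the "pain" of the drastic action, same units

big_out = net_value(0.02, VALUE_OF_SURVIVAL, COST_OF_ACTION)      # 2% improvement
small_out = net_value(0.0002, VALUE_OF_SURVIVAL, COST_OF_ACTION)  # 0.02% improvement

print("2% out worth taking:   ", big_out > 0)
print("0.02% out worth taking:", small_out > 0)
```

With these made-up numbers the same action comes out worth taking at 2% and not worth taking at 0.02%. A rule that says “take your best out” never looks at the cost term, so it gives the same answer in both cases.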
[EDITED to add:] Q3 at the end of the post is relevant, but I don’t think the answer given invalidates what I’m saying. It may very well be true that most crazy/violent plans are bad, but that’s the wrong question; the right question is whether the highest-probability plans (in the scenario where all those probabilities are very low) are crazy/violent.