[...] I had originally intended the scenario of Pascal’s Mugging to point up what seemed like a basic problem with combining conventional epistemology with conventional decision theory: Conventional epistemology says to penalize hypotheses by an exponential factor of computational complexity. This seems pretty strict in everyday life: “What? for a mere 20 bits I am to be called a million times less probable?” But for stranger hypotheses about things like Matrix Lords, the size of the hypothetical universe can blow up enormously faster than the exponential of its complexity. This would mean that all our decisions were dominated by tiny-seeming probabilities (on the order of 2^-100 and less) of scenarios where our lightest action affected 3↑↑4 people… which would in turn be dominated by even more remote probabilities of affecting 3↑↑5 people...
[...] Unfortunately I failed to make it clear in my original writeup that this was where the problem came from, and that it was general to situations beyond the Mugger. Nick Bostrom’s writeup of Pascal’s Mugging for a philosophy journal used a Mugger offering a quintillion days of happiness, where a quintillion is merely 1,000,000,000,000,000,000 = 10^18. It takes at least two exponentiations to outrun a singly-exponential complexity penalty. I would be willing to assign a probability of less than 1 in 10^18 to a random person being a Matrix Lord. You may not have to invoke 3↑↑↑3 to cause problems, but you’ve got to use something like 10^(10^100) - double exponentiation or better. Manipulating ordinary hypotheses about the ordinary physical universe taken at face value, which just contains 10^80 atoms within range of our telescopes, should not lead us into such difficulties.
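The arithmetic behind "two exponentiations to outrun a singly-exponential penalty" can be checked in log space. What follows is only a sketch under stated assumptions: the 100-bit penalty is chosen purely for illustration, and the function and variable names are invented for this example. It treats a hypothesis that costs k bits as having prior weight 2^-k and asks whether the payoff is large enough to outrun that weight.

```python
import math

LOG2_10 = math.log2(10)

def log2_expected_payoff(log2_payoff: float, complexity_bits: float) -> float:
    """Return log2(2**-complexity_bits * payoff), working in log space to avoid overflow."""
    return log2_payoff - complexity_bits

# A 20-bit complexity penalty divides the prior by 2**20, about a million.
penalty_20_bits = 2 ** 20

# A quintillion (10**18) is a payoff of roughly 59.8 bits.
log2_quintillion = 18 * LOG2_10

# 10**(10**100) is a payoff of roughly 3.3e100 bits: double exponentiation.
log2_double_exp = (10.0 ** 100) * LOG2_10

PENALTY_BITS = 100  # illustrative assumption, not an estimate for any real hypothesis

# Negative margin: the penalty beats the singly-exponential payoff.
quintillion_margin = log2_expected_payoff(log2_quintillion, PENALTY_BITS)

# Positive margin: the doubly-exponential payoff swamps the penalty.
double_exp_margin = log2_expected_payoff(log2_double_exp, PENALTY_BITS)
```

Under these assumed numbers, a quintillion-sized payoff is cancelled by any penalty of 60 bits or more, while no penalty of everyday size comes close to cancelling the doubly-exponential one.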
(And then the phrase “Pascal’s Mugging” got completely bastardized to refer to an emotional feeling of being mugged that some people apparently get when a high-stakes charitable proposition is presented to them, regardless of whether it’s supposed to have a low probability. This is enough to make me regret having ever invented the term “Pascal’s Mugging” in the first place; and for further thoughts on this see The Pascal’s Wager Fallacy Fallacy (just because the stakes are high does not mean the probabilities are low, and Pascal’s Wager is fallacious because of the low probability, not the high stakes!) and Being Half-Rational About Pascal’s Wager Is Even Worse. Again, when dealing with issues of the mere size of the apparent universe, on the order of 10^80 - for small large numbers - we do not run into the sort of decision-theoretic problems I originally meant to single out by the concept of “Pascal’s Mugging”. My rough intuitive stance on x-risk charity is that if you are one of the tiny fraction of all sentient beings who happened to be born here on Earth before the intelligence explosion, when the existence of the whole vast intergalactic future depends on what we do now, you should expect to find yourself surrounded by a smorgasbord of opportunities to affect small large numbers of sentient beings. There is then no reason to worry about tiny probabilities of having a large impact when we can expect to find medium-sized opportunities of having a large impact, so long as we restrict ourselves to impacts no larger than the size of the known universe.)
[...] And finally, I once again state that I abjure, refute, and disclaim all forms of Pascalian reasoning and multiplying tiny probabilities by large impacts when it comes to existential risk. We live on a planet with upcoming prospects of, among other things, human intelligence enhancement, molecular nanotechnology, sufficiently advanced biotechnology, brain-computer interfaces, and of course Artificial Intelligence in several guises. If something has only a tiny chance of impacting the fate of the world, there should be something with a larger probability of an equally huge impact to worry about instead. You cannot justifiably trade off tiny probabilities of x-risk improvement against efforts that do not effectuate a happy intergalactic civilization, but there is nonetheless no need to go on tracking tiny probabilities when you’d expect there to be medium-sized probabilities of x-risk reduction.
[...] EDIT: To clarify, “Don’t multiply tiny probabilities by large impacts” is something that I apply to large-scale projects and lines of historical probability. On a very large scale, if you think FAI stands a serious chance of saving the world, then humanity should dump a bunch of effort into it, and if nobody’s dumping effort into it then you should dump more effort than currently into it. On a smaller scale, to compare two x-risk mitigation projects in demand of money, you need to estimate something about marginal impacts of the next added effort (where the common currency of utilons should probably not be lives saved, but “probability of an ok outcome”, i.e., the probability of ending up with a happy intergalactic civilization). In this case the average marginal added dollar can only account for a very tiny slice of probability, but this is not Pascal’s Wager. Large efforts with a success-or-failure criterion are rightly, justly, and unavoidably going to end up with small marginally increased probabilities of success per added small unit of effort. It would only be Pascal’s Wager if the whole route-to-an-OK-outcome were assigned a tiny probability, and then a large payoff used to shut down further discussion of whether the next unit of effort should go there or to a different x-risk.
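The difference between a tiny per-dollar slice and a tiny overall probability can be made concrete with made-up numbers. This is a rough sketch only: the 10% probability gain and the $1 billion budget are hypothetical placeholders, not estimates from the post, and the function name is invented for illustration.

```python
from fractions import Fraction

def average_marginal_probability(total_gain: Fraction, total_budget_usd: int) -> Fraction:
    """Average probability of an OK outcome bought per dollar across the whole effort."""
    return total_gain / total_budget_usd

# Hypothetical effort: raises the probability of an OK outcome by 10% for $1 billion.
HYPOTHETICAL_GAIN = Fraction(1, 10)
HYPOTHETICAL_BUDGET_USD = 10 ** 9

# Each dollar accounts for only one part in ten billion of probability...
per_dollar = average_marginal_probability(HYPOTHETICAL_GAIN, HYPOTHETICAL_BUDGET_USD)

# ...yet the effort as a whole still carries a medium-sized probability.
whole_effort = per_dollar * HYPOTHETICAL_BUDGET_USD
```

In this toy case the per-dollar figure is 1 in 10^10, but nothing about the route as a whole has been assigned a tiny probability, which is the distinction the paragraph above draws.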
If I understand your argument, it’s something like “when the probability of the world being saved is below n%, humanity should stop putting any effort into saving the world”. Could you clarify what value of n (roughly) you think justifies “let’s give up”?
(If we just speak in qualitative terms, we’re more likely to just talk past each other. E.g., making up numbers: maybe you’ll say ‘we should give up if the world is only one-in-a-million likely to survive’, and Eliezer will reply ‘oh, of course, but our survival odds are way higher than that’. Or maybe you’ll say ‘we should give up if the world is only one-in-fifty likely to survive’, and Eliezer will say ‘that sounds like the right ballpark for how dire our situation is, but that seems way too early to simply give up’.)
I think

- Humans are bad at informal reasoning about small probabilities since they don’t have much experience to calibrate on, and will tend to overestimate the ones brought to their attention, so informal estimates of the probability of very unlikely events should usually be adjusted even lower.
- Humans are bad at reasoning about large utilities, due to lack of experience as well as issues with population ethics and the mathematical issues with unbounded utility, so estimates of large utilities of outcomes should usually be adjusted lower.
- Throwing away most of the value in the typical case for the sake of an unlikely case seems like a dubious idea to me even if your probabilities and utility estimates are entirely correct; the lifespan dilemma and similar results are potential intuition pumps about the issues with this, and go through even with only single-exponential utilities at each stage. Accordingly I lean towards overweighting the typical range of outcomes in my decision theory relative to extreme outcomes, though there are certainly issues with this approach as well.
As far as where the penalty starts kicking in quantitatively, for personal decisionmaking I’d say somewhere around “unlikely enough that you expect to see events at least this extreme less than once per lifetime”, and for altruistic decisionmaking “unlikely enough that you expect to see events at least this extreme less than once in the history of humanity”. For something on the scale of AI alignment I think that’s around 1/1000? If you think the chances of success are still over 1% then I withdraw my objection.
The Pascalian concern aside I note that the probability of AI alignment succeeding doesn’t have to be *that* low before its worthwhileness becomes sensitive to controversial population ethics questions. If you don’t consider lives averted to be a harm then spending $10B to decrease the chance of 10 billion deaths by 1/10000 is worse value than AMF. If you’re optimizing for the average utility of all lives eventually lived then increasing the chance of a flourishing future civilization to pull up the average is likely worth more but plausibly only ~100x more (how many people would accept a 1% chance of postsingularity life for a 99% chance of immediate death?) so it’d still be a bad bet below 1/1000000. (Also if decreasing xrisk increases srisk, or if the future ends up run by total utilitarians, it might actually pull the average down.)
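The arithmetic in that comparison can be spelled out. This is a sketch under stated assumptions: the function name is invented, and the benchmark cost per life is a hypothetical placeholder standing in for AMF, not a figure taken from the comment or from any charity evaluator.

```python
from fractions import Fraction

def cost_per_life_equivalent(cost_usd: int, lives_at_stake: int,
                             risk_reduction: Fraction, value_multiplier: int = 1) -> Fraction:
    """Dollars spent per expected life (or life-equivalent) saved."""
    expected_saved = lives_at_stake * risk_reduction * value_multiplier
    return Fraction(cost_usd) / expected_saved

# Hypothetical placeholder for a global-health benchmark such as AMF.
ASSUMED_BENCHMARK_COST_PER_LIFE = 5_000

# $10B to cut the chance of 10 billion deaths by 1/10000, counting only deaths averted.
deaths_only = cost_per_life_equivalent(10 ** 10, 10 ** 10, Fraction(1, 10_000))

# Same spend, valuing the future at ~100x, but with only a 1/1000000 risk reduction.
average_view = cost_per_life_equivalent(10 ** 10, 10 ** 10, Fraction(1, 1_000_000), 100)

# Both cases come out more expensive than the assumed benchmark.
worse_than_benchmark = (deaths_only > ASSUMED_BENCHMARK_COST_PER_LIFE
                        and average_view > ASSUMED_BENCHMARK_COST_PER_LIFE)
```

Both cases work out to $10,000 per life-equivalent, which is why the 100x multiplier only moves the break-even point from roughly 1/10000 to roughly 1/1000000. The conclusion that this is worse value depends entirely on the benchmark being cheaper than that.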
If I’m understanding the original question correctly (and if not, well, I’m asking it myself), the issue is that, as you just pointed out, there are plenty of non-AI-related massive threats to humanity that we may be able to avert with far higher likelihood (assuming we survive long enough to be able to do so). If the probability of changing the AGI end-of-the-world situation is extremely low, and if that were the only potential danger to humanity, then of course we should still focus on it. However, we also face many other risks we actually stand a chance of fighting, and according to Yudkowsky’s line of thinking, we should act for the counterfactual world in which we somehow solve the alignment problem. Therefore, shouldn’t we be focusing more on other issues, if the probabilities are really that bad?
It’s not Pascal’s mugging in the senses described in the first posts about the problem:
Quoting from Being Half-Rational About Pascal’s Wager is Even Worse: