Incidentally: How would it affect your intuition if you instead could participate in the Intergalactic Utilium Lottery, where probabilities and payoffs are the same but where you trust the organizers that they do what they promise?
If I actually trust the lottery officials, that means that I have certain knowledge of the utility probabilities and costs for each of my choices. Thus, I guess I’d choose whichever option generated the most utility, and it wouldn’t be a matter of “intuition” any more.
Applying that logic to the initial Mugger problem, if I calculated, and was certain of, there being at least a 1 in 3^^^^3 chance that the mugger was telling the truth, then I’d pay him. In fact, I could mentally reformulate the problem to have the mugger saying “If you don’t give me $5, I will use the powers vested in me by the Intergalactic Utilium Lottery Commission to generate a random number between 1 and N, and if it’s a 7, then I kill K people.” I then divide K by N to get an idea of the full moral force of what’s going on. If K/N is even within several orders of magnitude of 1, I’d better pay up.
The problem is the uncertainty. Solomonoff induction gives the claim “I can kill 3^^^^3 people any time I want” a substantial probability, whereas “common sense” will usually give it literally zero. If we trust the lottery guys, questions of induction versus common sense become moot—we know the probability, and must act on it.
I think this is actually the core of the issue—not certainty of your probability, per se, but rather how it is derived. I think I may have finally solved this!
See if you can follow me on this… If Pascal's Muggers were completely independent of one another—that is, if every person attempting a Pascal's Mugging had their own unique story and motivation for initiating it, uncorrelated with you or with the other muggers—then you would have no additional information to go on. You shut up and multiply, and if the utility calculation comes out right, you pay the mugger. Sure, you're almost certainly throwing money away, but by definition the off-chance more than offsets this. Note that the probability calculation itself is complicated and nonlinear: claiming higher numbers increases the probability that they are lying. However, it's still possible they would come up with a number high enough to override this function.
At which point we previously said: “Aha! So this is a losing strategy! The Mugger ought not be able to arbitrarily manipulate me in this manner!” Or: “So what’s stopping the mugger from upping the number arbitrarily, or mugging me multiple times?” …To which I answer, “check the assumptions we started with”.
Note that the assumption was that the Mugger is not influenced by me, nor by other muggings. The mugger’s reasons for making the claim are their own. So “not trying to manipulate me knowing my algorithm” was an explicit assumption here.
What if we get rid of that assumption? Why, then an ever-higher utility claim (or recurring muggings) doesn't just raise the probability that the mugger is wrong, or lying for their own inscrutable reasons. It additionally raises the probability that they are lying to manipulate me, knowing (or guessing) my algorithm.
Basically, I add in the question "why did the mugger choose the number 3^^^3 and not 1967?" This makes it more likely that they are trying to overwhelm my algorithm, (mistakenly) thinking that it can thus be overwhelmed. If the mugger chooses 4^^^4 instead, this further (and proportionally?) increases said suspicion. And so on.
I propose that the combined weight of these probabilities rises faster than the claimed utility. If that is the case, then any claimed utility x higher than some N that already produces a negative expected utility will likewise produce a negative expected utility.
Presumably, for an AI with a good enough grasp of motives and manipulation, this would not pose a problem for very long. We can specifically test for this behavior, presenting the AI with increasingly higher claims and checking whether its expected utility function really does slope downward under these conditions.
I can try to further mathematize this (is this even a real word?). Is this necessary? The answer seems superficially satisfactory. Have I actually solved it? I don’t really have a lot of time to keep grappling with it (been thinking about this on and off for the past few months), so I would welcome criticism even more than usual.
This is a very good point—the higher the number chosen, the more likely it is that the mugger is lying—but I don’t think it quite solves the problem.
The probability that a person, out to make some money, will attempt a Pascal's Mugging can be no greater than 1, so let's imagine that it is 1. Every time I step out of my front door, I get mobbed by Pascal's Muggers. My mailbox is full of Pascal's Chain Letters. Whenever I go online, I get popups saying "Click this link or 3^^^^3 people will die!". Let's say I get one Pascal-style threat every couple of minutes, so the probability of getting one in any given minute is 0.5.
Then, let the probability of someone genuinely having the ability to kill 3^^^^3 people, and then choosing to threaten me with that, be x per minute—that is, over the course of one minute, there’s an x chance that a genuine extra-Matrix being will contact me and make a Pascal Mugging style threat, on which they will actually deliver.
Naturally, x is tiny. But, if I receive a Pascal threat during a particular minute, the probability that it’s genuine is x/(0.5+x), or basically 2x. If 2x * 3^^^^3 is at all close to 1, then what can I do but pay up? Like it or not, Pascal muggings would be more common in a world where people can carry out the threat, than in a world where they can’t. No amount of analysis of the muggers’ psychology can change the prior probability that a genuine threat will be made—it just increases the amount of noise that hides the genuine threat in a sea of opportunistic muggings.
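The arithmetic in that posterior can be sanity-checked in a couple of lines (the 0.5-per-minute noise rate is the one assumed above; the value of x here is an arbitrary tiny stand-in, since the true rate is unknown):

```python
# Posterior probability that a threat received this minute is genuine,
# given 0.5 opportunistic threats per minute and a genuine-threat rate x.
def p_genuine(x, noise_rate=0.5):
    return x / (noise_rate + x)

x = 1e-30  # arbitrary stand-in for the unknown genuine-threat rate
posterior = p_genuine(x)
# For tiny x, x / (0.5 + x) is very nearly 2x, as claimed:
print(posterior, 2 * x)
```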
But that is precisely it—it's no longer a Pascal's Mugging if the threat is credible. That is, in order to be successful, the mugger needs to be able to up the utility claim arbitrarily! It is assumed that we already know how to handle a credible threat; what we didn't know how to deal with was a mugger who could always make up a bigger number, to the point where the seeming impossibility of the claim no longer offsets the claimed utility. But as I showed, this only works if you don't enter the mugger's thought process into the calculation.
This actually brings up an important corollary to my earlier point: The higher the number, the less likely the coupling is between the mugger’s claim and the mugger’s intent.
A person who can kill another person might well want $5, for whatever reason. In contrast, a person who can use power from beyond the Matrix to torture 3^^^3 people already has IMMENSE power. Clearly such a person has all the money they want, and even more than that in the influence that money represents. They can probably create the money out of nothing. So already their claims don't make sense if taken at face value.
Maybe the mugger just wants me to surrender to an arbitrary threat? But in that case, why me? If the mugger really has immense power, they could create a person they know would cave in to their demands.
Maybe I’m special for some reason. But if the mugger is REALLY that powerful, wouldn’t they be able to predict my actions beforehand, à la Omega?
Each rise in claimed utility brings with it a host of assumptions that must hold for the link between my action and the claimed reaction to be maintained. And remember, the mugger’s ability is not the only thing dictating expected utility; so are the mugger’s intentions. Each such assumption not only weakens the probability of the mugger carrying out their threat because they can’t, it also raises the probability of the mugger rewarding refusal and/or punishing compliance. Even if the off-chance comes true and the mugger contacting me actually CAN carry out the threat, that does not make them sincere; the mugger might be testing my rationality skills, for instance, and could severely punish me for failing the test.
As the claimed utility approaches infinity, so does the scenario approach Pascal’s Wager: an unknowable, symmetrical situation, where an infinite number of possible outcomes cancel each other out. The one outcome that isn’t canceled out is the loss of $5. So the net utility is negative. So I don’t comply with the mugger.
I’m still not sure I’m fully satisfied with the level of math in my explanation, even though I’ve tried to set the solution in terms of limits and attractors. But I think I can draw a graph that dips below zero utility fairly quickly (or maybe never goes above it at all?), and never comes back up—asymptotic at −$5 utility. Am I wrong?
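For what it's worth, the shape of that graph is easy to sketch numerically. Here is a minimal toy model, assuming (purely for illustration, not derived from anything) that the manipulation-adjusted credence falls off like 1/x², i.e. faster than the claimed disutility x grows:

```python
# Toy model: expected utility of paying, as a function of the claimed
# disutility x. The credence function is an illustrative assumption that
# encodes "suspicion grows faster than the claim".
def credence(x):
    return 1.0 / x**2  # assumed super-linear fall-off with the claim

def expected_utility(x, cost=5.0):
    return credence(x) * x - cost  # = 1/x - 5 under this assumption

# The curve dips below zero almost immediately and never recovers,
# approaching -$5 from above as the claim grows without bound:
for claim in [10, 10**3, 10**6, 10**9]:
    print(claim, expected_utility(claim))
```

Under these assumptions the curve is exactly 1/x − 5: negative for every claim above 0.2 and asymptotic at −$5, which matches the graph described above.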
This is known as the “What does God need with a starship?” problem.
Indeed. I was going to write that as part of my original post, and apparently forgot… Thanks for the addition :)
Ah, my mistake. You’re arguing based on the intent of a legitimate mugger, rather than the fakes. Yes, that makes sense. If we let f(N) be the probability that somebody has the power to kill N people on demand, and g(N) be the probability that somebody who has the power to kill N people on demand would threaten to do so if he doesn’t get his $5, then it seems highly likely that Nf(N)g(N) approaches zero as N approaches infinity. What’s even better news is that, while f(N) may only approach zero slowly for easily constructed values of N like 3^^^^3 and 4^^^^4 because of their low Kolmogorov complexity, g(N) should scale with 1/N or something similar, because the more power someone has, the less likely they are to execute such a minuscule, petty threat. You’re also quite right in stating that the more power the mugger has, the more likely it is that they’ll reward refusal, punish compliance or otherwise decouple the wording of the threat from their actual intentions, thus making g(N) go to zero even more quickly.
So, yeah, I’m pretty satisfied that Nf(N)g(N) will asymptote to zero, taking all of the above into account.
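That asymptote can also be checked numerically with stand-in functional forms—f falling only logarithmically (in the spirit of the Kolmogorov-complexity point) and g scaling with 1/N. Both forms are assumptions chosen for illustration, not claims about the true distributions:

```python
import math

def f(N):
    """P(somebody can kill N people on demand); assumed to fall only
    slowly with N, e.g. ~1/log(N), for low-complexity values of N."""
    return 1.0 / math.log(N)

def g(N):
    """P(such a being would carry out a petty $5 threat); assumed ~1/N."""
    return 1.0 / N

def expected_deaths(N):
    # N * f(N) * g(N) = 1/log(N), which goes to 0 as N grows
    return N * f(N) * g(N)

# The product shrinks even though f alone barely moves:
for N in [10**3, 10**6, 10**9, 10**100]:
    print(N, expected_deaths(N))
```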
(In more unrelated news, my boyfriend claims that he’d pay the mugger, on account of him obviously being mentally ill. So that’s two out of three in my household. I hope this doesn’t catch on.)
That is backward. It is only a Pascal mugging if the threat is credible. Like one made by Omega, who you mention later on.
No, then it’s just a normal mugging.
If the threat is not credible from the perspective of the target it may only be an attempted mugging—not a proper mugging at all.
Which relates to this heuristic.
Huh? Isn’t the whole point of Pascal’s mugging that it isn’t likely and the mugger makes up for the lack of credibility by making the threat massive? If the mugger is making a credible threat we just call that a mugging.
The threat has to be credible at the level of probability it is assigned. It doesn’t have to be likely.
“The threat has to be credible at the level of probability it is assigned. ”
And what, precisely, does THAT mean? If I try to taboo some words here, I get “we must evaluate the likelihood of something happening as the likelihood we assigned for it to happen”. That’s simply tautological.
No probability is exactly zero except for self-contradictory statements. So “credible” can’t mean “of zero probability” or “impossible to believe”. To me, “credible” means “something I would not have a hard time believing without requiring extraordinary evidence”, which in itself translates pretty much to “>0.1% probability”. If you have some reason for distinguishing between a threat that is not credible and a threat with exceedingly low probability of being carried out, please state it. Also please note that my use of the word makes sense within the original context of my reply to HopeFox, who was discussing the implications of a world where such threats were not incredible.
Pascal’s mugging happens when the probability you would assign disregarding manipulation is very low (not a credible threat by normal standards), with the claimed utility being arbitrarily high to offset this. If that is not the case, it’s a non-challenge and is not particularly relevant to our discussion. Does that clarify my original statement?
That makes sense. Whereas my statement roughly meant “Pascal’s wager isn’t about someone writing BusyBeaver(3^^^3)”—that’s not even a decision problem worth mentioning.
How are you defining credible? It may be that we are using different notions of what this means. I’m using it to mean something like “capable of being believed” or “could be plausibly believed by a somewhat rational individual” but these have meanings that are close to “likely”.
Yes, with plausible priors, e.g. Solomonoff induction, as discussed in this paper.
I’m afraid I don’t follow. I don’t quite see how this negates the point I was making.
While it is conceivable that I simply lack the math to understand what you’re getting at, it seems to me that a simply-worded explanation of what you mean (or alternately a simple explanation of why you cannot give one) would be more suitable in this forum. Or if this has already been explained in such terms anywhere, a link or reference would likewise be helpful.