From https://www.gwern.net/mugging:
One way to try to escape a mugging is to unilaterally declare that all probabilities below a certain small probability will be treated as zero. With the right pair of lower limit and mugger’s credibility, the mugging will not take place. But such an ad hoc method violates common axioms of probability theory, and thus we can expect there to be repercussions.
It turns out to be easy to turn such a person into a money pump, if not by the mugging itself. Suppose your friend adopts this position, and he says specifically that any probabilities less than or equal to 1⁄20 are 0. You then suggest a game; the two of you will roll a d20 die, and if the die turns up 1-19, you will pay him one penny and if the die turns up 20, he pays you one dollar—no, one bazillion dollars. Your friend then calculates: there is a 19⁄20 chance that he will win a little money, and there is a 1⁄20 chance he will lose a lot of money—but wait, 1⁄20 is too small to matter! It is zero chance, by his rule. So, there is no way he can lose money on this game and he can only make money.
He is of course wrong, and on the 5th roll you walk away with everything he owns. (And you can do this as many times as you like.)
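(To see the arithmetic concretely, here is a minimal sketch in Python, with an invented stake standing in for “one bazillion dollars”; it computes the friend’s expected value per roll and simulates the game until the 20 comes up.)

```python
import random

# Toy numbers: the "bazillion" is just some stake vastly larger than 19 pennies.
PENNY = 0.01
BAZILLION = 1_000_000_000  # stand-in for everything the friend owns

# Expected value per roll, from the friend's perspective:
ev_per_roll = (19 / 20) * PENNY - (1 / 20) * BAZILLION
print(f"friend's EV per roll: {ev_per_roll:.2f}")  # hugely negative

# Simulate rolls until a 20 appears; the friend's penny winnings vanish instantly.
random.seed(0)
total, rolls = 0.0, 0
while True:
    rolls += 1
    if random.randint(1, 20) == 20:
        total -= BAZILLION
        break
    total += PENNY
print(f"the 20 arrived on roll {rolls}; friend's net: {total:.2f}")
```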
Of course, that’s not how a sane street-rational person would think! They would not play for “one bazillion dollars” no matter the odds. In general, detecting a sufficiently intelligent adversarial entity tends to result in avoiding the interaction altogether (if you are inured enough to Nigerian princes offering billions in an email). And yet I cannot find any LW discussion on when to engage and when not to, except in an occasional comment.
The usual response to this “haha I would just refuse to bet, your Dutch book arguments are powerless over me” is that bets can be labeled as actions or inactions at will, somewhat like how we might model all of reality as the player ‘Nature’ in a decision tree/game, and so you can no more “refuse to bet” than you can “refuse to let time pass”. One can just rewrite the scenario to make your default action equivalent to accepting the bet; then what? ‘Refusing to bet’ is a vacuous response.
In that specific example, I used the setup of bets because it’s easy to explain, but it is isomorphic to many possible scenarios: for example, perhaps it is actually about hurricanes and ‘buying homeowner’s insurance’. “Hurricanes happen every 20 years on average; if one hits, it will obliterate your low-lying house, which currently has no insurance; you can either pay for homeowner’s insurance at a small cost every year and be compensated for the possible loss of your house, or you can keep the premium each year but lose everything should there be a hurricane. You buy homeowner’s insurance, but your friend reasons that, since all probabilities <=1/20 == 0 (to avoid muggings), the insurance is worthless, so he doesn’t get insurance for his house. 5 years later, Hurricane Sandy hits...” You cannot ‘refuse to bet’; you can only either get the insurance or not, and the default in this version has been changed to ‘not’, defeating the attempt to fight the hypothetical.
(Which scenario, for someone really determined to fight the hypothetical and wiggle out of defending the 1⁄20 = 0 part which we are trying to discuss, may just trigger more evasions: “insurance is by definition -EV since you get back less than you pay in due to overhead and profits! so it’s rational to not want it” and then you can adjust the hypothetical payoff—“it’s heavily subsidized by the federal government, because, uh, public choice theory reasons”—and then he’ll switch to “ah, but I could just invest the saved premiums in the stock market, did you think about opportunity cost, smartypants?”, and so on and so forth.)
And similarly, in real life, people cannot simply say “I refuse to have an opinion about whether hurricanes are real or how often they happen! What even is ‘real’, anyway? How can we talk about the probability of hurricanes given the interminable debate between long-run frequency and subjective interpretations of ‘probability’...” That’s all well and good, but the hurricane is still going to come along and demolish your house at some point, and you either will or will not have insurance when that happens. Either option implies a ‘bet’ on hurricanes which may or may not be coherent with your other beliefs and information, and if you are incoherent, so that your choice of insurance depends on whether the insurance agent happened to frame the hurricane risk as 1-in-20-years or 1-in-2-decades, that is probably not a good thing, and the coherent bettors are probably going to do better.
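(For concreteness, a minimal sketch of the insurance arithmetic, using wholly hypothetical numbers for the premium and house value alongside the 1-in-20 annual hurricane probability from the example: it compares the expected annual cost of insuring versus not insuring.)

```python
# Hypothetical numbers, for illustration only.
P_HURRICANE = 1 / 20         # annual probability: "every 20 years on average"
HOUSE_VALUE = 300_000        # assumed loss if uninsured when a hurricane hits
PREMIUM = 2_000              # assumed (subsidized) annual premium

# Expected annual cost of each choice:
insured_cost = PREMIUM                       # pay the premium, losses are covered
uninsured_cost = P_HURRICANE * HOUSE_VALUE   # keep the premium, eat the tail risk

print(f"insured:   expected annual cost = {insured_cost:,.0f}")
print(f"uninsured: expected annual cost = {uninsured_cost:,.0f}")
# Rounding the 1/20 down to 0 makes the uninsured expected cost look like 0,
# which is exactly the friend's mistake.
```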
I would go further and note that I think this describes x-risks in general. The ordinary person might ‘refuse to bet’ about anything involving one bazillion dollars, and whine about being asked about such things or being expected to play at some odds for one bazillion dollars—well, that’s just too f—king bad, bucko, you don’t get to refuse to bet. You were born at the roulette table with the ball already spinning around the wheel and the dealer sitting with stacks of bazillion dollar chips. Nuclear weapons exist. Asteroids exist. Global pandemics exist. AGI is going to exist. You don’t get to pretend they neither do nor can exist and you can just ignore the ‘bets’ of investing or not investing in x-risk related things. You are required to take actions, or choose to not take actions: “red or black, monsieur?” Maybe you should not invest in them or worry about them, but that is, in fact, a choice.
curious why this is down-voted—any ideas?
I downvoted it because there’s obviously a real art of “disabling interfaces from which others may be able to exploit you” and that’s what OP is gesturing at. The answerer is unhelpfully dismissing the question in a way that I think is incorrect.
And I, of course, disagree with that, because I think the adversarial/game-theory framing is deeply unhelpful: it is both a different problem and a much more trivial, boring problem than the real one. In fact, it is exactly the sort of clever excuse people use to avoid dealing with any of the real issues while declaring victory over Pascal’s mugging. I rephrased it in a way where adversarial dynamics were obviously irrelevant to try to draw out that, if you think it’s just about ‘disabling interfaces others may exploit you with’, you have missed the mark.
The hurricane does not, and cannot, care what cute stories you tell about “I have to precommit to ignore such low absolute-magnitude possibilities for ~game-theoretic raisins~”. Your house is going to be flooded and destroyed, or it won’t be; do you buy the insurance, or not?
You can obviously modify these problems to arise from some clearly natural feature of the environment, so that it’s unlikely for your map to be the result of an adversarial opponent looking for holes in your risk assessment algorithm, at which point refusing to buy insurance out of fear of being exploited is stupid.
Alas, OP is talking about a different class of hypotheticals, so he requires a different answer. The correct response in the case quoted by OP, and the one he is alluding to, is that, given that you’re a human with faulty probabilistic reasoning faculties, you should rationally refuse weird trades where Dark Rationalists are likely to money-pump you. As a proof that you are a nonideal agent, Dutch book arguments are fine, but that’s as far as it goes, and sapient individuals have ways of getting around being a nonideal agent in adversarial environments without losing all their money. I find those means interesting and non-“trivial” even if you don’t, and apparently so does the OP.
https://www.lesswrong.com/posts/6XsZi9aFWdds8bWsy/is-there-any-discussion-on-avoiding-being-dutch-booked-or?commentId=h2ggSzhBLdPLKPsyG
No, I explained why it was stupid, it’s right there in the post, and you pretending you don’t see it is getting on my nerves. I said:
In other words, it’s stupid because a naturally produced hurricane or a rock falling down a cliff does not have a brain, and is unlikely to be manipulating you into doing something net negative, so you should just reason naturally. Humans have brains and goals in conflict with yours, so when humans come up to you, after hearing about your decision-making algorithm, asking to play weird bets, you may rationally ignore those offers on the principle that you don’t want to be tricked somehow.
You know this, I know you know I know you know this, I think you’re just being profoundly silly at this point.
The information security term is “limiting your attack surface”. In circumstances where you expect other bots to be friendly, you might be more open to unusual or strange inputs and compacts that are harder to check for exploits but seem net positive on the surface. In circumstances where you expect bots to be less friendly, you might limit your dealings to very simple, popular, and transparently safe interactions, and reject some potential deals that appear net-positive but are harder to verify. In picking a stance you have to make a tradeoff between being able to capitalize on actually good but ~novel/hard-to-model/dangerous trades and interactions, and being open to exploits; the human brain has a simple (though obviously not perfect) model for assessing the circumstances to see which stance is appropriate.
I think part of why we are so resistant to accepting the validity of Pascal’s muggings is that people see it as inappropriate to be so open to such a novel trade with complete strangers, cultists, or ideologues (labeled the ‘mugger’) who might not have our best interests in mind. But this doesn’t have anything to do with low-probability, extremely negative events being “ignorable”. If you change the scenario so that the ‘mugger’ is instead just a force of nature, unlikely to have landed on a glitch in your risk assessment cognition by chance, then it becomes a lot more ambiguous what you should actually do. Other people here seem to take the lesson of Pascal’s mugging as a reason against hedging against large negatives in general, to their own peril, which doesn’t seem correct to me.
If you apply the Solomonoff prior, amounts of money offered grow far faster than their probabilities decrease, because there are small programs that compute gigantic numbers. So a stipulation that the probabilities must decrease faster has that obstacle to face, and is wishful thinking anyway.
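(To illustrate the “small programs compute gigantic numbers” point: a few lines suffice to define Knuth’s up-arrow hierarchy, whose values dwarf any program-length penalty. The sketch below only evaluates tiny cases, since anything like 3^^^^3 is far beyond actually computing.)

```python
def up(a: int, b: int, arrows: int) -> int:
    """Knuth up-arrow: up(a, b, 1) = a**b; up(a, b, n) iterates the level below b times."""
    if arrows == 1:
        return a ** b
    result = 1
    for _ in range(b):
        result = up(a, result, arrows - 1)
    return result

print(up(3, 3, 1))  # 3^3  = 27
print(up(3, 3, 2))  # 3^^3 = 3^(3^3) = 7625597484987
print(up(2, 4, 2))  # 2^^4 = 65536
# up(3, 3, 4) would be 3^^^^3: the definition above is a handful of bytes,
# but the number it names is incomprehensibly larger than 2^(those few bytes).
```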
Perhaps a better response is to consider it from the viewpoint of some sort of timeless decision theory, together with game theory. If I am willing to pay the mugger, that means I have a policy of paying the mugger. If this is known to others, it leaves me open to anyone who has no gigantic amount to offer making the same offer and walking off with all my money. This is a losing strategy, therefore it is wrong.
There must be a mathematical formulation of this.
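(One toy formulation, nothing canonical: compare the expected payoff of the policy “always pay $5 to muggers” against “never pay”, under the assumption that a known payer attracts many fake muggers and, at best, a vanishingly rare real one.)

```python
# Toy model with assumed numbers: if word gets out that you pay,
# you attract a stream of fake muggers and almost never a real one.
PAYMENT = 5
N_FAKE_MUGGERS = 1_000     # assumed: fakes attracted by a known payer
P_REAL = 1e-30             # assumed: chance any given mugger can deliver
PAYOFF_IF_REAL = 3.0e9     # stand-in payout; the conclusion doesn't hinge on it

always_pay = N_FAKE_MUGGERS * (P_REAL * PAYOFF_IF_REAL - PAYMENT)
never_pay = 0.0

print(f"always-pay policy: {always_pay:,.2f}")  # about -5,000: a losing strategy
print(f"never-pay policy:  {never_pay:,.2f}")
```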
Is there a thing called “adversarial prior”?
Maybe there should be. I have an intuition that if the game theory is done right, the Solomonoff argument is neutralised. Who you will face in a game depends on your strategy for playing it. A mugger-payer will face false muggers. More generally, the world you encounter depends on your strategy for interacting with it. This is not just because your strategy determines what parts you look at, but also because the strategies of the agents you meet depend on yours, and yours on theirs, causally and acausally. The Solomonoff prior describes a pure observer who cannot act upon the world.
But this is just a vague gesture towards where a theory might be found.
There absolutely should be if there isn’t already. Would love to work with an actual mathematician on this....
You can dodge it by having a bounded utility function, or, if you’re a utilitarian and good, a utility function that is at most linear in anthropic experience.
If the mugger says “give me your wallet and I’ll cause you 3^^^^3 units of personal happiness” you can argue that’s impossible because your personal happiness doesn’t go that high.
If the mugger says “give me your wallet and I’ll cause 1 unit of happiness to 3^^^^3 people who you altruistically care about” you can say that, in the possible world where he’s telling the truth, there are 3^^^^3 + 1 people, only one of whom gets the offer while the others get the payout, so on priors it’s at least 1/3^^^^3 against for you to experience receiving an offer, and you should consider it proportionally unlikely.
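(Spelling out the cancellation in that argument, writing N for 3^^^^3 and u for the utility of 1 unit of happiness; a sketch of the comment’s reasoning, not a standard result:)

$$\mathbb{E}[\text{altruistic gain from paying}] \;\le\; \Pr(\text{story true}) \cdot \frac{1}{N+1} \cdot N u \;<\; \Pr(\text{story true}) \cdot u$$

So the astronomical payout buys at most about one person’s worth of expected happiness, further weighted by however improbable you already find the story.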
I don’t think people realise how astronomically more likely it is to truthfully be told “God created this paradise for you and your enormous circle of friends to reward an alien for giving him his wallet with zero valid reasoning whatsoever” than to be truthfully asked by that same Deity for your stuff in exchange for the distant unobservable happiness of countless strangers.
More generally, you can avoid most flavours of adversarial muggings with two rules: first, don’t make any trade that an ideal agent wouldn’t make (because that’s always some kind of money pump), and second, don’t make any trade that looks dumb. Not making trades can cost you in terms of missed opportunities, but you can’t adversarially exploit the trading strategy of a rock with “no deal” written on it.
Ah, well, there you go then.
Why is this? I’m not immediately seeing why this is necessarily the case.
You’re far more likely to be a background character than the protagonist in any given story, so a theory claiming you’re the most important person in a universe with an enormous number of people has an enormous rareness penalty to overcome before you should believe it rather than conclude that you’re just insane or being lied to. Being in a utilitarian high-leverage position for the lives of billions can be overcome by reasonable evidence, but for the lives of 3^^^^3 people the rareness penalty is basically impossible to overcome. Even if the story is true, most of the observers will be witnessing it from the position of tied-to-the-track, not holding the lever, so if you’d assign low prior expectation to being in the tied-to-the-track part of the story, you should assign an enormously lower one to being in the decision-making part of it.
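(One way to make the “rareness penalty” concrete, with assumed numbers: count the bits of evidence needed to overcome a 1-in-N prior of being the one holding the lever rather than one of the N people on the track.)

```python
import math

def bits_to_overcome(n_people: float) -> float:
    """Approximate bits of evidence needed to raise a 1-in-n prior to even odds."""
    return math.log2(n_people)

print(bits_to_overcome(8e9))   # ~33 bits: being pivotal for billions is overcome-able
print(bits_to_overcome(1e80))  # ~266 bits: already a stretch for any realistic evidence
# For 3^^^^3 people the required number of bits is itself astronomically huge;
# no stream of observations a human could have comes anywhere close.
```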
Sounds like you’re trying to argue from the anthropic principle that very important games are unlikely, but that’s some really fallacious reasoning that asserts a lot of things about what your utility function is like. “Protagonist” is a two-place word. A very pain-averse and unempathetic person might reasonably, subjectively, consider themselves the most important person in the universe, and assign negative ${a lot} points to them getting tortured to death, but that doesn’t mean they’re not getting tortured to death.
The a priori unlikelihood of finding oneself at the crux of history (or in a similarly rare situation) is a greatly underrated topic here, I suppose because it works corrosively against making any kind of special effort. If they had embraced a pseudo-anthropic expectation of personal mediocrity, the great achievers of history would presumably have gotten nowhere. And yet the world is also full of people who tried and failed, or who hold a mistaken idea of their own significance; something which is consistent with the rarity of great achievements. I’m not sure what the “rational” approach here might be.
I feel like my real rejection is less about it being huge-number (1-in-H) unlikely to get H utilons from a random person. The Solomonoff argument seems to hold up: there are many H such that H + [the code for a person who goes around granting utilons] is a lot shorter, as a program, than H is big.
My rejection is just… IDK how to affect that. I have literally no good reason to think that paying the mugger affects whether I get H utilons, and I make my decisions based on how they affect outcomes, not based on “this one possible consequence of this one action would be Huge”. I think this strongly argues that one should spend one’s time figuring out how to affect whether one gets Huge utilons, but that just seems correct?
Maybe there’s also a time-value argument here? Like, I have to keep my $5 for now, because IDK how to affect getting H utilons, but I expect that in the future I’ll be better at affecting getting H utilons, and therefore I should hang on to my resources so I’ll have opportunities to affect H utilons later.
If I do have good reason to expect that paying $5 gets me the H utilons more than not paying, it’s not a mugging, it’s a good trade. For humans, simply saying “If you do X for me I’ll do Y for you” is evidence of that statement being the case… but not if Y is Yuge. It doesn’t generalize like that (waves hands, but this feels right).