Reply To (Eliezer Yudkowsky): Pascal’s Muggle: Infinitesimal Priors and Strong Evidence
Inspired to Finally Write This By (Lesser Wrong): Against the Linear Utility Hypothesis and the Leverage Penalty.
The problem of Pascal’s Muggle begins:
Suppose a poorly-dressed street person asks you for five dollars in exchange for doing a googolplex’s worth of good using his Matrix Lord powers.
“Well,” you reply, “I think it very improbable that I would be able to affect so many people through my own, personal actions – who am I to have such a great impact upon events? Indeed, I think the probability is somewhere around one over googolplex, maybe a bit less. So no, I won’t pay five dollars – it is unthinkably improbable that I could do so much good!”
“I see,” says the Mugger.
At this point, I note two things. I am not paying. And my probability that the mugger is a Matrix Lord is much higher than five in a googolplex.
That looks like a contradiction. It’s positive expectation to pay, by a lot, and I’m not paying.
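To make the apparent contradiction concrete, here is a minimal sketch of the naive expected-value comparison. It works in base-10 logs because a googolplex (10^(10^100)) overflows any float, while its logarithm (a googol) does not. The 1e-30 credence and the assumption that payoff and cost are measured in the same units are illustrative placeholders, not figures from the story.

```python
import math

# log10 of a googolplex is a googol, which fits in a float.
LOG10_GOOGOLPLEX = 1e100


def naive_pay_is_positive_ev(log10_prob, log10_payoff, cost):
    """Return True if prob * payoff > cost, compared via base-10 logs.

    Assumes (hypothetically) that payoff and cost share one unit of value.
    """
    return log10_prob + log10_payoff > math.log10(cost)


# Hypothetical credence of 1e-30 that the mugger is telling the truth.
print(naive_pay_is_positive_ev(-30.0, LOG10_GOOGOLPLEX, 5.0))

# A credence of one in a googolplex is what it takes to cancel the payoff.
print(naive_pay_is_positive_ev(-LOG10_GOOGOLPLEX, LOG10_GOOGOLPLEX, 5.0))
```

Under these assumptions, any credence that fits on a page leaves the naive product above five dollars, which is the tension the rest of the post tries to resolve.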
Let’s continue the original story.
A wind begins to blow about the alley, whipping the Mugger’s loose clothes about him as they shift from ill-fitting shirt and jeans into robes of infinite blackness, within whose depths tiny galaxies and stranger things seem to twinkle. In the sky above, a gap edged by blue fire opens with a horrendous tearing sound – you can hear people on the nearby street yelling in sudden shock and terror, implying that they can see it too – and displays the image of the Mugger himself, wearing the same robes that now adorn his body, seated before a keyboard and a monitor.
“That’s not actually me,” the Mugger says, “just a conceptual representation, but I don’t want to drive you insane. Now give me those five dollars, and I’ll save a googolplex lives, just as promised. It’s easy enough for me, given the computing power my home universe offers. As for why I’m doing this, there’s an ancient debate in philosophy among my people – something about how we ought to sum our expected utilities – and I mean to use the video of this event to make a point at the next decision theory conference I attend. Now will you give me the five dollars, or not?”
“Mm… no,” you reply.
“No?” says the Mugger. “I understood earlier when you didn’t want to give a random street person five dollars based on a wild story with no evidence behind it. But now I’ve offered you evidence.”
“Unfortunately, you haven’t offered me enough evidence,” you explain.
I’m paying.
So are you.
What changed?
I
The probability of Matrix Lord went up, but the odds were already there, and he’s probably not a Matrix Lord (I’m probably dreaming or hypnotized or nuts or something).
At first the mugger could benefit by lying to you. More importantly, people other than the mugger could benefit by trying to mug you and others who reason like you, if you pay such muggers. They can exploit your willingness to take large claims seriously.
Now the mugger cannot benefit by lying to you. Matrix Lord or not, there’s a cost to doing what he just did and it’s higher than five bucks. He can extract as many dollars as he wants in any number of ways. A decision function that pays the mugger need not create opportunity for others.
In theory Matrix Lord could derive some benefit like having data at the decision theory conference, or a bet with another Matrix Lord, and be lying. Sure. But if I’m even 99.999999999% confident this isn’t for real, that seems nuts.
(Also, he could have gone for way more than five bucks. I pay.)
(Also, this guy gave me way more than five dollars worth of entertainment. I pay.)
(Also, this guy gave me way more than five dollars worth of good story. I pay.)
II
The leverage penalty is a crude hack. Our utility function is given, so our probability function had to move or the Shut Up and Multiply would do crazy things like pay muggers.
The way out is our decision algorithm. As per Logical Decision Theory, our decision algorithm is correlated to lots of things, including the probability of muggers approaching you on the street and what benefits they offer. The reason real muggers use a gun rather than a banana is mostly that you’re far less likely to hand cash over to someone holding a banana. The fact that we pay muggers holding guns is why muggers hold guns. If we paid muggers holding bananas, muggers would happily point bananas.
There is a natural tendency to slip out of Functional Decision Theory into Causal Decision Theory. If I give this guy five dollars, how often will it save all these lives? If I give five dollars to this charity, what will that marginal dollar be spent on?
There’s a tendency for some, often economists or philosophers, to go all lawful stupid about expected utility and berate us for not making this slip. They yell at us for voting, and/or ask us to justify not living in a van down by the river on microwaved ramen noodles in terms of our expected additional future earnings from our resulting increased motivation and the networking effects of increased social status.
To them, we must reply: We are choosing the logical output of our decision function, which changes the probability that we’re voting on reasonable candidates, changes the probability there will be mysterious funding shortfalls with concrete actions that won’t otherwise get taken, changes the probability of attempted armed robbery by banana, and changes the probability of random people in the street claiming to be Matrix Lords. It also changes lots of other things that may or may not seem related to the current decision.
Eliezer points out humans have bounded computing power, which does weird things to one’s probabilities, especially for things that can’t happen. Agreed, but you can defend yourself without making sure you never consider benefits multiplied by 3↑↑↑3 without also dividing by 3↑↑↑3. You can have a logical algorithm that says not to treat differently claims of 3↑↑↑3 and 3↑↑↑↑3 if the justification for that number is someone telling you about it. Not because the first claim is so much less improbable, but because you don’t want to get hacked in this way. That’s way more important than the chance of meeting a Matrix Lord.
Betting on your beliefs is a great way to improve and clarify your beliefs, but you must think like a trader. There’s a reason logical induction relies on markets. If you book bets on your beliefs at your fair odds without updating, you will get dutch booked. Your decision algorithm should not accept all such bets!
People are hard to dutch book.
Status quo bias can be thought of as evolution’s solution to not getting dutch booked.
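For readers new to the term, a Dutch book is a set of bets, each acceptable at the bettor's own stated odds, that loses money however events turn out. The sketch below shows the textbook incoherent-credence case, with 0.6 and 0.6 as made-up numbers. It is a simpler cousin of the situation described above, where a counterparty offers only the bets it expects to win, and is not meant as a model of that situation.

```python
def net_result(price_a, price_not_a, stake=1.0):
    """Net result of buying one bet on A and one bet on not-A.

    Each bet pays `stake` if it wins and costs price * stake, which is
    the 'fair' price for an agent whose credence equals `price`.
    Exactly one of the two bets pays out, so the result is the same
    whether or not A happens.
    """
    total_cost = (price_a + price_not_a) * stake
    payout = stake
    return payout - total_cost


# Hypothetical agent with credence 0.6 in A and 0.6 in not-A (sum 1.2).
print(net_result(0.6, 0.6))  # negative: a sure loss

# Coherent credences (sum 1.0) leave nothing to exploit this way.
print(net_result(0.5, 0.5))
```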
III
Split the leverage penalty into two parts.
The first is ‘don’t reward saying larger numbers’. Where are these numbers coming from? If the numbers come from math we can check, and we’re offered the chance to save 20,000 birds, we can care much more than about 2,000 birds. A guy designing pamphlets picking arbitrary numbers, not so much.
Scope insensitivity can be thought of as evolution’s solution to not getting Pascal’s mugged. The one child is real. Ten thousand might not be. Both scope insensitivity and probabilistic scope sensitivity get you dutch booked.
Scope insensitivity and status quo bias cause big mistakes. We must fight them, but by doing so we make ourselves vulnerable.
You also have to worry about fooling yourself. You don’t want to give your own brain reason to cook the books. There’s an elephant in there. If you give it reason to, it can write down larger exponents.
The second part is applying Bayes’ Rule properly. Likelihood ratios for seeming high leverage are usually large. Discount accordingly. How much is a hard problem. I won’t go into detail here, except to say that if calculating a bigger impact doesn’t increase how excited you are about an opportunity, you are doing it wrong.
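The mechanics here are Bayes' rule in odds form: posterior odds equal prior odds times the likelihood ratio. A minimal sketch in log10 odds follows. Every number in it is a made-up placeholder, since how much to discount is exactly the hard part left open above.

```python
import math


def update_log10_odds(prior_log10_odds, likelihood_ratios):
    """Bayes' rule in odds form: add log10 likelihood ratios to the prior."""
    return prior_log10_odds + sum(math.log10(lr) for lr in likelihood_ratios)


def probability_from_log10_odds(log10_odds):
    """Convert log10 odds back into a probability."""
    odds = 10.0 ** log10_odds
    return odds / (1.0 + odds)


# Hypothetical: prior odds of 1:1e9 against a high-leverage claim,
# then a weak observation (ratio 10) and a strong one (ratio 1e6).
posterior = update_log10_odds(-9.0, [10.0, 1e6])
print(posterior)
print(probability_from_log10_odds(posterior))
```

Working in logs keeps very long odds representable and turns each new piece of evidence into a simple addition.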
This could be true, but I don’t think it is and I think the difference is probably important.
Concern A: I’m just really skeptical that the ancestral environment gave people opportunity to be confronted with Big Numbers that they needed to avoid getting mugged over.
Concern B: Scope insensitivity doesn’t kick in at 1000 people. It kicks in at… (drumroll)… two people. (i.e. we have studies where advertisements for donations for a single child make more money than advertisements for 2 children or 8)
This suggests something else is going on than simply numbers getting too big to count, and I think it’s more likely to have to do with how strong a connection you feel with the recipient. (I notice I’m confused about why this should matter. My first guesses have something to do with how likely the recipient is to return the favor in some fashion.)
My response would be that Concerns A and B answer each other. Scope insensitivity kicks in at two people, and two people happened in the ancestral environment all the time. The effect extending to a thousand is what happens when you implement a general solution.
Not that I think this is a complete list of benefits (or costs) of such a policy...
This sounds more like the handiwork of evolution to me.
Unless I’m missing something, the trouble with this is that, absent a leverage penalty, all of the reasons you’ve listed for not having a muggable decision algorithm… drumroll… center on the real world, which, absent a leverage penalty, is vastly outweighed by tiny probabilities of googolplexes and ackermann numbers of utilons. If you don’t already consider the Mugger’s claim to be vastly improbable, then all the considerations of “But if I logically decide to let myself be mugged that retrologically increases his probability of lying” or “If I let myself be mugged this real-world scenario will be repeated many times” are vastly outweighed by the tiny probability that the Mugger is telling the truth.
I thought for a while about the best way to formalize what I’m thinking in a way that works here.
I do observe that I keep getting tempted by “hey, there’s an obvious leverage penalty here” in the form of “this can only happen for real one time in X, where X is the number of lives saved times the percent of people who agree,” because of the details of the mugging. Or alternatively, the maximum total impact people together can expect to have in paying off things like this (before the Mugger shows up) seems reasonably capped at one life per person, so our collective real-world decisions clearly matter far more than that very-much-not-hard upper bound.
I think that points to the answer I like most, which is that my reasons aren’t tied only to the real world. They’re also tied to the actions of other logically correlated agents, which includes the people on other worlds that I’m possibly being offered the chance to save, and also the Matrix lords. I mean, we’d hate to have bad results presented at the Matrix Lord decision theory conference he’s doing research for, that seems rather important. Science. My decision reaches into the origin worlds of the Matrix Lords, and if they spent all their days going around Mugging each other with lies about Matrix Lords during their industrial period, it’s doubtful the Matrix gets constructed at all. Then I wouldn’t even exist to pay the guy. It reaches into all the other worlds in the Matrix, which have to make the same decisions I do, and we can’t all be given this opportunity to win ackermann numbers of utilons.
I think I don’t need to do that, and I can get out of this simply by saying that before I see the Mugger I have to view potential genuine Muggers offering googolplexes of utility without evidence as not a large source of potential wins (e.g. at this point my probability really is that low). I mean, assuming the potential Mugger hasn’t seen any of these discussions (I do think there’s at least a one in ten chance someone tries this Mugging on me within the year, for the fun of it, and it’s probably a favorite), the likelihood ratio of someone trying this even with no evidence is really high. But it’s not that high, it’s not googolplex high. So now we get back to the question of infinitesimal priors and how to react to evidence, and here I think we’re close but have a subtle disagreement about how to handle the question...
(I encourage others reading this to go back and read/remember Eliezer’s response to the Mugger, where he explains that he is allowed to be logically inconsistent due to incomplete computing power, which basically I agree with.)
I take the view that the fact that someone is making a claim at all is good enough to pop me out of my 1:3↑↑↑↑3 level prior, and into numbers that can be expressed reasonably with only one up arrow, and allows me to be inconsistent, simply because (I believe that) someone is saying it at all. Then the sky rift does it again, and I can get us into no-exponents-needed range. But I chose how I make decisions before the statement was made, and I did that without attaching much value to the “actual Matrix lord” branch of utility while putting a lot of utility on the “people might be crazy or lie to me” branch. So I’ve shifted my probability a lot but I still won’t pay until I see the rift. This feels potentially important, that I can and must be inconsistent in my probabilities in response to evidence, due to limited compute, but I shouldn’t throw out the decision algorithm (or at least, not unless this points to a logical mistake in it, or otherwise justifies doing that in similar fashion).
So one answer is that I’m putting my logical inconsistency after I choose my decision algorithm, so the fact that I’m then acting in a non-optimal way in reaction to the new utilon math is justified. The second answer is that if you’re allowed to offer probabilistic off-world utilon considerations then so am I, and mine (through correlated decision functions) still win both in the Matrix and non-Matrix cases. The third answer, as noted in III, is that Bayes’ rule and using logic often does imply something that’s effectively a lot like a leverage penalty in many cases, due to the nature of the propositions, although I think it’s quite reasonable to think that in general most people could in fact find ways to have quite oversize (and possibly not very bounded) impact if they put their minds to it. (Hey, I’m trying, guys.)
(Note to answer other comments I’ve seen: Yes, I can also get out of this via either risk aversion or bounded utility, or similar concepts, if I’m willing to use them. Granted. I’m exploring the case where my utility isn’t bounded, or it’s bounded stupidly large, partly because I think it shouldn’t be very bounded, and partly because it’s important for AGI cases.)
Why? If humans don’t have unbounded utility functions, then presumably we wouldn’t want our AIs to have unbounded utility functions either.
If I build an AGI and get to choose its utility function, I could choose to copy my own (or what mine would be under reflection, a personal CEV), and that’s far from the worst outcome, but as a group we have solutions that we prefer a lot more and that don’t prioritize me overly much, such as CEV. The CEV of an effectively unbounded group of potential agents (yes, laws of physics bounded, but I assume we both agree that’s not a bound small enough to matter here) is effectively unbounded even if each individual agent’s function is tightly bounded.
This should (I think) make intuitive sense; a group of people who individually want basic human-level stuff discover the best way to do that is to coordinate to build a great nation/legacy/civilization, and break the bound on what they care about. The jumps we’re considering don’t seem different in kind to that.
People who don’t exist until after an AGI is created don’t have much influence over how that AGI is designed, and I don’t see any need to make concessions to them (except for the fact that we care about their preferences being satisfied, of course, but that will be reflected in our utility functions).
If you already care about a legacy as a human and you do something to make an advanced computer system aligned with you, then the advanced computer system should also care about legacy. I don’t see anything as being lost.
I find attempts to reify the legacy as anything more than a shared agreement between near peers deeply disturbing, mainly because our understanding of nature, humans and reality consists of tentative and leaky abstractions, and will remain that way for the foreseeable future. Any reification of civilisation will need a method of revising it in some way.
So much will be lost if humans are no longer capable of being active participants in the revision of the greater system. Conversations like this would be pointless. I have more to say on this subject; I’ll try to write something in a bit.
Great post. Fantastic. Dutch booking, in particular, is fascinating — first time I heard it.
On a related note, I think there’s a particular type of error that smart people are actually more prone to make — discounting variance.
If I said,
“There’s some very real situations where taking a 99% chance of good outcome X produces better outcomes than taking a seeming 1% chance of 1000x” — nobody would disagree with that, it’s obvious on the surface. There’s secondary effects, morale effects, future resource allocation judgements, etc.
But suddenly, if you say,
“There’s some very real situations where a 99% chance of saving a life produces better outcomes than taking a seeming 1% chance of saving 1000 lives” – some people don’t accept that this is ever true when you are certain your probabilistic estimates are correct.
Variance is a real thing. It matters. A lot. There’s many situations where a ‘W’ of any size helps get you to the next stage, whereas a “I followed good process but it didn’t work” does not — even among the most stoic, far-sighted, analytical, and rational of people.
The $5 for (googolplex of gains)*(likelihood of the googolplex outcome) has a terribly large amount of variance built into it. Sure, $5 might be below the “screw it” threshold, but as soon as you substitute a number you can do something significantly large with, say $10,000, you start to recognize why you wouldn’t take the bet even if the EV is great on it. All the upside is caught up in the vanishingly small percentage of times that this highly unlikely positive event happens, and there’s a lot of much lower variance activities that produce good outcomes.
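As a rough illustration of the mean-versus-variance point, here is a small sketch comparing the two gambles from the example above: a 99% chance of saving one life against a 1% chance of saving 1000. Treating lives saved as the unit of value and the stated probabilities as exact are simplifying assumptions.

```python
def mean_and_variance(outcomes):
    """Mean and variance of a gamble given as (probability, value) pairs.

    Assumes the probabilities in `outcomes` sum to 1.
    """
    mean = sum(p * v for p, v in outcomes)
    variance = sum(p * (v - mean) ** 2 for p, v in outcomes)
    return mean, variance


safe = [(0.99, 1.0), (0.01, 0.0)]          # 99% chance of saving 1 life
long_shot = [(0.01, 1000.0), (0.99, 0.0)]  # 1% chance of saving 1000 lives

print(mean_and_variance(safe))
print(mean_and_variance(long_shot))
```

The long shot has roughly ten times the expected value but about a million times the variance, which is the trade-off being pointed at here.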
If you were the sort of person who would pay Pascal’s mugger, you would still not get solicited for money by people pretending to control the fates of 3↑↑↑3 people, because people don’t do this in real life, and no one would ever discover that it would work on you. I could have written a LessWrong post arguing that it is rational to pay Pascal’s mugger and that I would pay Pascal’s mugger if the scenario ever actually happened to me, and I still probably wouldn’t get Pascal-mugged because, of the people who’d read it, none would both want to take advantage of me for $5 and be able to get to me for a cost of less than $5. If you were worried about facing more false Pascal’s muggers as a result of being the sort of person who pays Pascal’s mugger, but you would actually pay Pascal’s mugger if not for this concern, then you’d be more likely to come up with some sort of compromise solution, like paying Pascal’s mugger unless they ask for too much or unless you’ve faced too many of them, rather than just not paying Pascal’s mugger. And of course, if you really treated the fates of 3↑↑↑3 people as 3↑↑↑3 times as important as the fate of one person, you’d just keep paying all these Pascal’s muggers who figured out it works on you anyway, and if you ran out of money, you’d try to get more money to keep paying them by whatever means you can with the desperation of a broke heroin addict going through withdrawal, because the unimaginably tiny chance that one of these Pascal’s muggers is actually legit would still be enough to outweigh the fact that you’re attracting false Pascal’s muggers to you by acting that way.
This is *really* good.
>They yell at us for voting, and/or asking us to justify not living in a van down by the river on microwaved ramen noodles in terms of our expected additional future earnings from our resulting increased motivation and the networking effects of increased social status.
Could I ask how FDT justifies not living in a van down by the river?
This feels sort of minor, but for your second link you should really credit the writer (AlexMennen), not the site it’s published on.
Minor but correct, fixed in original (which should update this one as well).
As I said in the other thread:
Realising that your utility function is bounded is sufficient to reject Pascal’s mugging.
That said, I agree with you.
Wonderful post! As I mentioned in the comments on Against the Linear Utility Hypothesis and the Leverage Penalty I am in the process of writing a reply to this as well, and about half of what I had planned to write was this. Thank you for writing this up better than I would have!
I would like to add that not paying the mugger before the rift in the sky, but paying him afterwards, might not be solely a result of the decision theory (although it’s certainly very related): there is also something interesting going on with the probabilities. It is a priori pretty likely that given that a poorly-dressed person walks up to you on the street, you are about to hear some outlandish claim and a request for money. Therefore your updated probability of this being a random mugger instead of a Matrix Lord is still pretty high even after hearing the offer. However, this probability shrinks to almost nothing after observing the reality-breaking sky rift, and how close to nothing it gets depends directly on the absurdity of the event witnessed.
In practice I would just pony up the money at the end of your scenario, for the same reasons as you give. So I guess my true rejection to Pascal’s Mugging is the one you give. I think I’ll still write out the other half of my ideas though, for agents with slightly more computing power than we have.
See this.
I think AlexMennen nailed it on the head—our utility is bounded and at a low number.