Down with Solomonoff Induction, up with the Presumptuous Philosopher
Followup to “The Presumptuous Philosopher, self-locating information, and Solomonoff induction”
In the comments last time, everyone seemed pretty content to ditch the Presumptuous Philosopher (who said that the universe with a trillion times as many copies of you was a trillion times as likely) and go with Solomonoff induction (which says that the universe with a trillion times as many copies of you is only a few times as likely). To rally some support for the Presumptuous Philosopher, here’s a decision game for you.
The Game:
If you start out with a universe that has only one of you, and then Omega comes along and copies you, you now need an extra bit to say which you you are. If Omega has flipped a coin (or consulted a high-entropy bit string) to decide whether to copy you, you now think there is a 50% chance you’re un-copied and a 25% chance of being each of the two copies (if you’re not a Solomonoff inductor, pretend to be one for now).
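To make the bit-counting concrete, here is a minimal sketch (my own illustration, not anything from the original post) of the weighting it implies: each centered hypothesis pays 2^-(description length), with one bit for Omega’s coin and, in the copy-universe, one more bit for which copy you are.

```python
# Toy illustration of the description-length counting above (not a real
# Solomonoff inductor): each centered hypothesis gets weight 2^-(bits).

def weight(bits: int) -> float:
    return 2.0 ** -bits

hypotheses = {
    "no-copy you":            weight(1),      # one bit for Omega's coin
    "copy-universe, copy #1": weight(1 + 1),  # coin bit + which-copy bit
    "copy-universe, copy #2": weight(1 + 1),
}

total = sum(hypotheses.values())
for name, w in hypotheses.items():
    print(f"{name}: {w / total:.2f}")
# -> 0.50 / 0.25 / 0.25, the halfer-style split described above.
```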
So far, this seems, although not obvious, at least familiar. It’s the “halfer” position in Sleeping Beauty, which you might also call Bostrom’s “SSA” position. It’s not widely accepted because of frequentist / pragmatic arguments like “if you did this experiment a bunch of times, each person would converge to a frequency of 2⁄3 copied and 1⁄3 non-copied” and abstract arguments about symmetry of information. But of course Solomonoff induction would reply that there’s no symmetry between the different hypotheses about the universe, and so it’s okay to make an asymmetrical assumption under which the more you’re copied, the less your universe matters to the average.
But suppose now that Omega, in addition to maybe copying you at t_0, comes along and shoots the copy (or a deterministically chosen one of the two yous) at t_1. After this happens, you no longer need that extra bit to describe which you is you, and so the probabilities go back to being the same for the two hypotheses about the universe and Omega’s coin.
So here’s what Omega does. He has all of you play for chocolate bars, or some other short-term satisfying experience. After t_0, you get offered this deal: if you accept, the no-copy you will get 2 chocolate bars and both copy-yous will lose 1.9 chocolate bars. Since you think you’re more likely to be the no-copy you (probability 1/2) than either particular copy-you (1/4 each), the deal has positive expected chocolate, and you accept.
Next, still before t_1, Omega offers a bonus for transferring chocolate between the two copies. You all get offered a deal where the copy that will happen to survive loses 0.5 chocolate bars and the copy that will get shot gains 1 chocolate bar. The deal does nothing to the no-copy you; conditional on being a copy, you’re not sure which one you are but think the two are equally likely, and the transfer doubles the chocolate moved, so you accept.
Then, at t_1, Omega deterministically shoots one copy but still doesn’t tell the survivor that they’re in the copy-universe. The two of you (one per universe) are offered a deal to lose 2 chocolate bars if you’re in the no-copy universe, but gain 2.1 if you’re in the copy universe. Since this is more chocolate among equally likely possibilities, you accept a third time.
You have broken even on chocolate in the no-copy universe, the you that died in the copy universe paid 0.9 chocolate bars, and the you that survived in the copy universe paid 0.3 chocolate bars. You have now been Dutch-process booked.
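As a sanity check on the arithmetic, here is a small sketch of my own that just plugs the post’s numbers into the stated beliefs:

```python
# Expected value of each deal under the beliefs described in the post,
# followed by the net payoff in each possible branch.

# Deal 1, after t_0: beliefs {no-copy: 1/2, copy #1: 1/4, copy #2: 1/4}.
ev_deal_1 = 0.5 * 2 + 0.25 * (-1.9) + 0.25 * (-1.9)   # +0.05 -> accept

# Deal 2, before t_1, conditional on being a copy (each equally likely).
ev_deal_2 = 0.5 * (-0.5) + 0.5 * 1.0                  # +0.25 -> accept

# Deal 3, after the shooting: beliefs allegedly back to
# {no-copy: 1/2, surviving copy: 1/2}.
ev_deal_3 = 0.5 * (-2) + 0.5 * 2.1                    # +0.05 -> accept

# Net chocolate in each branch after all three deals:
no_copy_you     = 2 - 2                 #  0.0 (break even)
copy_that_died  = -1.9 + 1              # -0.9
copy_that_lived = -1.9 - 0.5 + 2.1      # -0.3

print(ev_deal_1, ev_deal_2, ev_deal_3)
print(no_copy_you, copy_that_died, copy_that_lived)
```

Every deal looks favorable in expectation on its own, yet no branch ends up ahead and two end up behind.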
If we treat the hypotheses of Solomonoff induction as if they are descriptions of physical universes (specifically, things with a property of continuity), then Solomonoff induction seems to be doing something wrong. And if we don’t treat Turing machines as specifications of physical universes, then this removes much of the bite to applying Solomonoff induction in the Presumptuous Philosopher problem.
I didn’t follow all of that, but your probability count after shooting seems wrong. IIUC, you claim that the probabilities go from {1/2 uncopied, 1⁄4 copy #1, 1⁄4 copy #2} to {1/2 uncopied, 1⁄2 surviving copy}. This is not right. The hypotheses considered in Solomonoff induction are supposed to describe your entire history of subjective experiences. Some hypotheses are going to produce histories in which you get shot. Updating on not being shot is just a Bayesian update, it doesn’t change the complexity count. Therefore, the resulting probabilities are {2/3 uncopied, 1⁄3 surviving copy}.
This glosses over what Solomonoff induction thinks you will experience if you do get shot, which requires a formalism for embedded agency to treat properly (in which case the answer becomes, ofc, that you don’t experience anything), but the counting principle remains the same.
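To spell out the arithmetic in that count (my sketch, not the commenter’s code): start from the halfer-style prior over centered hypotheses, strike out the ones in which you got shot, and renormalize.

```python
# Conditioning the halfer-style prior on "I was not shot" and renormalizing;
# no description lengths change, it's an ordinary Bayesian update.
prior = {"no-copy you": 0.5, "surviving copy": 0.25, "shot copy": 0.25}

alive = {h: p for h, p in prior.items() if h != "shot copy"}
norm = sum(alive.values())
posterior = {h: p / norm for h, p in alive.items()}

print(posterior)   # no-copy you ~ 2/3, surviving copy ~ 1/3
```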
Hm. I agree that this seems reasonable. But how do you square this with what happens if you locate yourself in a physical hypothesis by some property of yourself? Then it seems straightforward that when there are two things that match the property, they need a bit to distinguish the two results. And the converse of this is that when there’s only one thing that matches, it doesn’t need a bit to distinguish possible results.
I think it’s very possible that I’m sneaking in an anthropic assumption that breaks the property of Bayesian updating. For example, if you do get shot, then Solomonoff induction is going to expect the continuation to look like incomprehensible noise that corresponds to the bridging law that worked really well so far. But if you make an anthropic assumption and ask for the continuation that is still a person, you’ll get something like “waking up in the hospital after miraculously surviving” that has experienced a “mysterious” drop in probability relative to just before getting shot.
The fact that one copy gets shot doesn’t mean that “there’s only one thing that matches”. In spacetime the copy that got shot still exists. You do have hypotheses of the form “locate a person that still lives after time t and track their history to the beginning and forward to the future”, but those hypotheses are suppressed by a factor of 2^{-K(t)}.
I’ve written a post that argues that Solomonoff Induction actually is a thirder, not a halfer, and sketches an explanation.
https://www.lesswrong.com/posts/Jqwb7vEqEFyC6sLLG/solomonoff-induction-and-sleeping-beauty
This example seems a little unfair on Solomonoff Induction, which after all is only supposed to predict future sensory input, not answer decision theory problems. To get it to behave as in the post, you need to make some unstated assumptions about the utility functions of the agents in question (e.g. why do they care about other copies and universes? AIXI, the most natural agent defined in terms of Solomonoff induction, wouldn’t behave like that).
It seems that in general, anthropic reasoning and decision theory end up becoming unavoidably intertwined (e.g.), and we still don’t have a great solution.
I favor Solomonoff induction as the solution to (epistemic) anthropic problems because it seems like any other approach ends up believing crazy things in mathematical (or infinite) universes. It also solves other problems like the Born rule ‘for free’, and of course induction from sense data generally. This doesn’t mean it’s infallible, but it inclines me to update towards S.I.’s answer on questions I’m unsure about, since it gets so much other stuff right while being very simple to express mathematically.
I think that the problem here is that you still need info to distinguish yourself from shot-you. Consider a universe that contains one copy of every possible thing. In this universe, the information to locate you is identical to the information to describe you. In this case, describing you includes describing every memory you have. But if you have memories of reading a physics textbook and then doing some experiments, then the shortest description of your memories is going to include a description of the laws of physics. The “one copy of everything” theory is a bad theory compared to physics.
If you have a simple physical law that predicts 1 human surrounded by 3^^^3 paperclips, then locating yourself is easy. Many simple algorithms, like looking for places where hydrogen is more abundant than iron, will locate you. In this universe, if Omega duplicates you, it’s twice as hard to point to each copy. And if he shoots one copy, it’s still twice as hard to point to the other copy (the search for things that aren’t paperclips comes up with 2 items). In this universe, you would reject the third part of the deal.
Suppose instead that the universe was filled with interesting stuff, in the sense that the shortest way to point to you was coordinates. Being duplicated gives 2 places to point to, so you expect that you were duplicated with probability 2/3. Once one copy is shot, you expect that your probability of being the surviving copy is 1/2. In this universe you would reject the first part of the deal.
In both cases you perform a Bayesian update based on the fact that you are still alive.
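Here is a rough sketch (my own numbers, combining the post’s deals with the two pointing schemes described above) of which leg of the bargain each universe refuses:

```python
# Paperclip universe: a copy stays twice as hard to point to even after the
# shooting, so the prior is {1/2, 1/4, 1/4} and surviving is a plain update
# to {2/3, 1/3}.
ev_deal_1_paperclips = 0.5 * 2 + 0.25 * (-1.9) + 0.25 * (-1.9)  # +0.05 accept
ev_deal_3_paperclips = (2 / 3) * (-2) + (1 / 3) * 2.1           # -0.63 reject

# Coordinate universe: duplication makes two equally cheap places to point,
# so the prior is {1/3, 1/3, 1/3}; after the shooting it's {1/2, 1/2}.
ev_deal_1_coords = (1 / 3) * 2 + (2 / 3) * (-1.9)               # -0.60 reject
ev_deal_3_coords = 0.5 * (-2) + 0.5 * 2.1                       # +0.05 accept

print(ev_deal_1_paperclips, ev_deal_3_paperclips)
print(ev_deal_1_coords, ev_deal_3_coords)
```

On either reading at least one leg of the bargain gets refused, so there is no longer a guaranteed loss in every branch.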
Yeah, this implicitly depends on how you’re being pointed to. I’m assuming that you’re being pointed to as “person who has had history [my history]”. In which case you don’t need to worry about doing extra work to distinguish yourself from shot-you once your histories (ballistically) diverge. (EDIT: I think. This is weird.)
If you’re pointed to with coordinates, it’s more likely that copies are treated as not adding any complexity. But this is disanalogous to the Presumptuous Philosopher problem, so I’m avoiding that case.
You do need to distinguish; it’s part of your history. If you are including your entire history as uncompressed sensory data, that will contain massive redundancy. The universe does not contain all possible beings in equal proportions. Imagine being the only person in an otherwise empty universe, very easy to point to. Now imagine that Omega makes 10^100 copies and tells each copy their own 100-digit ID number. It takes 100 digits more complexity to point to any one person. The process of making the copies makes each copy harder to point to. The process of telling them ID numbers doesn’t change the complexity. You only have 2 copies, with IDs of “shot” and “not shot”.
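For concreteness, a quick check (my own) of the extra pointing cost in that example:

```python
import math

copies = 10 ** 100
extra_bits = math.log2(copies)     # ~332.2 bits to say which copy you are
extra_digits = math.log10(copies)  # exactly the 100-digit ID number

print(round(extra_bits, 1), extra_digits)
# Making the copies is what adds the ~100 digits of pointing cost;
# handing out the ID numbers afterwards adds nothing new.
```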
Right, but suppose that everybody knows beforehand that Omega is going to preserve copy number 0 only (or that it’s otherwise a consequence of things you already know).
This “pays in advance” for the complexity of the number of the survivor. After t_1, it’s not like they’ve been exposed to anything they would be surprised about if they were never copied.
Ah, wait, is that the resolution? If there’s a known designated survivor, are they required to be simple to specify even before t_1, when they’re “mixed in” with the rest?
The centermost person and the person numbered 0 are simple to specify beforehand.
Given that you know what’s going on in the rest of the universe, the one that doesn’t get shot is also simple to specify.