It seems everyone who commented so far isn’t interested in copies at all, under the conditions stipulated (identical and non-interacting). I’m not interested myself. If anyone is interested, could you tell us about it? Thanks.
I would place positive value on extra copies, as an extension of the finding that it is better to be alive than not. (Of course, I subscribe to the pattern philosophy of identity—those who subscribe to the thread philosophy of identity presumably won’t consider this line of reasoning valid.)
How much I would be willing to pay per copy, I don’t know, it depends on too many other unspecified factors. But it would be greater than zero.
In your pattern philosophy of identity, what counts as a pattern? In particular, a simulation or our world (of the kind we are likely to run) doesn’t contain all the information needed to map it to our (simulating) world. Some of the information that describes this mapping resides in the brains of those who look at and interpret the simulation.
It’s not obvious to me that there couldn’t be equally valid mappings from the same simulation to different worlds, and perhaps in such a different world is a copy of you being tortured. Or perhaps there is a mapping of our own world to itself that would produce such a thing.
Is there some sort of result that says this is very improbable given sufficiently complex patterns, or something of the kind, that you rely on?
Yes, Solomonoff’s Lightsaber: the usual interpretations need much shorter decoder programs.
Why? How do we know this?
Know in what sense? If you’re asking for a formal proof, of course there isn’t one because Kolmogorov complexity is incomputable. But if you take a radically skeptical position about that, you have no basis for using induction at all, which in turn means you have no basis for believing you know anything whatsoever; Solomonoff’s lightsaber is the only logical justification anyone has ever come up with for using experience as a guide instead of just acting entirely at random.
I’m not arguing with Solomonoff as a means for learning and understanding the world. But when we’re talking about patterns representing selves, the issue isn’t just to identify the patterns represented and the complexity of their interpretation, but also to assign utility to these patterns.
Suppose that I’m choosing whether to run a new simulation. It will have a simple (‘default’) interpretation, which I have identified, and which has positive utility to me. It also has alternative interpretations, whose decoder complexities are much higher (but still lower than the complexity of specifying the simulation itself). It would be computationally intractable for me to identify all of them. These alternatives may well have highly negative utility to me.
To choose whether to run the simulation, I need to sum the utilities of these alternatives. More complex interpretations will carry lower weight. But what is the guarantee that my utility function is built in such a way that the total utility will still be positive?
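To make the bookkeeping concrete, here is a minimal sketch of that weighted sum, under assumptions I am adding: interpretations are weighted Solomonoff-style by 2^(-decoder length), and every length and utility below is invented purely for illustration.

```python
# Minimal sketch of the weighted sum described above; decoder lengths and
# utilities are invented purely for illustration.
interpretations = [
    # (decoder length in bits, utility assigned to that interpretation)
    (100, +1.0),    # the simple 'default' interpretation
    (160, -50.0),   # a more complex alternative with negative utility
    (220, -1e9),    # an even more complex, far worse alternative
]

def weight(length_bits):
    """Solomonoff-style weight: each extra decoder bit halves the weight."""
    return 2.0 ** (-length_bits)

total = sum(weight(length) * utility for length, utility in interpretations)
print("net weighted utility:", total)
print("run the simulation" if total > 0 else "don't run it")
```

Whether the total stays positive clearly depends on how fast the negative utilities are allowed to grow relative to how fast the weights shrink, which is exactly the guarantee being asked about.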
I’m guessing this particular question has probably been answered in the context of analyzing behavior of utility functions. I haven’t read all of that material, and a specific pointer would be helpful.
The reason this whole discussion arises is that we’re talking about running simulations that can’t be interacted with. You say that you assign utility to the mere existence of patterns, even non-interacting. A simpler utility function specified only in terms of affecting our single physical world would not have that difficulty.
ETA: as Nisan helped me understand in comments below, I myself in practical situations do accept the ‘default’ interpretation of a simulation. I still think non-human agents could behave differently.
These are interesting questions. They might also apply to a utility function that only cares about things affecting our physical world.
If there were a person in a machine, isolated from the rest of the world and suffering, would we try to rescue it, or would we be satisfied with ensuring that the person never interacts with the real world?
I understood the original stipulation that the simulation doesn’t interact with our world to mean that we can’t affect it to rescue the suffering person.
Let’s consider your alternative scenario: the person in the simulation can’t affect our universe usefully (the simulating machine is well-wrapped and looks like a uniform black body from the outside), and we can’t observe it directly, but we know there’s a suffering person inside and we can choose to break in and modify (or stop) the simulation.
In this situation I would indeed choose to intervene to stop the suffering. Your question is a very good one. Why do I choose here to accept the ‘default’ interpretation which says that inside the simulation is a suffering person?
The simple answer is that I’m human, and I don’t have an explicit or implicit-and-consistent utility function anyway. If people around me tell me there’s a suffering person inside the simulation, I’d be inclined to accept this view.
How much effort or money would I be willing to spend to help that suffering simulated person? Probably zero or near zero. There are many real people alive today who are suffering and I’ve never done anything to explicitly help anyone anonymously.
In my previous comments I was thinking about utility functions in general—what is possible, self-consistent, and optimizes something—rather than human utility functions or my own. As far as I personally am concerned, I do indeed accept the ‘default’ interpretation of a simulation (when forced to make a judgement) because it’s easiest to operate that way and my main goal (in adjusting my utility function) is to achieve my supergoals smoothly, rather than to achieve some objectively correct super-theory of morals. Thanks for helping me see that.
In Solomonoff induction, the weight of a program falls off exponentially with its length: a program of length L gets weight 2^(-L). (I have an argument that this doesn’t need to be assumed a priori but can be derived, though I don’t have a formal proof of it.) Given that, it’s easy to see that the total weight of all the weird interpretations is negligible compared to that of the normal interpretation.
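As a rough, hedged illustration of that negligibility claim (the numbers are mine, and it assumes both that every weird interpretation needs noticeably more decoder bits than the default and that only a bounded number of them are in play):

```python
# Toy illustration, not a proof: assume the default interpretation needs a
# decoder of L0 bits, each 'weird' interpretation needs at least k extra
# bits, and at most N weird interpretations are under consideration.
L0, k, N = 300, 50, 10**6

default_weight = 2.0 ** (-L0)
weird_weight_bound = N * 2.0 ** (-(L0 + k))   # each weighs at most 2^-(L0+k)

print(weird_weight_bound / default_weight)    # N * 2^-k, roughly 8.9e-10
```

Note that this only bounds the weights; whether the weighted utilities are also negligible is a separate question, which is where the reply below pushes back.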
It’s true that some things become easier when you try to restrict your attention to “our single physical world”, but other things become less easy. Anyway, that’s a metaphysical question, so let’s leave it aside; in which case, to be consistent, we should also forget about the notion of simulations and look at an at least potentially physical scenario.
Suppose the copy took the form of a physical duplicate of our solar system, with the non-interaction requirement met by flinging same over the cosmic event horizon. Now do you think it makes sense to assign this a positive utility?
Given that, it’s easy to see that the total weight of all the weird interpretations is negligible compared to that of the normal interpretation.

I don’t see why. My utility function could also assign negative utility to some (not necessarily all) ‘weird’ interpretations, with a magnitude that scales exponentially with the bit-lengths of the interpretations.
Is there a proof that this is inconsistent? If I understand correctly, you’re saying that any utility function that assigns very large-magnitude negative utility to alternate interpretations of patterns in simulations is directly incompatible with Solomonoff induction. That’s a pretty strong claim.
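For concreteness, here is a toy version of that worry, with an invented growth rate: if the disutility assigned to an interpretation grows as fast as its Solomonoff weight shrinks, the products stop being negligible.

```python
# Toy numbers only: disutility growing as 2^k cancels the 2^-k weight decay.
default = (0, +1.0)                                # (extra decoder bits, utility)
weird = [(k, -(2.0 ** k)) for k in range(1, 61)]   # disutility ~ 2^k

def contribution(extra_bits, utility):
    return (2.0 ** (-extra_bits)) * utility        # weight times utility

total = contribution(*default) + sum(contribution(k, u) for k, u in weird)
print(total)   # 1.0 + 60 * (-1.0) = -59.0: the weird terms dominate
```

Whether a reasonable utility function would actually behave this way is the open question.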
Suppose the copy took the form of a physical duplicate of our solar system, with the non-interaction requirement met by flinging same over the cosmic event horizon. Now do you think it makes sense to assign this a positive utility?

I don’t assign positive utility to it myself, not above the level of “it might be a neat thing to do”. But I find your utility function much more understandable (as well as more similar to that of many other people) when you say you’d like to create physical clone worlds. It’s quite different from assigning utility to simulated patterns that require particular interpretations.
Well, not exactly; I’m saying Solomonoff induction has implications for what degree of reality (weight, subjective probability, magnitude, measure, etc.) we should assign certain worlds (interpretations, patterns, universes, possibilities, etc.).
Utility is a different matter. You are perfectly free to have a utility function that assigns Ackermann(4,4) units of disutility to each penguin that exists in a particular universe, whereupon the absence of penguins will presumably outweigh all other desiderata. I might feel this utility function is unreasonable, but I can’t claim it to be inconsistent.
I would spend one day’s hard labor (8-12 hours) to create one copy of me, just because I’m uncertain enough about how the multiverse works that having an extra copy would be vaguely reassuring. I might do another couple of hours on another day for copy #3. After that I think I’m done.
I’m interested, but suspicious of fraud—how do I know the copy really exists?
Also, as posed, it seems my copies will live in identical universes and have identical futures as well as present states—i.e. I’m making an exact copy of everyone and everything else as well. If that’s the offer, then I’d need more information about the implications of universe cloning. If there are none, then the question seems like nonsense to me.
I was initially only interested in the thought of my copies diverging, even without interaction (I suppose MWI implies this is what goes on behind the scenes all the time).
If the other universe(s) are simulated inside our own, then there may be relevant differences between the simulating universe and the simulated ones.
In particular, how do we create universes identical to the ‘master copy’? The easiest way is to observe our universe, and run the simulations a second behind, reproducing whatever we observe. That would mean decisions in our universe control events in the simulated worlds, so they have different weights under some decision theories.
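A minimal sketch of that ‘one second behind’ scheme, in my own framing and with a stub for how observed state would be written into the simulation:

```python
# Sketch only: the simulated world is a delayed replay of observations of
# our world, so decisions made here are later instantiated there as well.
import collections

delay_queue = collections.deque()

def step(observation_of_our_world):
    delay_queue.append(observation_of_our_world)
    if len(delay_queue) > 1:                 # keep the copy one step behind
        apply_to_simulation(delay_queue.popleft())

def apply_to_simulation(state):
    ...  # stub: write the observed state into the simulated universe
```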
I assumed we couldn’t observe our copies, because if we could, then they’d be observing their copies too. In other words, somebody’s experience of observing a copy would have to be fake—just a view of their present reality and not of a distinct copy.
This all follows from the setup, where there can be no difference between a copy (+ its environment) and the original. It’s hard to think about what value that has.
If you’re uncertain about how the universe works, why do you think that creating a clone is more likely to help you than to harm you?
I assume Mass Driver is uncertain between certain specifiable classes of “ways the multiverse could work” (with some probability left for “none of the above”), and that in the majority of the classified hypotheses, having a copy either helps you or doesn’t hurt.
Thus, on balance, the expected value should come out positive, even allowing that copying might be harmful in some of the “none of the above” possibilities.
I understand that that’s what Mass_Driver is saying. I’m asking, why think that?
Because scenarios where having an extra copy hurts seem… engineered, somehow. Short of having a deity or Dark Lord of the Matrix punish those with so much hubris as to copy themselves, I have a hard time imagining how it could hurt, while I can easily think of simple rules for anthropic probabilities in the multiverse under which it would (1) help or (2) have no effect.
I realize the availability heuristic is not something we should place much confidence in for problems like this (hence the probability mass I still assign to “none of the above”), but it does seem better than assuming a maxentropy prior over the consequences of all novel actions.
I think, in general, the LW community often errs by placing too much weight on a maxentropy prior as opposed to letting heuristics or traditions have at least some input. Still, it’s probably an overcorrection that comes in handy sometimes; the rest of the world massively overvalues heuristics and tradition, so there are whole areas of possibility-space that get massively underexplored, and LW may as well spend most of its time in those areas.
You could be right about the LW tendency to err… but this thread isn’t the place where it springs to mind as a possible problem! I am almost certain that neither the EEA nor current circumstances are such that heuristics and tradition are likely to yield useful decisions about clone trenches.
Well, short of having a deity reward those who copy themselves with extra afterlife, I’m having difficulty imagining how creating non-interacting identical copies could help, either.
The problem with the availability heuristic here isn’t so much that it’s not a formal logical proof. It’s that it fails to convince me, because I don’t happen to have the same intuition about it, which is why we’re having this conversation in the first place.
I don’t see how you could assign positive utility to truly novel actions without being able to say something about their anticipated consequences. But non-interacting copies are pretty much specified to have no consequences.
Well, in my understanding of the mathematical universe, this sort of copying could be used to change anthropic probabilities without the downsides of quantum suicide. So there’s that.
Robin Hanson probably has his own justification for lots of noninteracting copies (assuming that was the setup presented to him as mentioned in the OP), and I’d be interested to hear that as well.
I’m interested. As a question of terminal value, and focusing only on the quality and quantity of life of me and my copies, I’d value copies’ lives the same as my own. Suppose pick-axing for N years is the only way I can avoid dying right now, where N is large enough that I feel that pick-axing is just barely the better choice. Then I’ll also pick-ax for N years to create a copy.
For what it’s worth, I subscribe to the thread philosophy of identity per se, but the pattern philosophy of what Derek Parfit calls “what matters in survival”.