I am using a future light cone whereas your alternatives seem to be formulated in terms of a past light cone.
I was assuming that the “vertex” of your light cone is situated at or shortly after the Big Bang (e.g. maybe during the first few minutes of nucleosynthesis). In that case, the radius of the light cone “now” (at t = 13.8 billion years since Big Bang) is the same as the particle horizon “now” of the observable universe (roughly 45 billion light-years). So the light-cone so far (starting at Big Bang and running up to 13.8 billion years) will be bigger than Earth’s past light-cone (starting now and running back to the Big Bang) but not massively bigger.
This means that there might be a few expanded civilisations who are outside our past light-cone (so we don’t see them now, but could run into them in the future). Still, if there are lots of civilisations in your light cone and only a few have expanded, that implies a very strong future filter. So my main point remains: given that a super-strong future filter looks very unlikely, most of the probability will be concentrated on models where there are only a few civilisations to start with (so not many to get filtered out; a modest filter does the trick).
The effective time discount function is of rather slow decay because the sum over universes includes time-translated versions of the same universe. As a result, the effective discount falls off as 2^{-K(t)}, where K(t) is the Kolmogorov complexity of t, which is only slightly faster than 1/t.
Ahh… I was assuming you discounted faster than that, since you said the utilities converged. There is a problem with Kolmogorov discounting of t. Consider what happens at t = 3^^^3 years from now. This has Kolmogorov complexity K(t) much, much less than log(3^^^3): in most models of computation K(t) will be a few thousand bits or less. But the width of the light-cone at t is around 3^^^3, so the utility at t is dominated by around 3^^^3 Boltzmann Brains, and the product U(t)·2^{-K(t)} is also going to be around 3^^^3. You’ll get similar large contributions at t = 4^^^^4 and so on; in short I believe your summed discounted utility is diverging (or in any case dominated by the Boltzmann Brains).
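To spell out that single term (using Knuth up-arrows for the ^^^ notation above, and writing c for the length of some short program that outputs 3^^^3, so c is at most a few thousand bits):

\[
t = 3\uparrow\uparrow\uparrow 3,\quad K(t) \le c
\;\;\Longrightarrow\;\;
U(t)\,2^{-K(t)} \;\ge\; \frac{3\uparrow\uparrow\uparrow 3}{2^{\,c}} \;\approx\; 3\uparrow\uparrow\uparrow 3,
\]

since dividing a number of that size by 2^c barely dents it; and the terms at t = 4^^^^4, 5^^^^^5, … are larger still, so the partial sums grow without bound.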
One way to fix this may be to discount each location in space and time (s,t) by 2^{-K(s,t)} and then let u(s,t) represent a utility density (say the average utility per Planck volume). Then sum u(s,t)·2^{-K(s,t)} over all values of (s,t) in the future light-cone. Provided the utility density is bounded (which seems reasonable), the whole sum converges.
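A one-line check that this converges, assuming the utility density is bounded by some U_max and taking K to be prefix complexity (so that Kraft’s inequality applies, the shortest programs for distinct locations forming a prefix-free set):

\[
\Bigl|\sum_{(s,t)} u(s,t)\,2^{-K(s,t)}\Bigr|
\;\le\; U_{\max} \sum_{(s,t)} 2^{-K(s,t)}
\;\le\; U_{\max}.
\]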
I was assuming that the “vertex” of your light cone is situated at or shortly after the Big Bang (e.g. maybe during the first few minutes of nucleosynthesis).
No, it can be located absolutely anywhere. However you’re right that the light cones with vertex close to the Big Bang will probably have large weight due to low K-complexity.
...given that a super-strong future filter looks very unlikely, most of the probability will be concentrated on models where there are only a few civilisations to start with.
This looks correct, but it is different from your initial argument. In particular there’s no reason to believe MWI is wrong or anything like that.
...in short I believe your summed discounted utility is diverging (or in any case dominated by the Boltzmann Brains).
It is guaranteed to converge and seems to be pretty harsh on BBs as well. Here is how it works. Every “universe” is an infinite sequence of bits encoding a future light cone. The weight of the sequence is 2^{-K}, where K is its Kolmogorov complexity. More precisely, I sum over all programs producing such sequences and give weight 2^{-length} to each. Since the sum of 2^{-length} over all programs is 1, I get a well-defined probability measure. Each sequence gets assigned a utility by a computable function that looks like an integral over space-time with a temporal discount. The temporal discount here can be fast, e.g. exponential. So the utility function is bounded and its expectation value converges. However, the effective temporal discount is slow, since for every universe its sub-light-cones are also within the sum. Nevertheless it’s not so slow that BBs come out ahead. If you put the vertex of the light cone at any given point (e.g. time 4^^^^4), there will be few BBs within the fast cutoff time, and most far-away points are suppressed due to high K-complexity.
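Roughly, in symbols (writing x_p for the light-cone sequence produced by program p, |p| for its length, u_x for the utility density in that universe, and τ for the fast discount timescale):

\[
\mathbb{E}[U] \;=\; \sum_{p} 2^{-|p|}\, U(x_p),
\qquad
U(x) \;=\; \int_{\text{light cone}} u_x(s,t)\, e^{-t/\tau}\, \mathrm{d}s\, \mathrm{d}t,
\]

so if |U(x)| ≤ U_max for every sequence, then |E[U]| ≤ U_max · Σ_p 2^{-|p|} ≤ U_max.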
No, it can be located absolutely anywhere. However you’re right that the light cones with vertex close to the Big Bang will probably have large weight due to low K-complexity.
Ah, I see what you’re getting at. If the vertex is at the Big Bang, then the shortest programs basically simulate a history of the observable universe. Just start from a description of the laws of physics and some (low entropy) initial conditions, then read in random bits whenever there is an increase in entropy. (For technical reasons the programs will also need to simulate a slightly larger region just outside the light cone, to predict what will cross into it).
If the vertex lies elsewhere, the shortest programs will likely still simulate starting from the Big Bang, then “truncate”, i.e. shift the vertex to a new point (s, t) and throw away anything outside the reduced light cone. So I suspect that this approach gives a weighting rather like 2^{-K(s,t)} for light-cones which are offset from the Big Bang. Probably most of the weight comes from programs which shift in t but not much in s.
The temporal discount here can be fast, e.g. exponential.
That’s what I thought you meant originally: this would ensure that the utility in any given light-cone is bounded, and hence that the expected utility converges.
...given that a super-strong future filter looks very unlikely, most of the probability will be concentrated on models where there are only a few civilisations to start with.
This looks correct, but it is different from your initial argument. In particular there’s no reason to believe MWI is wrong or anything like that.
I disagree. If models like MWI and/or eternal inflation are taken seriously, then they imply the existence of a huge number of civilisations (spread across multiple branches or multiple inflating regions), and a huge number of expanded civilisations (unless the chance of expansion is exactly zero). Observers should then predict that they will be in one of the expanded civilisations. (Or in UDT terms, they should take bets that they are in such a civilisation). Since our observations are not like that, this forces us into simulation conclusions (most people making our observations are in sims, so that’s how we should bet). The problem is still that there is a poor fit to observations: yes we could be in a sim, and it could look like this, but on the other hand it could look like more or less anything.
Incidentally, there are versions of inflation and many worlds which don’t run into that problem. You can always take a “local” view of inflation (see for instance these papers), and a “modal” interpretation of many worlds (see here). Combined, these views imply that all that actually exists is within one branch of a wave function constructed over one observable universe. These “cut-down” interpretations make either the same physical predictions as the “expansive” interpretations, or better predictions, so I can’t see any real reason to believe in the expansive versions.
So I suspect that this approach gives a weighting rather like 2^{-K(s,t)} for light-cones which are offset from the Big Bang.
In some sense it does, but we must be wary of technicalities. In initial singularity models I’m not sure it makes sense to speak of a “light cone with vertex in the singularity”, and it certainly doesn’t make sense to speak of a privileged point in space. In eternal inflation models there is no singularity, so it might make sense to speak of the “Big Bang” point in space-time, however it is slightly “fuzzy”.
I disagree. If models like MWI and/or eternal inflation are taken seriously, then they imply the existence of a huge number of civilisations (spread across multiple branches or multiple inflating regions), and a huge number of expanded civilisations (unless the chance of expansion is exactly zero). Observers should then predict that they will be in one of the expanded civilisations. (Or in UDT terms, they should take bets that they are in such a civilisation). Since our observations are not like that, this forces us into simulation conclusions (most people making our observations are in sims, so that’s how we should bet).
I don’t think it does. If we are not in a sim, our actions have potentially huge impact since they can affect the probability and the properties of a hypothetical expanded post-human civilization.
Incidentally, there are versions of inflation and many worlds which don’t run into that problem. You can always take a “local” view of inflation (see for instance these papers), and a “modal” interpretation of many worlds (see here). Combined, these views imply that all that actually exists is within one branch of a wave function constructed over one observable universe.
In UDT it doesn’t make sense to speak of what “actually exists”. Everything exists, you just assign different weights to different parts of “everything” when computing utility. The “U” in UDT is for “updateless” which means that you don’t update on being in a certain branch of the wavefunction to conclude other branches “don’t exist”, otherwise you lose in counterfactual mugging.
I don’t think it does. If we are not in a sim, our actions have potentially huge impact since they can affect the probability and the properties of a hypothetical expanded post-human civilization.
So: if a bet is offered that you are a sim (in some form of computronium) and it becomes possible to test that (and so decide the bet one way or another), you would bet heavily on being a sim? But on the off-chance that you are not a sim, you’re going to make decisions as if you were in the real world, because those decisions (when suitably generalized across all possible light-cones) have a huge utility impact. Is that right?
The problem I have is this only works if your utility function is very impartial (it is dominated by “pro bono universo” terms, rather than “what’s in it for me” or “what’s in it for us” terms). Imagine for instance that you work really hard to ensure a positive singularity, and succeed. You create a friendly AI, it starts spreading, and gathering huge amounts of computational resources… and then our simulation runs out of memory, crashes, and gets switched off. This doesn’t sound like it is a good idea “for us” does it?
This all seems to be part of a general problem with asking UDT to model selfish (or self-interested) preferences. Perhaps it can’t. In which case UDT might be a great decision theory for saints, but not for regular human beings. And so we might not want to program UDT into our AI in case that AI thinks it’s a good idea to risk crashing our simulation (and killing us all in the process).
In UDT it doesn’t make sense to speak of what “actually exists”. Everything exists, you just assign different weights to different parts of “everything” when computing utility.
I’ve remarked elsewhere that UDT works best against a background of modal realism, and that’s essentially what you’ve said here. But here’s something for you to ponder. What if modal realism is wrong? What if there is, in fact, evidence that it is wrong, because the world as we see it is not what we should expect to see if it was right? Isn’t it maybe a good idea to then—er—update on that evidence?
Or does a UDT agent have to stay dogmatically committed to modal realism in the face of whatever it sees? That doesn’t seem very rational does it?
So: if a bet is offered that you are a sim (in some form of computronium) and it becomes possible to test that (and so decide the bet one way or another), you would bet heavily on being a sim?
It depends on the stakes of the bet.
But on the off-chance that you are not a sim, you’re going to make decisions as if you were in the real world, because those decisions (when suitably generalized across all possible light-cones) have a huge utility impact. Is that right?
It’s not an “off-chance”. It is meaningless to speak of the “chance I am a sim”: some copies of me are sims, some copies of me are not sims.
This all seems to be part of a general problem with asking UDT to model selfish (or self-interested) preferences. Perhaps it can’t.
It surely can: just give more weight to humans of a very particular type (“you”).
What if modal realism is wrong? What if there is, in fact, evidence that it is wrong, because the world as we see it is not what we should expect to see if it was right?
Subjective expectations are meaningless in UDT. So there is no “what we should expect to see”.
Or does a UDT agent have to stay dogmatically committed to modal realism in the face of whatever it sees? That doesn’t seem very rational does it?
Does it have to stay dogmatically committed to Occam’s razor in the face of whatever it sees? If not, how would it arrive at a replacement without using Occam’s razor? There must be some axioms at the basis of any reasoning system.
So: if a bet is offered that you are a sim (in some form of computronium) and it becomes possible to test that (and so decide the bet one way or another), you would bet heavily on being a sim?
It depends on the stakes of the bet.
I thought we discussed an example earlier in the thread? The gambler pays $1000 if not in a simulation; the bookmaker pays $1 if the gambler is in a simulation. In terms of expected utility, it is better for “you” (that is, all linked instances of you) to take the gamble, even if the vast majority of light-cones don’t contain simulations.
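Spelling out the arithmetic of that example (with N_sim and N_real standing for the numbers of sim and non-sim copies of “you” taking the linked bet, and ignoring discounting for the moment):

\[
\text{total payoff} \;=\; N_{\mathrm{sim}} \cdot (+\$1) \;+\; N_{\mathrm{real}} \cdot (-\$1000),
\]

which is positive whenever N_sim > 1000·N_real; if sims vastly outnumber non-sims, the linked copies come out ahead by gambling.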
It is meaningless to speak of the “chance I am a sim”: some copies of me are sims, some copies of me are not sims
No it isn’t meaningless: chances simply become operationalised in terms of bets, or other decisions with variable payoff. The “chance you are a sim” becomes equal to the fraction of a util you are prepared to pay for a betting slip which pays out one util if you are a sim, and pays nothing otherwise. (Lots of linked copies of “you” take the gamble; some win, some lose.)
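In the same notation: if every linked copy pays x utils for the slip and only the sim copies collect, the collective breaks even at

\[
(N_{\mathrm{sim}} + N_{\mathrm{real}})\,x \;=\; N_{\mathrm{sim}}
\quad\Longrightarrow\quad
x \;=\; \frac{N_{\mathrm{sim}}}{N_{\mathrm{sim}} + N_{\mathrm{real}}},
\]

so the operational “chance you are a sim” is just the fraction of copies that are sims.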
Incidentally, in terms of original modal realism (due to David Lewis), “you” are a concrete unique individual who inhabits exactly one world, but it is unknown which one. Other versions of “you” are your “counterparts”. It is usually not possible to group all your counterparts together and treat them as a single (distributed) being, YOU, because the counterpart relation is not an equivalence relation (it doesn’t partition possible people into neat equivalence classes). As one example, imagine a long chain of possible people whose experiences and memories are indistinguishable from immediate neighbours in the chain (and they are counterparts of their neighbours). But there is a cumulative “drift” along the chain, so that the ends are very different from each other (and not counterparts).
Subjective expectations are meaningless in UDT. So there is no “what we should expect to see”.
A subjective expectation is rather like a bet: it is a commitment of mental resource to modelling certain lines of future observations (and preparing decisions for such a case). If you spend most of your modelling resource on a scenario which doesn’t materialise, this is like losing the bet. So it is reasonable to talk about subjective expectations in UDT; just model them as bets.
Does it have to stay dogmatically committed to Occam’s razor in the face of whatever it sees? If not, how would it arrive at a replacement without using Occam’s razor?
Occam’s razor here is just a method for weighting hypotheses in the prior. It is only “dogmatic” if the prior assigns weights in such an unbalanced way that no amount of evidence will ever shift the weights. If your prior had truly massive weight (e.g. infinite weight) in favour of many worlds, then it will never shift, so that looks dogmatic. But to be honest, I rather doubt this. You weren’t born believing in the many worlds interpretation (or in modal realism) and if you are a normal human being you most likely regarded it as quite outlandish at some point. Then some line of evidence or reasoning caused you to shift your opinion (e.g. because it seemed simpler, or overall a better explanation for physical evidence). If it shifted one way, then considering other evidence could shift it back again.
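For reference, the point about shifting weights is just the odds form of Bayes’ theorem (with H the hypothesis, say many worlds or modal realism, and E any piece of evidence):

\[
\frac{P(H \mid E)}{P(\lnot H \mid E)}
\;=\;
\frac{P(H)}{P(\lnot H)} \cdot \frac{P(E \mid H)}{P(E \mid \lnot H)},
\]

so finite prior odds always move under a sufficiently lopsided likelihood ratio; only literally infinite prior odds are immune to evidence.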
In terms of expected utility, it is better for “you” (that is, all linked instances of you) to take the gamble, even if the vast majority of light-cones don’t contain simulations.
It is not the case if the money can be utilized in a manner with long term impact.
No it isn’t meaningless: chances simply become operationalised in terms of bets, or other decisions with variable payoff.
This doesn’t give an unambiguous recipe to compute probabilities since it depends on how the results of the bets are accumulated to influence utility. An unambiguous recipe cannot exist since it would have to give precise answers to ambiguous questions such as: if there are two identical simulations of you running on two computers, should they be counted as two copies or one?
Incidentally, in terms of original modal realism (due to David Lewis), “you” are a concrete unique individual who inhabits exactly one world, but it is unknown which one. Other versions of “you” are your “counterparts”. It is usually not possible to group all your counterparts together and treat them as a single (distributed) being, YOU, because the counterpart relation is not an equivalence relation (it doesn’t partition possible people into neat equivalence classes). As one example, imagine a long chain of possible people whose experiences and memories are indistinguishable from immediate neighbours in the chain (and they are counterparts of their neighbours). But there is a cumulative “drift” along the chain, so that the ends are very different from each other (and not counterparts).
UDT doesn’t seem to work this way. In UDT, “you” are not a physical entity but an abstract decision algorithm. This abstract decision algorithm is correlated to different extents with different physical entities in different worlds. This leads to the question of whether some algorithms are more “conscious” than others. I don’t think UDT currently has an answer for this, but neither do other frameworks.
You weren’t born believing in the many worlds interpretation (or in modal realism) and if you are a normal human being you most likely regarded it as quite outlandish at some point. Then some line of evidence or reasoning caused you to shift your opinion (e.g. because it seemed simpler, or overall a better explanation for physical evidence). If it shifted one way, then considering other evidence could shift it back again.
If we think of knowledge as a layered pie, with lower layers corresponding to knowledge which is more “fundamental”, then somewhere near the bottom we have paradigms of reasoning such as Occam’s razor / Solomonoff induction and UDT. Below them lie “human reasoning axioms”, which are something we cannot formalize due to our limited introspection ability. In fact the paradigms of reasoning are our current best efforts at formalizing this intuition. However, when we build an AI we need to use something formal; we cannot just transfer our reasoning axioms to it (at least I don’t know how to do it; meseems every way to do it would be “ingenuine”, since it would be based on a formalism). So, for the AI, UDT (or whatever formalism we use) is the lowest layer. Maybe it’s a philosophical limitation of any AGI, but I doubt it can be overcome, and I doubt it’s a good reason not to build an (F)AI.
It is not the case if the money can be utilized in a manner with long term impact.
OK, I was using $ here as a proxy for utils, but technically you’re right: the bet should be expressed in utils (as for the general definition of a chance that I gave in my comment). Or if you don’t know how to bet in utils, use another proxy which is a consumptive good and can’t be invested (e.g. chocolate bars or vouchers for a cinema trip this week). A final loophole is the time discounting: the real versions of you mostly live earlier than the sim versions of you, so perhaps a chocolate bar for the real “you” is worth many chocolate bars for sim “you”s? However we covered that earlier in the thread as well: my understanding is that your effective discount rate is not high enough to outweigh the huge numbers of sims.
An unambiguous recipe cannot exist since it would have to give precise answers to ambiguous questions such as: if there are two identical simulations of you running on two computers, should they be counted as two copies or one?
Well this is your utility function, so you tell me! Imagine a hacker is able to get into the simulations and replace pleasant experiences by horrible torture. Does your utility function care twice as much if he hacks both simulations versus hacking just one of them? (My guess is that it does). And this style of reasoning may cover limit cases like a simulation running on a wafer which is then cut in two (think about whether the sims are independently hackable, and how much you care.)
An unambiguous recipe cannot exist since it would have to give precise answers to ambiguous questions such as: if there are two identical simulations of you running on two computers, should they be counted as two copies or one?
Well this is your utility function, so you tell me! Imagine a hacker is able to get into the simulations and replace pleasant experiences by horrible torture. Does your utility function care twice as much if he hacks both simulations versus hacking just one of them? (My guess is that it does).
It wouldn’t be exactly twice but you’re more or less right. However, it has no direct relation to probability. To see this, imagine you’re a paperclip maximizer. In this case you don’t care about torture or anything of the sort: you only care about paperclips. So your utility function specifies a way of counting paperclips but no way of counting copies of you.
From another angle, imagine your two simulations are offered a bet. How should they count themselves? Obviously it depends on the rules of the bet: whether the payoff is handed out once or twice. Therefore, the counting is ambiguous.
What you’re trying to do is to write the utility function as a convex linear combination of utility functions associated with different copies of you. Once you accomplish that, the coefficients of the combination can be interpreted as probabilities. However, there is no such canonical decomposition.
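In symbols, the decomposition you are after would look like

\[
U \;=\; \sum_i p_i\, U_i,
\qquad p_i \ge 0,\quad \sum_i p_i = 1,
\]

where U_i is the utility function attributed to the i-th copy; if such a decomposition were canonical, the p_i could be read as “the probability of being copy i”, but nothing in the setup singles one decomposition out.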
As one example, imagine a long chain of possible people whose experiences and memories are indistinguishable from immediate neighbours in the chain (and they are counterparts of their neighbours). But there is a cumulative “drift” along the chain, so that the ends are very different from each other (and not counterparts).
UDT doesn’t seem to work this way. In UDT, “you” are not a physical entity but an abstract decision algorithm. This abstract decision algorithm is correlated to different extents with different physical entities in different worlds. This leads to the question of whether some algorithms are more “conscious” than others. I don’t think UDT currently has an answer for this, but neither do other frameworks.
I think it works quite well with “you” as a concrete entity. Simply use the notion that “your” decisions are linked to those of your counterparts (and indeed, to other agents), such that if you decide in a certain way in given circumstances, your counterparts will decide that way as well. The linkage will be very tight for neighbours in the chain, but diminishing gradually with distance, and such that the ends of the chain are not linked at all. This—I think—addresses the problem of trying to identify what algorithm you are implementing, or partitioning possible people into those who are running “the same” algorithm.
Actually I was speaking of a different problem, namely the philosophical problem of which abstract algorithms should be regarded as conscious (assuming the concept makes sense at all).
The identification of one’s own algorithm is an introspective operation whose definition is not obvious for humans. For AIs the situation is clearer if we assume the AI has access to its own source code.