I think that I care about a time-discounted utility integral within a future light-cone. Large civilizations entering this cone don’t reduce the utility of small civilizations.
I’m not sure how you apply that in a big universe model… most of it lies outside any given light-cone, so which one do you pick? Imagine you don’t yet know where you are: do you sum utility across all light-cones (a sum which could still diverge in a big universe) or take the utility of an average light-cone? Also, how do you do the time-discounting if you don’t yet know when you are?
My initial guess is that this utility function won’t encourage betting on really big universes (as there is no increase in utility of the average lightcone from winning the bet), but it will encourage betting on really dense universes (packed full of people or simulations of people). So you should maybe bet that you are in a simulation, running on a form of dense “computronium” in the underlying universe.
I’m not sure how you apply that in a big universe model… most of it lies outside any given light-cone, so which one do you pick? Imagine you don’t yet know where you are: do you sum utility across all light-cones (a sum which could still diverge in a big universe) or take the utility of an average light-cone? Also, how do you do the time-discounting if you don’t yet know when you are?
The possible universes I am considering already come packed into a future light cone (I don’t consider large universes directly). The probability of a universe is proportional to 2^{-its Kolmogorov complexity} so expected utility converges. Time-discounting is relative to the vertex of the light-cone.
...it will encourage betting on really dense universes (packed full of people or simulations of people).
Not really. Additive terms in the utility don’t “encourage” anything, multiplicative factors do.
The possible universes I am considering already come packed into a future light cone (I don’t consider large universes directly).
I was a bit surprised by this… if your possible models only include one light-cone (essentially just the observable universe) then they don’t look too different from those of my stated hypothesis (at the start of the thread). What is your opinion then on other civilisations in the light-cone? How likely are these alternatives?
No other civilisations exist or have existed in the light-cone apart from us.
A few have existed apart from us, but none have expanded (yet)
A few have existed, and a few have expanded, but we can’t see them (yet)
Lots have existed, but none have expanded (very strong future filter)
Lots have existed, and a few have expanded (still a strong future filter), but we can’t see the expanded ones (yet)
Lots have existed, and lots have expanded, so the light-cone is full of expanded civilisations; we don’t see that, but that’s because we are in a zoo or simulation of some sort.
...it will encourage betting on really dense universes (packed full of people or simulations of people).
Not really. Additive terms in the utility don’t “encourage” anything, multiplicative factors do.
Here’s how it works. Imagine the “mugger” offers all observers a bet (e.g. at odds of 1000:1 on) on whether they believe they are in a simulation, within a dense “computronium” universe packed full of computers simulating observers. Suppose only a tiny fraction (less than 1 in a trillion) of universe models are like that, and the observers all know this (so this is equivalent to a very heavily weighted coin landing against its weight). But still, by your proposed utility function, UDT observers should accept the bet, since in the freak universes where they win, huge numbers of observers win $1 each, adding a colossal amount of total utility to the light-cone. Whereas in the more regular universes where they lose the bet, relatively fewer observers will lose $1000 each. Hence accepting the bet creates more expected utility than rejecting it.
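To make the arithmetic of this bet concrete, here is a minimal sketch. The observer counts (`n_dense`, `n_regular`) are made-up illustrative numbers, not figures from the thread, and utility is assumed to be simply additive in dollars across observers.

```python
def expected_utility_of_bet(p_dense, n_dense, n_regular, win=1.0, loss=1000.0):
    """Expected total utility change if every observer accepts the bet.

    p_dense   -- prior weight of the dense "computronium" universes
    n_dense   -- observers per dense universe (hypothetical value)
    n_regular -- observers per regular universe (hypothetical value)
    """
    gain = p_dense * n_dense * win                 # everyone wins $1 in a dense universe
    cost = (1.0 - p_dense) * n_regular * loss      # everyone loses $1000 otherwise
    return gain - cost


# Illustrative assumption: dense universes are a 1-in-a-trillion case,
# but hold vastly more observers than regular ones.
print(expected_utility_of_bet(p_dense=1e-12, n_dense=1e30, n_regular=1e11))
```

With these assumed numbers the result is positive, so the bet is accepted despite the tiny prior; shrinking `n_dense` far enough flips the sign.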
Another issue you might have concerns the time-discounting. Suppose 1 million observers live early on in the light-cone, and 1 trillion live late in the light-cone (and again all observers know this). The mugger approaches all observers before they know whether they are “early” or “late” and offers them a 50:50 bet on whether they are “early” rather than “late”. The observers all decide to accept the bet, knowing that 1 million will win and 1 trillion will lose: however the utility of the losers is heavily discounted, relative to the winners, so the total expected time-discounted utility is increased by accepting the bet.
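The same point as a small sketch, under stated assumptions: the bet is at even stakes of one unit, early observers carry discount weight 1, and all late observers share a single hypothetical weight `late_discount`. None of these specifics are given in the thread beyond the population sizes.

```python
def discounted_total(n_early, n_late, late_discount, stake=1.0):
    """Total time-discounted utility change if everyone bets on being "early".

    Early observers win `stake` at full weight; late observers lose `stake`,
    but their loss is scaled by `late_discount` (an assumed single factor).
    """
    return n_early * stake - n_late * late_discount * stake


# 1 million early observers, 1 trillion late observers (numbers from the example).
print(discounted_total(1e6, 1e12, late_discount=1e-9))  # heavy discounting
print(discounted_total(1e6, 1e12, late_discount=1e-3))  # mild discounting
```

In this toy version the bet raises total discounted utility only when the late observers are discounted by more than the population ratio (here, a factor of a million).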
I was a bit surprised by this… if your possible models only include one light-cone (essentially just the observable universe) then they don’t look too different from those of my stated hypothesis (at the start of the thread).
My disagreement is that the anthropic reasoning you use is not a good argument for non-existence of large civilizations.
How likely are these alternatives? …
I am using a future light cone whereas your alternatives seem to be formulated in terms of a past light cone. Let me say that I think the probability to ever encounter another civilization is related to the ratio {asymptotic value of Hubble time} / {time since appearance of civilizations became possible}. I can’t find the numbers this second, but my feeling is such an occurrence is far from certain.
Here’s how it works...
Very good point! I think that if the “computronium universe” is not suppressed by some huge factor due to some sort of physical limit / great filter, then there is a significant probability such a universe arises from post-human civilization (e.g. due to FAI). All decisions with possible (even small) impact on the likelihood of and/or the properties of this future get a huge utility boost. Therefore I think decisions with long term impact should be made as if we are not in a simulation whereas decisions which involve purely short term optimizations should be made as if we are in a simulation (although I find it hard to imagine such a decision in which it is important whether we are in a simulation).
Another issue you might have concerns the time-discounting...
The effective time discount function is of rather slow decay because the sum over universes includes time translated versions of the same universe. As a result, the effective discount falls off as 2^{-Kolmogorov complexity of t} which is only slightly faster than 1/t. Nevertheless, for huge time differences your argument is correct. This is actually a good thing, since otherwise your decisions would be dominated by the Boltzmann brains appearing far after heat death.
As a result, the effective discount falls off as 2^{-Kolmogorov complexity of t} which is only slightly faster than 1/t.
It is about 1/t x 1/log t x 1/log log t etc. for most values of t (taking base 2 logarithms). There are exceptions for very regular values of t.
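A rough sketch of that product, for a feel of how slowly it decays. The stopping rule (drop factors once the iterated logarithm falls to 1 or below) is my own assumption for what “etc.” means, and the function ignores the exceptions for very regular values of t.

```python
import math


def approx_discount(t):
    """Approximate 2^-K(t) as 1/t * 1/log2(t) * 1/log2(log2(t)) * ...

    Factors are included while the iterated logarithm stays above 1.
    This is an illustration only; true Kolmogorov complexity is uncomputable.
    """
    weight = 1.0
    x = float(t)
    while x > 1.0:
        weight /= x
        x = math.log2(x)
    return weight


for t in (16, 10**6, 10**12):
    print(t, approx_discount(t), 1.0 / t)
```

For t = 16 the factors are 16, 4 and 2, so the weight is 1/128, somewhat below 1/t = 1/16.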
Incidentally, I’ve been thinking about a similar weighting approach towards anthropic reasoning, and it seems to avoid a strong form of the Doomsday Argument (one where we bet heavily against our civilisation expanding). Imagine listing all the observers (or observer moments) in order of appearance since the Big Bang (use cosmological proper time). Then assign a prior probability 2^-K(n) to being the nth observer (or moment) in that sequence.
Now let’s test this distribution against my listed hypotheses above:
1. No other civilisations exist or have existed in the universe apart from us.
Fit to observations: Not too bad. After including the various log terms in 2^-K(n), the probability of me having an observer rank n between 60 billion and 120 billion (we don’t know it more precisely than that) seems to be about 1/log (60 billion) x 1/log (36) or roughly 1⁄200.
Still, the hypothesis seems a bit dodgy. How could there be exactly one civilisation over such a large amount of space and time? Perhaps the evolution of intelligence is just extraordinarily unlikely, a rare fluke that only happened once. But then the fact that the “fluke” actually happened at all makes this hypothesis a poor fit. A better hypothesis is that the chance of intelligence evolving is high enough to ensure that it will appear many times in the universe: Earth-now is just the first time it has happened. If observer moments were weighted uniformly, we would rule that out (we’d be very unlikely to be first), but with the 2^-K(n) weighting, there is rather high probability of being a smaller n, and so being in the first civilisation. So this hypothesis does actually work. One drawback is that living 13.8 billion years after the Big Bang, and with only 5% of stars still to form, we may simply be too late to be the first among many. If there were going to be many civilisations, we’d expect a lot of them to have already arrived.
Predictions for Future of Humanity: No doomsday prediction at all; the probability of my n falling in the range 60-120 billion is the same sum over 2^-K(n) regardless of how many people arrive after me. This looks promising.
2. A few have existed apart from us, but none have expanded (yet)
Fit to observations: Pretty good e.g. if the average number of observers per civilisation is less than 1 trillion. In this case, I can’t know what my n is (since I don’t know exactly how many civilisations existed before human beings, or how many observers they each had). What I can infer is that my relative rank within my own civilisation will look like it fell at random between 1 and the average population of a civilisation. If that average population is less than 1 trillion, there will be a probability of > 1 in 20 of seeing a relative rank like my current one.
Predictions for Future of Humanity: There must be a fairly low probability of expanding, since other civilisations before us didn’t expand. If there were 100 of them, our own estimated probability of expanding would be less than 0.01 and so on. But notice that we can’t infer anything in particular about whether our own civilisation will expand: if it does expand (against the odds) then there will be a very large number of observer moments after us, but these will fall further down the tail of the Kolmogorov distribution. The probability of my having a rank n where it is (at a number before the expansion) doesn’t change. So I shouldn’t bet against expansion at odds much different from 100:1.
3. A few have existed, and a few have expanded, but we can’t see them (yet)
Fit to observations: Poor. Since some civilisations have already expanded, my own n must be very high (e.g. up in the trillions of trillions). But then most values of n which are that high and near to my own rank will correspond to observers inside one of the expanded civilisations. Since I don’t know my own n, I can’t expect it to just happen to fall inside one of the small civilisations. My observations look very unlikely under this model.
Predictions for Future of Humanity: Similar to 2
4. Lots have existed, but none have expanded (very strong future filter)
Fit to observations: Mixed. It can be made to fit if the average number of observers per civilisation is less than 1 trillion; this is for reasons similar to 2. While that gives a reasonable degree of fit, the prior likelihood of such a strong filter seems low.
Predictions for Future of Humanity: Very pessimistic, because of the strong universal filter.
5. Lots have existed, and a few have expanded (still a strong future filter), but we can’t see the expanded ones (yet)
Fit to observations: Poor. Things could still fit if the average population of a civilisation is less than a trillion. But that requires that the small, unexpanded, civilisations massively outnumber the big, expanded ones: so much so that most of the population is in the small ones. This requires an extremely strong future filter. Again, the prior likelihood of this strength of filter seems very low.
Predictions for Future of Humanity: Extremely pessimistic, because of the strong universal filter.
6. Lots have existed, and lots have expanded, so the universe is full of expanded civilisations; we don’t see that, but that’s because we are in a zoo or simulation of some sort.
Fit to observations: Poor: even worse than in case 5. Most values of n close to my own (enormous) value of n will be in one of the expanded civilisations. The most likely case seems to be that I’m in a simulation; but still there is no reason at all to suppose the simulation would look like this.
Predictions for Future of Humanity: Uncertain. A significant risk is that someone switches our simulation off, before we get a chance to expand and consume unavailable amounts of simulation resources (e.g. by running our own simulations in turn). This switch-off risk is rather hard to estimate. Most simulations will eventually get switched off, but the Kolmogorov weighting may put us into one of the earlier simulations, one which is running when lots of resources are still available, and doesn’t get turned off for a long time.
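The rough figures quoted for hypotheses 1 and 2 can be checked with a few lines. This is only a sanity check of the stated arithmetic: base-2 logarithms are used as in the earlier comment, and for hypothesis 2 I assume that “a rank like mine” means landing in a window about 60 billion wide under a uniform distribution.

```python
import math

# Hypothesis 1: the estimate 1/log(60 billion) x 1/log(36), with base-2 logs.
n_low = 60e9
log_n = math.log2(n_low)                 # about 35.8
log_log_n = math.log2(round(log_n))      # log2(36), about 5.17
p_rank_h1 = 1.0 / (log_n * log_log_n)    # close to the quoted "roughly 1/200"


# Hypothesis 2: relative rank uniform between 1 and the average population.
def p_rank_window(window, avg_population):
    """Chance that a uniformly random rank lands in a window of the given width."""
    return min(1.0, window / avg_population)


p_rank_h2 = p_rank_window(60e9, 1e12)    # at the 1 trillion boundary
print(p_rank_h1, p_rank_h2)
```

Under these assumptions the first figure comes out a little above 1/200 and the second at 0.06, which is consistent with the “> 1 in 20” claim for average populations below 1 trillion.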
I am using a future light cone whereas your alternatives seem to be formulated in terms of a past light cone.
I was assuming that the “vertex” of your light cone is situated at or shortly after the Big Bang (e.g. maybe during the first few minutes of nucleosynthesis). In that case, the radius of the light cone “now” (at t = 13.8 billion years since Big Bang) is the same as the particle horizon “now” of the observable universe (roughly 45 billion light-years). So the light-cone so far (starting at Big Bang and running up to 13.8 billion years) will be bigger than Earth’s past light-cone (starting now and running back to the Big Bang) but not massively bigger.
This means that there might be a few expanded civilisations which are outside our past light-cone (so we don’t see them now, but could run into them in the future). Still, if there are lots of civilisations in your light cone, and only a few have expanded, that still implies a very strong future filter. So my main point remains: given that a super-strong future filter looks very unlikely, most of the probability will be concentrated on models where there are only a few civilisations to start with (so not many to get filtered out; a modest filter does the trick).
The effective time discount function is of rather slow decay because the sum over universes includes time translated versions of the same universe. As a result, the effective discount falls off as 2^{-Kolmogorov complexity of t} which is only slightly faster than 1/t.
Ahh… I was assuming you discounted faster than that, since you said the utilities converged. There is a problem with Kolmogorov discounting of t. Consider what happens at t = 3^^^3 years from now. This has Kolmogorov complexity K(t) much much less than log(3^^^3): in most models of computation K(t) will be a few thousand bits or less. But the width of the light-cone at t is around 3^^^3, so the utility at t is dominated by around 3^^^3 Boltzmann Brains, and the product U(t) 2^-K(t) is also going to be around 3^^^3. You’ll get similar large contributions at t = 4^^^^4 and so on; in short I believe your summed discounted utility is diverging (or in any case dominated by the Boltzmann Brains).
One way to fix this may be to discount each location in space and time (s,t) by 2^-K(s,t) and then let u(s,t) represent a utility density (say the average utility per Planck volume). Then sum over u(s,t).2^-K(s, t) for all values of (s,t) in the future light-cone. Provided the utility density is bounded (which seems reasonable), then the whole sum converges.
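The convergence claim can be written as a one-line bound. This is a sketch under two assumptions not spelled out above: the locations (s,t) are discretized into a countable set (for example Planck-scale cells), and K is the prefix-free version of Kolmogorov complexity, so that the Kraft inequality applies. The symbol u_max is a hypothetical bound on the utility density.

```latex
\[
\Bigl|\sum_{(s,t)} u(s,t)\,2^{-K(s,t)}\Bigr|
\;\le\; u_{\max} \sum_{(s,t)} 2^{-K(s,t)}
\;\le\; u_{\max},
\qquad \text{where } |u(s,t)| \le u_{\max} \text{ for all } (s,t).
\]
```

The second inequality is the Kraft inequality: the weights 2^{-K(s,t)} over distinct locations sum to at most 1, so no single huge region (such as the light-cone at t = 3^^^3) can dominate by sheer size.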
I was assuming that the “vertex” of your light cone is situated at or shortly after the Big Bang (e.g. maybe during the first few minutes of nucleosynthesis).
No, it can be located absolutely anywhere. However you’re right that the light cones with vertex close to the Big Bang will probably have large weight due to low K-complexity.
...given that a super-strong future filter looks very unlikely, most of the probability will be concentrated on models where there are only a few civilisations to start with.
This looks correct, but it is different from your initial argument. In particular there’s no reason to believe MWI is wrong or anything like that.
...in short I believe your summed discounted utility is diverging (or in any case dominated by the Boltzmann Brains).
It is guaranteed to converge and seems to be pretty harsh on BBs too. Here is how it works. Every “universe” is an infinite sequence of bits encoding a future light cone. The weight of the sequence is 2^{-K-complexity}. More precisely I sum over all programs producing such sequences and give weight 2^{-length} to each. Since the sum of 2^{-length} over all programs is 1, I get a well-defined probability measure. Each sequence gets assigned a utility by a computable function that looks like an integral over space-time with temporal discount. The temporal discount here can be fast e.g. exponential. So the utility function is bounded and its expectation value converges. However the effective temporal discount is slow since for every universe, its sub-light-cones are also within the sum. Nevertheless it’s not so slow that BBs come ahead. If you put the vertex of the light cone at any given point (e.g. time 4^^^^4) there will be few BBs within the fast cutoff time and most far points are suppressed due to high K-complexity.
No, it can be located absolutely anywhere. However you’re right that the light cones with vertex close to the Big Bang will probably have large weight due to low K-complexity.
Ah, I see what you’re getting at. If the vertex is at the Big Bang, then the shortest programs basically simulate a history of the observable universe. Just start from a description of the laws of physics and some (low entropy) initial conditions, then read in random bits whenever there is an increase in entropy. (For technical reasons the programs will also need to simulate a slightly larger region just outside the light cone, to predict what will cross into it).
If the vertex lies elsewhere, the shortest programs will likely still simulate starting from the Big Bang, then “truncate” i.e. shift the vertex to a new point (s, t) and throw away anything outside the reduced light cone. So I suspect that this approach gives a weighting rather like 2^-K(s,t) for light-cones which are offset from the Big Bang. Probably most of the weight comes from programs which shift in t but not much in s.
The temporal discount here can be fast e.g. exponential.
That’s what I thought you meant originally: this would ensure that the utility in any given light-cone is bounded, and hence that the expected utility converges.
...given that a super-strong future filter looks very unlikely, most of the probability will be concentrated on models where there are only a few civilisations to start with.
This looks correct, but it is different from your initial argument. In particular there’s no reason to believe MWI is wrong or anything like that.
I disagree. If models like MWI and/or eternal inflation are taken seriously, then they imply the existence of a huge number of civilisations (spread across multiple branches or multiple inflating regions), and a huge number of expanded civilisations (unless the chance of expansion is exactly zero). Observers should then predict that they will be in one of the expanded civilisations. (Or in UDT terms, they should take bets that they are in such a civilisation). Since our observations are not like that, this forces us into simulation conclusions (most people making our observations are in sims, so that’s how we should bet). The problem is still that there is a poor fit to observations: yes we could be in a sim, and it could look like this, but on the other hand it could look like more or less anything.
Incidentally, there are versions of inflation and many worlds which don’t run into that problem. You can always take a “local” view of inflation (see for instance these papers), and a “modal” interpretation of many worlds (see here). Combined, these views imply that all that actually exists is within one branch of a wave function constructed over one observable universe. These “cut-down” interpretations make either the same physical predictions as the “expansive” interpretations, or better predictions, so I can’t see any real reason to believe in the expansive versions.
So I suspect that this approach gives a weighting rather like 2^-K(s,t) for light-cones which are offset from the Big Bang.
In some sense it does, but we must be wary of technicalities. In initial singularity models I’m not sure it makes sense to speak of a “light cone with vertex in singularity” and it certainly doesn’t make sense to speak of a privileged point in space. In eternal inflation models there is no singularity so it might make sense to speak of the “Big Bang” point in space-time, however it is slightly “fuzzy”.
I disagree. If models like MWI and/or eternal inflation are taken seriously, then they imply the existence of a huge number of civilisations (spread across multiple branches or multiple inflating regions), and a huge number of expanded civilisations (unless the chance of expansion is exactly zero). Observers should then predict that they will be in one of the expanded civilisations. (Or in UDT terms, they should take bets that they are in such a civilisation). Since our observations are not like that, this forces us into simulation conclusions (most people making our observations are in sims, so that’s how we should bet).
I don’t think it does. If we are not in a sim, our actions have potentially huge impact since they can affect the probability and the properties of a hypothetical expanded post-human civilization.
Incidentally, there are versions of inflation and many worlds which don’t run into that problem. You can always take a “local” view of inflation (see for instance these papers), and a “modal” interpretation of many worlds (see here). Combined, these views imply that all that actually exists is within one branch of a wave function constructed over one observable universe.
In UDT it doesn’t make sense to speak of what “actually exists”. Everything exists, you just assign different weights to different parts of “everything” when computing utility. The “U” in UDT is for “updateless” which means that you don’t update on being in a certain branch of the wavefunction to conclude other branches “don’t exist”, otherwise you lose in counterfactual mugging.
I don’t think it does. If we are not in a sim, our actions have potentially huge impact since they can affect the probability and the properties of a hypothetical expanded post-human civilization.
So: if a bet is offered that you are a sim (in some form of computronium) and it becomes possible to test that (and so decide the bet one way or another), you would bet heavily on being a sim? But on the off-chance that you are not a sim, you’re going to make decisions as if you were in the real world, because those decisions (when suitably generalized across all possible light-cones) have a huge utility impact. Is that right?
The problem I have is this only works if your utility function is very impartial (it is dominated by “pro bono universo” terms, rather than “what’s in it for me” or “what’s in it for us” terms). Imagine for instance that you work really hard to ensure a positive singularity, and succeed. You create a friendly AI, it starts spreading, and gathering huge amounts of computational resources… and then our simulation runs out of memory, crashes, and gets switched off. This doesn’t sound like it is a good idea “for us” does it?
This all seems to be part of a general problem with asking UDT to model selfish (or self-interested) preferences. Perhaps it can’t. In which case UDT might be a great decision theory for saints, but not for regular human beings. And so we might not want to program UDT into our AI in case that AI thinks it’s a good idea to risk crashing our simulation (and killing us all in the process).
In UDT it doesn’t make sense to speak of what “actually exists”. Everything exists, you just assign different weights to different parts of “everything” when computing utility.
I’ve remarked elsewhere that UDT works best against a background of modal realism, and that’s essentially what you’ve said here. But here’s something for you to ponder. What if modal realism is wrong? What if there is, in fact, evidence that it is wrong, because the world as we see it is not what we should expect to see if it was right? Isn’t it maybe a good idea to then—er—update on that evidence?
Or does a UDT agent have to stay dogmatically committed to modal realism in the face of whatever it sees? That doesn’t seem very rational does it?
So: if a bet is offered that you are a sim (in some form of computronium) and it becomes possible to test that (and so decide the bet one way or another), you would bet heavily on being a sim?
It depends on the stakes of the bet.
But on the off-chance that you are not a sim, you’re going to make decisions as if you were in the real world, because those decisions (when suitably generalized across all possible light-cones) have a huge utility impact. Is that right?
It’s not an “off-chance”. It is meaningless to speak of the “chance I am a sim”: some copies of me are sims, some copies of me are not sims.
This all seems to be part of a general problem with asking UDT to model selfish (or self-interested) preferences. Perhaps it can’t.
It surely can: just give more weight to humans of a very particular type (“you”).
What if modal realism is wrong? What if there is, in fact, evidence that it is wrong, because the world as we see it is not what we should expect to see if it was right?
Subjective expectations are meaningless in UDT. So there is no “what we should expect to see”.
Or does a UDT agent have to stay dogmatically committed to modal realism in the face of whatever it sees? That doesn’t seem very rational does it?
Does it have to stay dogmatically committed to Occam’s razor in the face of whatever it sees? If not, how would it arrive at a replacement without using Occam’s razor? There must be some axioms at the basis of any reasoning system.
So: if a bet is offered that you are a sim (in some form of computronium) and it becomes possible to test that (and so decide the bet one way or another), you would bet heavily on being a sim?
It depends on the stakes of the bet.
I thought we discussed an example earlier in the thread? The gambler pays $1000 if not in a simulation; the bookmaker pays $1 if the gambler is in a simulation. In terms of expected utility, it is better for “you” (that is, all linked instances of you) to take the gamble, even if the vast majority of light-cones don’t contain simulations.
It is meaningless to speak of the “chance I am a sim”: some copies of me are sims, some copies of me are not sims
No it isn’t meaningless: chances simply become operationalised in terms of bets, or other decisions with variable payoff. The “chance you are a sim” becomes equal to the fraction of a util you are prepared to pay for a betting slip which pays out one util if you are a sim, and pays nothing otherwise. (Lots of linked copies of “you” take the gamble; some win, some lose.)
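One way to sketch this operational definition: suppose every linked copy buys the slip at the same price, and the copies that are sims carry total weight `weight_sim` in the utility function while the rest carry `weight_real`. Both weights are hypothetical inputs, and how they are computed is exactly what the utility function has to specify. The break-even price is then the point where the weighted gains and losses cancel.

```python
def betting_price(weight_sim, weight_real):
    """Price (in utils) at which buying the slip has zero total expected utility.

    Every copy pays the price p; sim copies receive 1 util back.
    Solving weight_sim * (1 - p) - weight_real * p = 0 for p gives the result.
    """
    return weight_sim / (weight_sim + weight_real)


# Hypothetical weights: sims carry three times the total weight of non-sims.
print(betting_price(weight_sim=3.0, weight_real=1.0))
```

Under these assumptions the “chance you are a sim” is just the sims’ share of total weight, so any ambiguity about how copies are counted shows up directly as ambiguity in the weights.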
Incidentally, in terms of original modal realism (due to David Lewis), “you” are a concrete unique individual who inhabits exactly one world, but it is unknown which one. Other versions of “you” are your “counterparts”. It is usually not possible to group all your counterparts together and treat them as a single (distributed) being, YOU, because the counterpart relation is not an equivalence relation (it doesn’t partition possible people into neat equivalence classes). As one example, imagine a long chain of possible people whose experiences and memories are indistinguishable from immediate neighbours in the chain (and they are counterparts of their neighbours). But there is a cumulative “drift” along the chain, so that the ends are very different from each other (and not counterparts).
Subjective expectations are meaningless in UDT. So there is no “what we should expect to see”.
A subjective expectation is rather like a bet: it is a commitment of mental resource to modelling certain lines of future observations (and preparing decisions for such a case). If you spend most of your modelling resource on a scenario which doesn’t materialise, this is like losing the bet. So it is reasonable to talk about subjective expectations in UDT; just model them as bets.
Does it have to stay dogmatically committed to Occam’s razor in the face of whatever it sees? If not, how would it arrive at a replacement without using Occam’s razor?
Occam’s razor here is just a method for weighting hypotheses in the prior. It is only “dogmatic” if the prior assigns weights in such an unbalanced way that no amount of evidence will ever shift the weights. If your prior has truly massive weight (e.g. infinite weight) in favour of many worlds, then it will never shift, so that looks dogmatic. But to be honest, I rather doubt this. You weren’t born believing in the many worlds interpretation (or in modal realism) and if you are a normal human being you most likely regarded it as quite outlandish at some point. Then some line of evidence or reasoning caused you to shift your opinion (e.g. because it seemed simpler, or overall a better explanation for physical evidence). If it shifted one way, then considering other evidence could shift it back again.
In terms of expected utility, it is better for “you” (that is, all linked instances of you) to take the gamble, even if the vast majority of light-cones don’t contain simulations.
It is not the case if the money can be utilized in a manner with long term impact.
No it isn’t meaningless: chances simply become operationalised in terms of bets, or other decisions with variable payoff.
This doesn’t give an unambiguous recipe to compute probabilities since it depends on how the results of the bets are accumulated to influence utility. An unambiguous recipe cannot exist since it would have to give precise answers to ambiguous questions such as: if there are two identical simulations of you running on two computers, should they be counted as two copies or one?
I’m not sure how you apply that in a big universe model… most of it lies outside any given light-cone, so which one do you pick? Imagine you don’t yet know where you are: do you sum utility across all light-cones (a sum which could still diverge in a big universe) or take the utility of an average light cone? Also, how do you do the time-discounting if you don’t yet know when you are?
My initial guess is that this utility function won’t encourage betting on really big universes (as there is no increase in utility of the average lightcone from winning the bet), but it will encourage betting on really dense universes (packed full of people or simulations of people). So you should maybe bet that you are in a simulation, running on a form of dense “computronium” in the underlying universe.
The possible universes I am considering already come packed into a future light cone (I don’t consider large universes directly). The probability of a universe is proportional to 2^{-its Kolmogorov complexity} so expected utility converges. Time-discounting is relative to the vertex of the light-cone.
Not really. Additive terms in the utility don’t “encourage” anything, multiplicative factors do.
I was a bit surprised by this… if your possible models only include one light-cone (essentially just the observable universe) then they don’t look too different from those of my stated hypothesis (at the start of the thread). What is your opinion then on other civilisations in the light-cone? How likely are these alternatives?
1. No other civilisations exist or have existed in the light-cone apart from us.
2. A few have existed apart from us, but none have expanded (yet)
3. A few have existed, and a few have expanded, but we can’t see them (yet)
4. Lots have existed, but none have expanded (very strong future filter)
5. Lots have existed, and a few have expanded (still a strong future filter), but we can’t see the expanded ones (yet)
6. Lots have existed, and lots have expanded, so the light-cone is full of expanded civilisations; we don’t see that, but that’s because we are in a zoo or simulation of some sort.
Here’s how it works. Imagine the “mugger” offers all observers a bet (e.g. at your 1000:1 on odds) on whether they are in a simulation, within a dense “computronium” universe packed full of computers simulating observers. Suppose only a tiny fraction (less than 1 in a trillion) of universe models are like that, and the observers all know this (so this is equivalent to a very heavily weighted coin landing against its weight). But still, by your proposed utility function, UDT observers should accept the bet, since in the freak universes where they win, huge numbers of observers win $1 each, adding a colossal amount of total utility to the light-cone. Whereas in the more regular universes where they lose the bet, relatively fewer observers will lose $1000 each. Hence accepting the bet creates more expected utility than rejecting it.
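To make the arithmetic of this bet concrete, here is a rough sketch. The stakes ($1 won, $1000 lost) and the 1-in-a-trillion prior come from the comment above; the observer counts `N_REGULAR` and `N_DENSE` are purely hypothetical numbers picked for illustration, and utility is assumed to be simply additive over observers.

```python
# Toy expected-utility check for the "dense universe" bet.
# The prior and the stakes follow the comment; the observer counts are
# hypothetical assumptions chosen only to illustrate the argument.

P_DENSE = 1e-12       # prior weight of computronium universes (< 1 in a trillion)
STAKE_WIN = 1.0       # each observer wins $1 in a dense universe
STAKE_LOSS = 1000.0   # each observer loses $1000 in a regular universe

N_REGULAR = 1e11      # assumed observers in a regular light-cone (hypothetical)
N_DENSE = 1e30        # assumed observers in a computronium light-cone (hypothetical)


def expected_gain(p_dense, n_dense, n_regular, win=STAKE_WIN, loss=STAKE_LOSS):
    """Expected change in summed utility if every observer accepts the bet."""
    return p_dense * n_dense * win - (1.0 - p_dense) * n_regular * loss


gain = expected_gain(P_DENSE, N_DENSE, N_REGULAR)

# Ratio N_dense / N_regular above which accepting the bet has positive expectation.
break_even_ratio = (1.0 - P_DENSE) * STAKE_LOSS / (P_DENSE * STAKE_WIN)

print(f"expected gain from accepting: {gain:.3e}")
print(f"density ratio needed to break even: {break_even_ratio:.3e}")
```

With these assumed counts the dense universes outweigh the regular ones despite their tiny prior. If the density ratio were below the break-even value, the same function would return a negative number and the bet should be refused.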
Another issue you might have concerns the time-discounting. Suppose 1 million observers live early on in the light-cone, and 1 trillion live late in the light-cone (and again all observers know this). The mugger approaches all observers before they know whether they are “early” or “late” and offers them a 50:50 bet on whether they are “early” rather than “late”. The observers all decide to accept the bet, knowing that 1 million will win and 1 trillion will lose: however the utility of the losers is heavily discounted, relative to the winners, so the total expected time-discounted utility is increased by accepting the bet.
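The same kind of toy calculation works for the early/late bet. The population sizes are the ones given above; the discount factors passed in are hypothetical, since the comment does not fix a discount rate, and the stake is assumed to be one unit of utility either way.

```python
# Toy check of the "early vs late" bet under time-discounting.
# Populations follow the comment; the discount factors are hypothetical inputs.

N_EARLY = 1_000_000           # observers living early in the light-cone
N_LATE = 1_000_000_000_000    # observers living late in the light-cone


def total_discounted_gain(discount_early, discount_late, stake=1.0):
    """Summed discounted utility change if all observers bet 'I am early' at even odds."""
    winners = N_EARLY * stake * discount_early   # early observers win the stake
    losers = N_LATE * stake * discount_late      # late observers lose the stake
    return winners - losers


# Accepting is favourable only if discount_late / discount_early is below this ratio.
threshold = N_EARLY / N_LATE

print(f"threshold discount ratio: {threshold:.1e}")
print(f"steep discounting:  {total_discounted_gain(1.0, 1e-9):+.3e}")
print(f"gentle discounting: {total_discounted_gain(1.0, 1e-3):+.3e}")
```

So the bet only looks good when late observers are discounted by more than a factor of a million relative to early ones, which is the sense in which the losers are "heavily discounted".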
My disagreement is that the anthropic reasoning you use is not a good argument for non-existence of large civilizations.
I am using a future light cone whereas your alternatives seem to be formulated in terms of a past light cone. Let me say that I think the probability of ever encountering another civilization is related to the ratio {asymptotic value of Hubble time} / {time since appearance of civilizations became possible}. I can’t find the numbers this second, but my feeling is that such an encounter is far from certain.
Very good point! I think that if the “computronium universe” is not suppressed by some huge factor due to some sort of physical limit / great filter, then there is a significant probability such a universe arises from post-human civilization (e.g. due to FAI). All decisions with possible (even small) impact on the likelihood of and/or the properties of this future get a huge utility boost. Therefore I think decisions with long term impact should be made as if we are not in a simulation whereas decisions which involve purely short term optimizations should be made as if we are in a simulation (although I find it hard to imagine such a decision in which it is important whether we are in a simulation).
The effective time discount function is of rather slow decay because the sum over universes includes time translated versions of the same universe. As a result, the effective discount falls off as 2^{-Kolmogorov complexity of t} which is only slightly faster than 1/t. Nevertheless, for huge time differences your argument is correct. This is actually a good thing, since otherwise your decisions would be dominated by the Boltzmann brains appearing far after heat death.
It is about 1/t x 1/log t x 1/log log t etc. for most values of t (taking base 2 logarithms). There are exceptions for very regular values of t.
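For anyone who wants to play with this, here is a small sketch of the iterated-logarithm product. It is only a stand-in for typical values of t: true Kolmogorov complexity is uncomputable, and very regular values of t would get much more weight than this gives them. The function name `iterated_log_weight` is made up for the example.

```python
import math


def iterated_log_weight(t):
    """Approximate 2^-K(t) for a 'typical' t as 1 / (t * log t * log log t * ...).

    Base-2 logarithms are taken repeatedly and multiplied in for as long as
    the current term exceeds 1. This is a heuristic stand-in, not K itself.
    """
    product = 1.0
    x = float(t)
    while x > 1.0:
        product *= x
        x = math.log2(x)
    return 1.0 / product


# Example: t = 16 gives 1 / (16 * 4 * 2), since log2(16) = 4 and log2(4) = 2.
for t in (16, 1_000, 1_000_000, 10**12):
    print(t, iterated_log_weight(t), 1.0 / t)
```

Comparing the two printed columns shows how the weight falls off only a little faster than 1/t.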
Incidentally, I’ve been thinking about a similar weighting approach towards anthropic reasoning, and it seems to avoid a strong form of the Doomsday Argument (one where we bet heavily against our civilisation expanding). Imagine listing all the observers (or observer moments) in order of appearance since the Big Bang (use cosmological proper time). Then assign a prior probability 2^-K(n) to being the nth observer (or moment) in that sequence.
Now let’s test this distribution against my listed hypotheses above:
1. No other civilisations exist or have existed in the universe apart from us.
Fit to observations: Not too bad. After including the various log terms in 2^-K(n), the probability of me having an observer rank n between 60 billion and 120 billion (we don’t know it more precisely than that) seems to be about 1/log (60 billion) x 1/log (36) or roughly 1⁄200.
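As a quick check of that figure, the product of the two log terms can be evaluated directly. This is only the rough estimate stated above with base 2 logarithms; it ignores further log log log terms and the factor of about ln 2 that comes from summing 1/n over a doubling range.

```python
import math

n_low = 60e9                        # lower end of the rank range (60 billion)

log_n = math.log2(n_low)            # about 35.8
log_log_n = math.log2(log_n)        # about 5.2, i.e. roughly log2(36)

estimate = 1.0 / (log_n * log_log_n)

print(f"log2(n) = {log_n:.2f}, log2(log2(n)) = {log_log_n:.2f}")
print(f"estimate is about 1/{1.0 / estimate:.0f}")
```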
Still, the hypothesis seems a bit dodgy. How could there be exactly one civilisation over such a large amount of space and time? Perhaps the evolution of intelligence is just extraordinarily unlikely, a rare fluke that only happened once. But then the fact that the “fluke” actually happened at all makes this hypothesis a poor fit. A better hypothesis is that the chance of intelligence evolving is high enough to ensure that it will appear many times in the universe: Earth-now is just the first time it has happened. If observer moments were weighted uniformly, we would rule that out (we’d be very unlikely to be first), but with the 2^-K(n) weighting, there is rather high probability of being a smaller n, and so being in the first civilisation. So this hypothesis does actually work. One drawback is that living 13.8 billion years after the Big Bang, and with only 5% of stars still to form, we may simply be too late to be the first among many. If there were going to be many civilisations, we’d expect a lot of them to have already arrived.
Predictions for Future of Humanity: No doomsday prediction at all; the probability of my n falling in the range 60-120 billion is the same sum over 2^-K(n) regardless of how many people arrive after me. This looks promising.
2. A few have existed apart from us, but none have expanded (yet)
Fit to observations: Pretty good e.g. if the average number of observers per civilisation is less than 1 trillion. In this case, I can’t know what my n is (since I don’t know exactly how many civilisations existed before human beings, or how many observers they each had). What I can infer is that my relative rank within my own civilisation will look like it fell at random between 1 and the average population of a civilisation. If that average population is less than 1 trillion, there will be a probability of > 1 in 20 of seeing a relative rank like my current one.
Predictions for Future of Humanity: There must be a fairly low probability of expanding, since other civilisations before us didn’t expand. If there were 100 of them, our own estimated probability of expanding would be less than 0.01 and so on. But notice that we can’t infer anything in particular about whether our own civilisation will expand: if it does expand (against the odds) then there will be a very large number of observer moments after us, but these will fall further down the tail of the Kolmogorov distribution. The probability of my having a rank n where it is (at a number before the expansion) doesn’t change. So I shouldn’t bet against expansion at odds much different from 100:1.
3. A few have existed, and a few have expanded, but we can’t see them (yet)
Fit to observations: Poor. Since some civilisations have already expanded, my own n must be very high (e.g. up in the trillions of trillions). But then most values of n which are that high and near to my own rank will correspond to observers inside one of the expanded civilisations. Since I don’t know my own n, I can’t expect it to just happen to fall inside one of the small civilisations. My observations look very unlikely under this model.
Predictions for Future of Humanity: Similar to 2
4. Lots have existed, but none have expanded (very strong future filter)
Fit to observations: Mixed. It can be made to fit if the average number of observers per civilisation is less than 1 trillion; this is for reasons similar to 2. While that gives a reasonable degree of fit, the prior likelihood of such a strong filter seems low.
Predictions for Future of Humanity: Very pessimistic, because of the strong universal filter.
5. Lots have existed, and a few have expanded (still a strong future filter), but we can’t see the expanded ones (yet)
Fit to observations: Poor. Things could still fit if the average population of a civilisation is less than a trillion. But that requires that the small, unexpanded, civilisations massively outnumber the big, expanded ones: so much so that most of the population is in the small ones. This requires an extremely strong future filter. Again, the prior likelihood of this strength of filter seems very low.
Predictions for Future of Humanity: Extremely pessimistic, because of the strong universal filter.
6. Lots have existed, and lots have expanded, so the universe is full of expanded civilisations; we don’t see that, but that’s because we are in a zoo or simulation of some sort.
Fit to observations: Poor: even worse than in case 5. Most values of n close to my own (enormous) value of n will be in one of the expanded civilisations. The most likely case seems to be that I’m in a simulation; but still there is no reason at all to suppose the simulation would look like this.
Predictions for Future of Humanity: Uncertain. A significant risk is that someone switches our simulation off, before we get a chance to expand and consume unavailable amounts of simulation resources (e.g. by running our own simulations in turn). This switch-off risk is rather hard to estimate. Most simulations will eventually get switched off, but the Kolmogorov weighting may put us into one of the earlier simulations, one which is running when lots of resources are still available, and doesn’t get turned off for a long time.
I was assuming that the “vertex” of your light cone is situated at or shortly after the Big Bang (e.g. maybe during the first few minutes of nucleosynthesis). In that case, the radius of the light cone “now” (at t = 13.8 billion years since Big Bang) is the same as the particle horizon “now” of the observable universe (roughly 45 billion light-years). So the light-cone so far (starting at Big Bang and running up to 13.8 billion years) will be bigger than Earth’s past light-cone (starting now and running back to the Big Bang) but not massively bigger.
This means that there might be a few expanded civilisations which are outside our past light-cone (so we don’t see them now, but could run into them in the future). Still, if there are lots of civilisations in your light cone, and only a few have expanded, that implies a very strong future filter. So my main point remains: given that a super-strong future filter looks very unlikely, most of the probability will be concentrated on models where there are only a few civilisations to start with (so not many to get filtered out; a modest filter does the trick).
Ahh… I was assuming you discounted faster than that, since you said the utilities converged. There is a problem with Kolmogorov discounting of t. Consider what happens at t = 3^^^3 years from now. This has Kolmogorov complexity K(t) much much less than log(3^^^3): in most models of computation K(t) will be a few thousand bits or less. But the width of the light-cone at t is around 3^^^3, so the utility at t is dominated by around 3^^^3 Boltzmann Brains, and the product U(t) 2^-K(t) is also going to be around 3^^^3. You’ll get similar large contributions at t = 4^^^^4 and so on; in short I believe your summed discounted utility is diverging (or in any case dominated by the Boltzmann Brains).
One way to fix this may be to discount each location in space and time (s,t) by 2^-K(s,t) and then let u(s,t) represent a utility density (say the average utility per Planck volume). Then sum u(s,t) x 2^-K(s,t) over all values of (s,t) in the future light-cone. Provided the utility density is bounded (which seems reasonable), the whole sum converges.
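Here is a sketch of why a bounded density is enough. Since K is uncomputable, the code swaps in the lengths of the Elias gamma code, which is a computable prefix-free code, so its weights 2^-length sum to at most 1 (Kraft's inequality). Indexing the locations (s,t) by a single positive integer is also an assumption made only for this illustration.

```python
# Convergence sketch: bounded utility density times prefix-code weights.
# Elias gamma code lengths are a computable stand-in for K (an assumption);
# locations (s,t) are assumed to be enumerated as i = 1, 2, 3, ...


def elias_gamma_length(n):
    """Length in bits of the Elias gamma code for a positive integer n."""
    return 2 * (n.bit_length() - 1) + 1


def weighted_utility(densities):
    """Sum of u_i * 2^-length(i) over locations indexed from 1."""
    return sum(
        u * 2.0 ** -elias_gamma_length(i)
        for i, u in enumerate(densities, start=1)
    )


U_MAX = 1.0                 # assumed bound on the utility density
N_LOCATIONS = 2**16 - 1     # number of locations in this finite truncation

total_weight = sum(2.0 ** -elias_gamma_length(i) for i in range(1, N_LOCATIONS + 1))
worst_case = weighted_utility([U_MAX] * N_LOCATIONS)

print(f"total weight of first {N_LOCATIONS} locations: {total_weight}")
print(f"worst-case weighted utility: {worst_case}")
```

The partial sums of the weights creep up towards 1 but never exceed it, so the weighted sum can never exceed the bound on the density, however many locations are included.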
No, it can be located absolutely anywhere. However you’re right that the light cones with vertex close to Big Bang will probably have large weight due to low K-complexity.
This looks correct, but it is different from your initial argument. In particular there’s no reason to believe MWI is wrong or anything like that.
It is guaranteed to converge and seems to be pretty harsh on BBs as well. Here is how it works. Every “universe” is an infinite sequence of bits encoding a future light cone. The weight of the sequence is 2^{-K-complexity}. More precisely I sum over all programs producing such sequences and give weight 2^{-length} to each. Since the sum of 2^{-length} over all programs is 1, I get a well-defined probability measure. Each sequence gets assigned a utility by a computable function that looks like an integral over space-time with temporal discount. The temporal discount here can be fast e.g. exponential. So the utility function is bounded and its expectation value converges. However the effective temporal discount is slow since for every universe, its sub-light-cones are also within the sum. Nevertheless it’s not so slow that BBs come ahead. If you put the vertex of the light cone at any given point (e.g. time 4^^^^4) there will be few BBs within the fast cutoff time and most far points are suppressed due to high K-complexity.
Ah, I see what you’re getting at. If the vertex is at the Big Bang, then the shortest programs basically simulate a history of the observable universe. Just start from a description of the laws of physics and some (low entropy) initial conditions, then read in random bits whenever there is an increase in entropy. (For technical reasons the programs will also need to simulate a slightly larger region just outside the light cone, to predict what will cross into it).
If the vertex lies elsewhere, the shortest programs will likely still simulate starting from the Big Bang, then “truncate” i.e. shift the vertex to a new point (s, t) and throw away anything outside the reduced light cone. So I suspect that this approach gives a weighting rather like 2^-K(s,t) for light-cones which are offset from the Big Bang. Probably most of the weight comes from programs which shift in t but not much in s.
That’s what I thought you meant originally: this would ensure that the utility in any given light-cone is bounded, and hence that the expected utility converges.
I disagree. If models like MWI and/or eternal inflation are taken seriously, then they imply the existence of a huge number of civilisations (spread across multiple branches or multiple inflating regions), and a huge number of expanded civilisations (unless the chance of expansion is exactly zero). Observers should then predict that they will be in one of the expanded civilisations. (Or in UDT terms, they should take bets that they are in such a civilisation). Since our observations are not like that, this forces us into simulation conclusions (most people making our observations are in sims, so that’s how we should bet). The problem is still that there is a poor fit to observations: yes we could be in a sim, and it could look like this, but on the other hand it could look like more or less anything.
Incidentally, there are versions of inflation and many worlds which don’t run into that problem. You can always take a “local” view of inflation (see for instance these papers), and a “modal” interpretation of many worlds (see here). Combined, these views imply that all that actually exists is within one branch of a wave function constructed over one observable universe. These “cut-down” interpretations make either the same physical predictions as the “expansive” interpretations, or better predictions, so I can’t see any real reason to believe in the expansive versions.
In some sense it does, but we must be wary of technicalities. In initial singularity models I’m not sure it makes sense to speak of “light cone with vertex in singularity” and it certainly doesn’t make sense to speak of a privileged point in space. In eternal inflation models there is no singularity so it might make sense to speak of the “Big Bang” point in space-time, however it is slightly “fuzzy”.
I don’t think it does. If we are not in a sim, our actions have potentially huge impact since they can affect the probability and the properties of a hypothetical expanded post-human civilization.
In UDT it doesn’t make sense to speak of what “actually exists”. Everything exists, you just assign different weights to different parts of “everything” when computing utility. The “U” in UDT is for “updateless” which means that you don’t update on being in a certain branch of the wavefunction to conclude other branches “don’t exist”, otherwise you lose in counterfactual mugging.
So: if a bet is offered that you are a sim (in some form of computronium) and it becomes possible to test that (and so decide the bet one way or another), you would bet heavily on being a sim? But on the off-chance that you are not a sim, you’re going to make decisions as if you were in the real world, because those decisions (when suitably generalized across all possible light-cones) have a huge utility impact. Is that right?
The problem I have is this only works if your utility function is very impartial (it is dominated by “pro bono universo” terms, rather than “what’s in it for me” or “what’s in it for us” terms). Imagine for instance that you work really hard to ensure a positive singularity, and succeed. You create a friendly AI, it starts spreading, and gathering huge amounts of computational resources… and then our simulation runs out of memory, crashes, and gets switched off. This doesn’t sound like it is a good idea “for us” does it?
This all seems to be part of a general problem with asking UDT to model selfish (or self-interested) preferences. Perhaps it can’t. In which case UDT might be a great decision theory for saints, but not for regular human beings. And so we might not want to program UDT into our AI in case that AI thinks it’s a good idea to risk crashing our simulation (and killing us all in the process).
I’ve remarked elsewhere that UDT works best against a background of modal realism, and that’s essentially what you’ve said here. But here’s something for you to ponder. What if modal realism is wrong? What if there is, in fact, evidence that it is wrong, because the world as we see it is not what we should expect to see if it was right? Isn’t it maybe a good idea to then—er—update on that evidence?
Or does a UDT agent have to stay dogmatically committed to modal realism in the face of whatever it sees? That doesn’t seem very rational does it?
It depends on the stakes of the bet.
It’s not an “off-chance”. It is meaningless to speak of the “chance I am a sim”: some copies of me are sims, some copies of me are not sims.
It surely can: just give more weight to humans of a very particular type (“you”).
Subjective expectations are meaningless in UDT. So there is no “what we should expect to see”.
Does it have to stay dogmatically committed to Occam’s razor in the face of whatever it sees? If not, how would it arrive at a replacement without using Occam’s razor? There must be some axioms at the basis of any reasoning system.
I thought we discussed an example earlier in the thread? The gambler pays $1000 if not in a simulation; the bookmaker pays $1 if the gambler is in a simulation. In terms of expected utility, it is better for “you” (that is, all linked instances of you) to take the gamble, even if the vast majority of light-cones don’t contain simulations.
No it isn’t meaningless: chances simply become operationalised in terms of bets, or other decisions with variable payoff. The “chance you are a sim” becomes equal to the fraction of a util you are prepared to pay for a betting slip which pays out one util if you are a sim, and pays nothing otherwise. (Lots of linked copies of “you” take the gamble; some win, some lose.)
Incidentally, in terms of original modal realism (due to David Lewis), “you” are a concrete unique individual who inhabits exactly one world, but it is unknown which one. Other versions of “you” are your “counterparts”. It is usually not possible to group all your counterparts together and treat them as a single (distributed) being, YOU, because the counterpart relation is not an equivalence relation (it doesn’t partition possible people into neat equivalence classes). As one example, imagine a long chain of possible people whose experiences and memories are indistinguishable from immediate neighbours in the chain (and they are counterparts of their neighbours). But there is a cumulative “drift” along the chain, so that the ends are very different from each other (and not counterparts).
A subjective expectation is rather like a bet: it is a commitment of mental resource to modelling certain lines of future observations (and preparing decisions for such a case). If you spend most of your modelling resource on a scenario which doesn’t materialise, this is like losing the bet. So it is reasonable to talk about subjective expectations in UDT; just model them as bets.
Occam’s razor here is just a method for weighting hypotheses in the prior. It is only “dogmatic” if the prior assigns weights in such an unbalanced way that no amount of evidence will ever shift the weights. If your prior had truly massive weight (e.g. infinite weight) in favour of many worlds, then it will never shift, so that looks dogmatic. But to be honest, I rather doubt this. You weren’t born believing in the many worlds interpretation (or in modal realism) and if you are a normal human being you most likely regarded it as quite outlandish at some point. Then some line of evidence or reasoning caused you to shift your opinion (e.g. because it seemed simpler, or overall a better explanation for physical evidence). If it shifted one way, then considering other evidence could shift it back again.
It is not the case if the money can be utilized in a manner with long term impact.
This doesn’t give an unambiguous recipe to compute probabilities since it depends on how the results of the bets are accumulated to influence utility. An unambiguous recipe cannot exist since it would have to give precise answers to ambiguous questions such as: if there are two identical simulations of you running on two computers, should they be counted as two copies or one?
UDT doesn’t seem to work this way. In UDT, “you” are not a physical entity but an abstract decision algorithm. This abstract decision algorithm is correlated to different extent with different physical entities in different worlds. This leads to the question of whether some algorithms are more “conscious” than others. I don’t think UDT currently has an answer for this, but neither do other frameworks.
If we think of knowledge as a layered pie, with lower layers corresponding to knowledge which is more “fundamental”, then somewhere near the bottom we have paradigms of reasoning such as Occam’s razor / Solomonoff induction and UDT. Below them lie “human reasoning axioms” which are something we cannot formalize due to our limited introspection ability. In fact the paradigms of reasoning are our current best efforts at formalizing this intuition. However, when we build an AI we need to use something formal; we cannot just transfer our reasoning axioms to it (at least I don’t know how to do it; it seems to me every way to do it would be “not genuine” since it would be based on a formalism). So, for the AI, UDT (or whatever formalism we use) is the lowest layer. Maybe it’s a philosophical limitation of any AGI, but I doubt it can be overcome and I doubt it’s a good reason not to build an (F)AI.
OK, I was using $ here as a proxy for utils, but technically you’re right: the bet should be expressed in utils (as for the general definition of a chance that I gave in my comment). Or if you don’t know how to bet in utils, use another proxy which is a consumptive good and can’t be invested (e.g. chocolate bars or vouchers for a cinema trip this week). A final loop-hole is the time discounting: the real versions of you mostly live earlier than the sim versions of you, so perhaps a chocolate bar for the real “you” is worth many chocolate bars for sim “you”s? However we covered that earlier in the thread as well: my understanding is that your effective discount rate is not high enough to outweigh the huge numbers of sims.
Well this is your utility function, so you tell me! Imagine a hacker is able to get into the simulations and replace pleasant experiences by horrible torture. Does your utility function care twice as much if he hacks both simulations versus hacking just one of them? (My guess is that it does). And this style of reasoning may cover limit cases like a simulation running on a wafer which is then cut in two (think about whether the sims are independently hackable, and how much you care.)
It wouldn’t be exactly twice but you’re more or less right. However, it has no direct relation to probability. To see this, imagine you’re a paperclip maximizer. In this case you don’t care about torture or anything of the sort: you only care about paperclips. So your utility function specifies a way of counting paperclips but no way of counting copies of you.
From another angle, imagine your two simulations are offered a bet. How should they count themselves? Obviously it depends on the rules of the bet: whether the payoff is handed out once or twice. Therefore, the counting is ambiguous.
What you’re trying to do is to write the utility function as a convex linear combination of utility functions associated with different copies of you. Once you accomplish that, the coefficients of the combination can be interpreted as probabilities. However, there is no such canonical decomposition.
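A toy version of the two-simulations bet may help show where the ambiguity enters. Everything here is a hypothetical setup: one world (weight `w_real`) contains a single non-simulated copy, another world (weight `w_sim`) contains several identical simulated copies, and a betting slip pays one util to a sim. The break-even price, which is what one would read off as “the chance of being a sim”, depends on whether the slip is settled per copy or once per world.

```python
# Toy illustration: the betting price for "I am a sim" depends on how
# payoffs are accumulated. All weights and counts are hypothetical.


def break_even_price(w_real, w_sim, sim_copies, payoff_per_copy):
    """Price c (in utils) at which buying a slip that pays 1 util to sims is neutral.

    w_real: weight of the world holding one non-simulated copy
    w_sim: weight of the world holding `sim_copies` identical simulated copies
    payoff_per_copy: True if each copy buys and is paid separately,
                     False if the bet is settled once per world
    """
    n = sim_copies if payoff_per_copy else 1
    # Solve w_sim * n * (1 - c) - w_real * c = 0 for c.
    return w_sim * n / (w_real + w_sim * n)


per_copy = break_even_price(0.5, 0.5, sim_copies=2, payoff_per_copy=True)
once = break_even_price(0.5, 0.5, sim_copies=2, payoff_per_copy=False)

print(f"price if each sim is paid separately: {per_copy:.3f}")
print(f"price if the payoff is handed out once: {once:.3f}")
```

With equal world weights and two sims, the first rule gives a price of 2/3 and the second gives 1/2, so the same situation yields two different “probabilities” depending on the rules of the bet.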
I think it works quite well with “you” as a concrete entity. Simply use the notion that “your” decisions are linked to those of your counterparts (and indeed, to other agents), such that if you decide in a certain way in given circumstances, your counterparts will decide that way as well. The linkage will be very tight for neighbours in the chain, but diminishing gradually with distance, and such that the ends of the chain are not linked at all. This—I think—addresses the problem of trying to identify what algorithm you are implementing, or partitioning possible people into those who are running “the same” algorithm.
Actually I was speaking of a different problem, namely the philosophical problem of which abstract algorithms should be regarded as conscious (assuming the concept makes sense at all).
The identification of one’s own algorithm is an introspective operation whose definition is not obvious for humans. For AIs the situation is clearer if we assume the AI has access to its own source code.