I don’t think it does. If we are not in a sim, our actions have potentially huge impact since they can affect the probability and the properties of a hypothetical expanded post-human civilization.
So: if a bet is offered that you are a sim (in some form of computronium) and it becomes possible to test that (and so decide the bet one way or another), you would bet heavily on being a sim? But on the off-chance that you are not a sim, you’re going to make decisions as if you were in the real world, because those decisions (when suitably generalized across all possible light-cones) have a huge utility impact. Is that right?
The problem I have is this only works if your utility function is very impartial (it is dominated by “pro bono universo” terms, rather than “what’s in it for me” or “what’s in it for us” terms). Imagine for instance that you work really hard to ensure a positive singularity, and succeed. You create a friendly AI, it starts spreading, and gathering huge amounts of computational resources… and then our simulation runs out of memory, crashes, and gets switched off. This doesn’t sound like it is a good idea “for us” does it?
This all seems to be part of a general problem with asking UDT to model selfish (or self-interested) preferences. Perhaps it can’t. In which case UDT might be a great decision theory for saints, but not for regular human beings. And so we might not want to program UDT into our AI in case that AI thinks it’s a good idea to risk crashing our simulation (and killing us all in the process).
In UDT it doesn’t make sense to speak of what “actually exists”. Everything exists, you just assign different weights to different parts of “everything” when computing utility.
I’ve remarked elsewhere that UDT works best against a background of modal realism, and that’s essentially what you’ve said here. But here’s something for you to ponder. What if modal realism is wrong? What if there is, in fact, evidence that it is wrong, because the world as we see it is not what we should expect to see if it was right? Isn’t it maybe a good idea to then—er—update on that evidence?
Or does a UDT agent have to stay dogmatically committed to modal realism in the face of whatever it sees? That doesn’t seem very rational does it?
So: if a bet is offered that you are a sim (in some form of computronium) and it becomes possible to test that (and so decide the bet one way or another), you would bet heavily on being a sim?
It depends on the stakes of the bet.
But on the off-chance that you are not a sim, you’re going to make decisions as if you were in the real world, because those decisions (when suitably generalized across all possible light-cones) have a huge utility impact. Is that right?
It’s not an “off-chance”. It is meaningless to speak of the “chance I am a sim”: some copies of me are sims, some copies of me are not sims.
This all seems to be part of a general problem with asking UDT to model selfish (or self-interested) preferences. Perhaps it can’t.
It surely can: just give more weight to humans of a very particular type (“you”).
What if modal realism is wrong? What if there is, in fact, evidence that it is wrong, because the world as we see it is not what we should expect to see if it was right?
Subjective expectations are meaningless in UDT. So there is no “what we should expect to see”.
Or does a UDT agent have to stay dogmatically committed to modal realism in the face of whatever it sees? That doesn’t seem very rational does it?
Does it have to stay dogmatically committed to Occam’s razor in the face of whatever it sees? If not, how would it arrive at a replacement without using Occam’s razor? There must be some axioms at the basis of any reasoning system.
So: if a bet is offered that you are a sim (in some form of computronium) and it becomes possible to test that (and so decide the bet one way or another), you would bet heavily on being a sim?
It depends on the stakes of the bet.
I thought we discussed an example earlier in the thread? The gambler pays $1000 if not in a simulation; the bookmaker pays $1 if the gambler is in a simulation. In terms of expected utility, it is better for “you” (that is, all linked instances of you) to take the gamble, even if the vast majority of light-cones don’t contain simulations.
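A rough sketch of the arithmetic behind that claim, treating dollars as utils. The population figures are purely illustrative assumptions, not anything established in the thread: one non-simulated copy in every light-cone, and a large number of simulated copies in the small fraction of light-cones that run simulations at all.

```python
def total_payoff(frac_with_sims, sims_per_cone, win=1.0, loss=1000.0):
    """Average payoff per light-cone when every linked copy takes the bet.

    Hypothetical model: each light-cone holds one non-simulated copy, and a
    fraction `frac_with_sims` of light-cones also hold `sims_per_cone`
    simulated copies.
    """
    winnings = frac_with_sims * sims_per_cone * win  # collected by simulated copies
    losses = loss                                    # paid by the one real copy
    return winnings - losses


# Only 1% of light-cones contain simulations, but each of those runs a billion.
many_sims = total_payoff(0.01, 10**9)
# Same 1%, but only ten thousand simulations each.
few_sims = total_payoff(0.01, 10**4)
```

Under these assumptions the gamble comes out ahead in the first case and behind in the second, so the conclusion depends on the simulated copies outnumbering the real ones by more than the 1000:1 stake ratio.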
It is meaningless to speak of the “chance I am a sim”: some copies of me are sims, some copies of me are not sims
No it isn’t meaningless: chances simply become operationalised in terms of bets, or other decisions with variable payoff. The “chance you are a sim” becomes equal to the fraction of a util you are prepared to pay for a betting slip which pays out one util if you are a sim, and pays nothing otherwise. (Lots of linked copies of “you” take the gamble; some win, some lose.)
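As a minimal sketch of that definition: if each linked copy is given a weight (an assumption here, standing in for how much the utility function cares about that copy), the break-even price of the betting slip is the weighted share of copies that are sims. The function name and the weights are hypothetical.

```python
def slip_price(copies):
    """Break-even price, in utils, of a slip paying 1 util to copies that are sims.

    `copies` is a list of (weight, is_sim) pairs; the weights are assumed inputs.
    """
    total = sum(weight for weight, _ in copies)
    sim_weight = sum(weight for weight, is_sim in copies if is_sim)
    return sim_weight / total


# Two simulated copies of weight 1 each, one non-simulated copy of weight 2.
price = slip_price([(1.0, True), (1.0, True), (2.0, False)])
```

The number that comes out is only as meaningful as the weights that go in.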
Incidentally, in terms of original modal realism (due to David Lewis), “you” are a concrete unique individual who inhabits exactly one world, but it is unknown which one. Other versions of “you” are your “counterparts”. It is usually not possible to group all your counterparts together and treat them as a single (distributed) being, YOU, because the counterpart relation is not an equivalence relation (it doesn’t partition possible people into neat equivalence classes). As one example, imagine a long chain of possible people whose experiences and memories are indistinguishable from immediate neighbours in the chain (and they are counterparts of their neighbours). But there is a cumulative “drift” along the chain, so that the ends are very different from each other (and not counterparts).
Subjective expectations are meaningless in UDT. So there is no “what we should expect to see”.
A subjective expectation is rather like a bet: it is a commitment of mental resource to modelling certain lines of future observations (and preparing decisions for such a case). If you spend most of your modelling resource on a scenario which doesn’t materialise, this is like losing the bet. So it is reasonable to talk about subjective expectations in UDT; just model them as bets.
Does it have to stay dogmatically committed to Occam’s razor in the face of whatever it sees? If not, how would it arrive at a replacement without using Occam’s razor?
Occam’s razor here is just a method for weighting hypotheses in the prior. It is only “dogmatic” if the prior assigns weights in such an unbalanced way that no amount of evidence will ever shift the weights. If your prior had truly massive weight (e.g., infinite weight) in favour of many worlds, then it would never shift, so that looks dogmatic. But to be honest, I rather doubt this. You weren’t born believing in the many worlds interpretation (or in modal realism) and if you are a normal human being you most likely regarded it as quite outlandish at some point. Then some line of evidence or reasoning caused you to shift your opinion (e.g. because it seemed simpler, or overall a better explanation for physical evidence). If it shifted one way, then considering other evidence could shift it back again.
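To make the “dogmatic prior” point concrete, here is a small sketch of Bayesian updating on a single hypothesis. The numbers are made up for illustration; the only claim is the standard one that a prior of exactly 0 or 1 cannot be moved by any finite likelihood ratio, while any prior strictly in between can.

```python
def posterior(prior, likelihood_ratio):
    """Posterior P(H | E) from prior P(H) and the ratio P(E | H) / P(E | not H)."""
    if prior == 0.0 or prior == 1.0:
        return prior  # a fully dogmatic prior never moves
    prior_odds = prior / (1.0 - prior)
    post_odds = prior_odds * likelihood_ratio
    return post_odds / (1.0 + post_odds)


# Evidence that is 1000 times more likely if the hypothesis is false.
dogmatic = posterior(1.0, 0.001)    # prior of exactly 1: unchanged
confident = posterior(0.99, 0.001)  # prior of 0.99: drops well below one half
```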
In terms of expected utility, it is better for “you” (that is, all linked instances of you) to take the gamble, even if the vast majority of light-cones don’t contain simulations.
That is not the case if the money can be used in a way that has long-term impact.
No it isn’t meaningless: chances simply become operationalised in terms of bets, or other decisions with variable payoff.
This doesn’t give an unambiguous recipe to compute probabilities since it depends on how the results of the bets are accumulated to influence utility. An unambiguous recipe cannot exist since it would have to give precise answers to ambiguous questions such as: if there are two identical simulations of you running on two computers, should they be counted as two copies or one?
Incidentally, in terms of original modal realism (due to David Lewis), “you” are a concrete unique individual who inhabits exactly one world, but it is unknown which one. Other versions of “you” are your “counterparts”. It is usually not possible to group all your counterparts together and treat them as a single (distributed) being, YOU, because the counterpart relation is not an equivalence relation (it doesn’t partition possible people into neat equivalence classes). As one example, imagine a long chain of possible people whose experiences and memories are indistinguishable from immediate neighbours in the chain (and they are counterparts of their neighbours). But there is a cumulative “drift” along the chain, so that the ends are very different from each other (and not counterparts).
UDT doesn’t seem to work this way. In UDT, “you” are not a physical entity but an abstract decision algorithm. This abstract decision algorithm is correlated to different extents with different physical entities in different worlds. This leads to the question of whether some algorithms are more “conscious” than others. I don’t think UDT currently has an answer for this, but neither do other frameworks.
You weren’t born believing in the many worlds interpretation (or in modal realism) and if you are a normal human being you most likely regarded it as quite outlandish at some point. Then some line of evidence or reasoning caused you to shift your opinion (e.g. because it seemed simpler, or overall a better explanation for physical evidence). If it shifted one way, then considering other evidence could shift it back again.
If we think of knowledge as a layered pie, with lower layers corresponding to knowledge which is more “fundamental”, then somewhere near the bottom we have paradigms of reasoning such as Occam’s razor / Solomonoff induction and UDT. Below them lie “human reasoning axioms”, which we cannot formalize because of our limited ability to introspect. In fact the paradigms of reasoning are our current best efforts at formalizing these intuitions. However, when we build an AI we need to use something formal; we cannot just transfer our reasoning axioms to it (at least I don’t know how to do it; it seems to me that every way of doing it would not be “genuine”, since it would be based on a formalism). So, for the AI, UDT (or whatever formalism we use) is the lowest layer. Maybe it’s a philosophical limitation of any AGI, but I doubt it can be overcome and I doubt it’s a good reason not to build an (F)AI.
That is not the case if the money can be used in a way that has long-term impact.
OK, I was using $ here as a proxy for utils, but technically you’re right: the bet should be expressed in utils (as in the general definition of a chance that I gave in my comment). Or if you don’t know how to bet in utils, use another proxy which is a consumptive good and can’t be invested (e.g. chocolate bars or vouchers for a cinema trip this week). A final loophole is time discounting: the real versions of you mostly live earlier than the sim versions of you, so perhaps a chocolate bar for the real “you” is worth many chocolate bars for sim “you”s? However, we covered that earlier in the thread as well: my understanding is that your effective discount rate is not high enough to outweigh the huge numbers of sims.
An unambiguous recipe cannot exist since it would have to give precise answers to ambiguous questions such as: if there are two identical simulations of you running on two computers, should they be counted as two copies or one?
Well this is your utility function, so you tell me! Imagine a hacker is able to get into the simulations and replace pleasant experiences by horrible torture. Does your utility function care twice as much if he hacks both simulations versus hacking just one of them? (My guess is that it does). And this style of reasoning may cover limit cases like a simulation running on a wafer which is then cut in two (think about whether the sims are independently hackable, and how much you care.)
An unambiguous recipe cannot exist since it would have to give precise answers to ambiguous questions such as: if there are two identical simulations of you running on two computers, should they be counted as two copies or one?
Well this is your utility function, so you tell me! Imagine a hacker is able to get into the simulations and replace pleasant experiences by horrible torture. Does your utility function care twice as much if he hacks both simulations versus hacking just one of them? (My guess is that it does).
It wouldn’t be exactly twice but you’re more or less right. However, it has no direct relation to probability. To see this, imagine you’re a paperclip maximizer. In this case you don’t care about torture or anything of the sort: you only care about paperclips. So your utility function specifies a way of counting paperclips but no way of counting copies of you.
From another angle, imagine your two simulations are offered a bet. How should they count themselves? Obviously it depends on the rules of the bet: whether the payoff is handed out once or twice. Therefore, the counting is ambiguous.
What you’re trying to do is write the utility function as a convex linear combination of utility functions associated with different copies of you. Once you accomplish that, the coefficients of the combination can be interpreted as probabilities. However, there is no such canonical decomposition.
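The two-simulation bet can be worked through as a small sketch. The setup is hypothetical: two identical simulations plus one non-simulated copy, each buying a slip at price p that pays 1 util to a sim. The only thing that varies is the payout rule, and that alone changes the break-even price.

```python
from fractions import Fraction


def breakeven_price(paid_sim_copies, paid_real_copies):
    """Slip price p at which winnings and stakes cancel across all paid copies.

    Each simulated copy nets (1 - p) and each non-simulated copy nets -p, so
    sims * (1 - p) - reals * p = 0 gives p = sims / (sims + reals).
    """
    return Fraction(paid_sim_copies, paid_sim_copies + paid_real_copies)


# Payoff handed out twice, once per running simulation.
per_instance = breakeven_price(2, 1)
# Payoff handed out once for the identical pair.
per_pattern = breakeven_price(1, 1)
```

Read as a “chance of being a sim”, the first rule gives 2/3 and the second gives 1/2, and nothing in the setup itself picks one rule over the other.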
As one example, imagine a long chain of possible people whose experiences and memories are indistinguishable from immediate neighbours in the chain (and they are counterparts of their neighbours). But there is a cumulative “drift” along the chain, so that the ends are very different from each other (and not counterparts).
UDT doesn’t seem to work this way. In UDT, “you” are not a physical entity but an abstract decision algorithm. This abstract decision algorithm is correlated to different extents with different physical entities in different worlds. This leads to the question of whether some algorithms are more “conscious” than others. I don’t think UDT currently has an answer for this, but neither do other frameworks.
I think it works quite well with “you” as a concrete entity. Simply use the notion that “your” decisions are linked to those of your counterparts (and indeed, to other agents), such that if you decide in a certain way in given circumstances, your counterparts will decide that way as well. The linkage will be very tight for neighbours in the chain, but diminishing gradually with distance, and such that the ends of the chain are not linked at all. This—I think—addresses the problem of trying to identify what algorithm you are implementing, or partitioning possible people into those who are running “the same” algorithm.
Actually I was speaking of a different problem, namely the philosophical problem of which abstract algorithms should be regarded as conscious (assuming the concept makes sense at all).
Identifying one’s own algorithm is an introspective operation whose definition is not obvious for humans. For AIs the situation is clearer if we assume the AI has access to its own source code.