Your “UDT anti-Newcomb problem” reminds me of an open problem with TDT/UDT that was identified in the early days and still unsolved AFAIK. See the discussion here, but in short, it was pointed out that if most of the agents in the universe/multiverse are going to be UDT agents, they should be willing to play C in a one-shot PD game against a random agent sampled from the universe/multiverse. But human beings are not yet UDT agents, so it might be better to build some other kind of agent so we (or our successors) can play D instead. The approach suggested here looks like it could be relevant to solving that problem. What do you think?
One thing that worries me here is that if A has more computing power than P, it can compute P’s actual output just by simulating it, so with the equations given in this post, you’d have to define how to condition the expectation of U() on statements that are known by the agent to be false. I was hoping that this is not actually a problem that needs to be solved in order to solve decision theory.
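To make the worry concrete, here is a minimal toy sketch (not the equations from the post; the precursor P, the hypothesis weights and the utility numbers are all invented for illustration) of how the conditional expectation E[U() | P() = x] stops being well-defined once the agent can simulate P and rule x out:

```python
# Toy illustration only: once the agent can simulate the precursor P,
# conditioning on "P() = x" for the non-actual x becomes conditioning
# on a statement known to be false (a 0/0 conditional).

def P():
    """The precursor: a small program the agent can fully simulate."""
    return "C"

# A crude "logical prior": weights over hypotheses about P's output,
# plus the utility the agent expects in each hypothetical world.
hypotheses = {
    "C": {"weight": 0.5, "utility": 5.0},
    "D": {"weight": 0.5, "utility": 6.0},
}

def conditional_expected_utility(x, can_simulate_P):
    """E[U() | P() = x] under the toy prior above."""
    weights = {h: v["weight"] for h, v in hypotheses.items()}
    if can_simulate_P:
        actual = P()  # simulation collapses the logical uncertainty
        weights = {h: (1.0 if h == actual else 0.0) for h in weights}
    if weights[x] == 0.0:
        return None  # conditioning on a known-false statement: undefined
    return hypotheses[x]["utility"]

print(conditional_expected_utility("D", can_simulate_P=False))  # 6.0
print(conditional_expected_utility("D", can_simulate_P=True))   # None
```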
Hi Wei Dai, thanks for commenting!
The “game of 3” problem indeed seems related. Apparently my decision theories solve it correctly, if we consider the precursor to be the “AI submitter”. The discussion there is enormous; can you please point out the part about most of the agents in the multiverse being UDT agents?
Regarding the successor simulating the precursor, I think that any way of applying logical uncertainty that solves logical counterfactual mugging should handle this as well, since the precursor’s “Tegmark V” should be preserved by the successor in some sense, and the precursor cannot simulate itself (see also this).
A possible solution is evaluating the expectation value at a depth of analysis (a reference amount of computing resources) corresponding to the precursor’s computing power. This doesn’t mean the successor’s superior computing power is wasted, since it allows for more exhaustive maximization. In the case of equation (4′), the extra power is also used for evaluating π.
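As a rough sketch of what that could look like (my own toy rendering, not equation (4′); the budget numbers and the scoring stub are invented), the estimate of each conditional expectation is capped at the precursor’s depth of analysis, while the successor’s surplus power widens the search over candidate outputs:

```python
# Toy sketch: bound the depth of each conditional-expectation estimate by the
# precursor's computing power, and spend the successor's surplus on searching
# over more candidate outputs. The numbers and the scoring stub are invented.

PRECURSOR_BUDGET = 10_000      # hypothetical depth of analysis of the precursor
SUCCESSOR_BUDGET = 10_000_000  # hypothetical total power of the successor

def estimate_expected_utility(candidate_output, budget):
    """Stand-in for a bounded logical-uncertainty estimate of E[U | A = a].

    The cap at the precursor's budget is the point: the estimator never gets
    enough resources to simply simulate the precursor outright.
    """
    assert budget <= PRECURSOR_BUDGET
    return (len(candidate_output) * 7) % 100  # placeholder score

def choose_output(candidate_outputs):
    # The successor's advantage shows up as a wider (more exhaustive) search,
    # not as a deeper evaluation of any single condition.
    affordable = SUCCESSOR_BUDGET // PRECURSOR_BUDGET
    best, best_value = None, float("-inf")
    for a in candidate_outputs[:affordable]:
        value = estimate_expected_utility(a, budget=PRECURSOR_BUDGET)
        if value > best_value:
            best, best_value = a, value
    return best

print(choose_output(["CooperateBot", "DefectBot", "FairBot"]))
```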
Need to think more on this!
Can you go into some detail on how your decision theories solve the game of 3?
I guess the point about most of the agents in the multiverse being UDT agents wasn’t mentioned explicitly in that discussion, but it’s how I’ve come to think of the problem. Perhaps the most relevant part of that discussion is Eliezer’s direct reply, here.
Your problem can be written as

U = (1/3)(PD_alpha(A, B_1(A)) + PD_alpha(A, B_2(A)))

where B_1 and B_2 are Omega’s players, A is your player, and PD_alpha is the payoff of the first player in the Prisoner’s Dilemma with the (1,5,6) payoff matrix.

Omega’s players end up playing C regardless of A. The agent either understands this, or at least fails to find a strong dependence of the logical probabilities of Omega’s players’ strategies on either their input (the agent’s source) or the conditions in the expectation values it is evaluating (since the conditions are of the form A = X, which is correlated with B_i(A) only in the obvious way, i.e. by determining the input to B_i). Therefore, the highest expectation values will be computed for conditions of the form A = DefectBot, so the agent will defect.
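A worked version of that comparison, assuming the (1,5,6) matrix means DD = 1, CC = 5, DC = 6 for the first player (the CD entry, which I take to be 0, is not given above), and that B_1 and B_2 indeed play C whatever source they are shown:

```python
# Worked comparison, assuming the (1,5,6) matrix means DD=1, CC=5, DC=6 for
# the first player; the CD entry (taken here to be 0) is an assumption.
PD_alpha = {("C", "C"): 5, ("C", "D"): 0, ("D", "C"): 6, ("D", "D"): 1}

def B(agent_source):
    # Omega's players end up playing C regardless of the agent they are shown.
    return "C"

def U(agent_move):
    # The utility as written above: 1/3 of the summed payoffs against B_1, B_2.
    return (PD_alpha[(agent_move, B("source of A"))]
            + PD_alpha[(agent_move, B("source of A"))]) / 3

print(U("C"))  # 10/3 ~ 3.33: the condition A = CooperateBot
print(U("D"))  # 4.0: the condition A = DefectBot wins, so the agent defects
```

Whatever value the missing CD entry takes, and whatever the normalization, D comes out ahead here, since against opponents who play C regardless it earns 6 per game versus 5 for C.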
I see. However, your problem doesn’t seem to be a realistic model of acausal bargaining with agents in other universes, since in such bargaining you know who you’re cooperating with. For example, when an agent considers filling its universe with human utility, it does so in order to cooperate with a human FAI, not in order to cooperate with a paperclip maximizer (which would require a very different strategy, namely filling its universe with paperclips).
It’s more of a model for FAI meeting an alien AI in space. Suppose each side then has the choice of doing an arms buildup or not, and the payoffs for these choices are analogous to PD. (If one side builds up while the other doesn’t, it can attack and conquer the other. If both sides build up, it’s a stalemate and just wastes resources.) BTW, what was your “UDT anti-Newcomb problem” intended to be a model for?
I guess if both sides are using your decision theory, then whether the human FAI plays C or D against the alien AI depends on how much logical correlation the FAI thinks exists between its human designer and the alien AI’s designer, which does make sense (assuming we solve the problem of the FAI simply simulating its designer and thus already knowing the designer’s output).
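A toy version of that dependence (my own simplification, using the same assumed payoffs as above): suppose the FAI believes that with probability p the alien AI’s choice is logically correlated with its own, i.e. mirrors it, and that with probability 1 − p the alien decides independently and defects. Then:

```python
# Toy model of "how much logical correlation": with probability p the alien's
# move mirrors the FAI's move, otherwise the alien defects independently.
# FAI payoffs assumed as before: CC=5, CD=0, DC=6, DD=1.

def expected_payoff(my_move, p):
    mirrored = {"C": 5, "D": 1}[my_move]     # correlated alien copies my move
    vs_defector = {"C": 0, "D": 1}[my_move]  # uncorrelated alien defects
    return p * mirrored + (1 - p) * vs_defector

for p in (0.1, 0.3, 0.9):
    print(p, expected_payoff("C", p), expected_payoff("D", p))
# Under these assumptions C beats D exactly when 5p > 1, i.e. p > 0.2.
```

The exact threshold is an artifact of the toy assumptions; the point is only that the C/D choice turns on the estimated correlation between the designers.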
Makes sense (as a model of FAI meeting an alien AI in space).
Frankly, I didn’t have a specific realistic scenario in mind. I came up with the anti-Newcomb problem as a simple, semi-artificial problem demonstrating the problems with quining in UDT. The reason I started thinking about these problems is that it doesn’t seem that “classical” UDT can be translated into a realistic AGI architecture: UDT takes a finite number of bits and produces a finite number of bits, whereas a realistic AGI has continuous input and output streams. Such an AGI has to somehow take into account a formal specification of its own hardware, and the natural way of introducing such a specification seems to me to be through introducing a precursor, specifically a precursor which is a Solomonoff average over all “theories of physics” containing the formal specification.
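For what it’s worth, a very loose sketch of that last construction (everything here, from the spec placeholder to the enumeration stub, is invented for illustration): the precursor’s prior would weight world-programs by 2^−length, restricted to those that contain the hardware specification.

```python
# Loose sketch: a Solomonoff-style mixture over "theories of physics",
# restricted to those containing the agent's formal hardware specification.
# The spec placeholder and the enumeration stub are invented for illustration.

HARDWARE_SPEC = b"<formal specification of the AGI's hardware>"

def enumerate_world_programs(max_length):
    """Stand-in for enumerating candidate world-programs up to a length bound.

    A real construction would enumerate programs for a fixed universal machine;
    here it is only a stub so that the weighting scheme below is explicit.
    """
    return []

def precursor_prior(max_length=1000):
    """2^-length weights over world-programs that embed the hardware spec."""
    weights = {}
    for program in enumerate_world_programs(max_length):
        if HARDWARE_SPEC in program:           # the theory must contain the spec
            weights[program] = 2.0 ** (-len(program))
    total = sum(weights.values())
    return {p: w / total for p, w in weights.items()} if total else {}
```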