What is the remaining Problem that you’re referring to? Why can’t we apply the formalism of UDT1 to the various examples people seem to be puzzled about and just get the answers out? Or is cousin_it right about the focus having shifted to how human beings ought to reason about these problems?
The anthropic problem was a remaining problem for TDT, although not UDT.
UDT has its own problems, possibly. For example, in the Counterfactual Mugging, it seems that you want to be counterfactually mugged whenever Omega has a well-calibrated distribution and has a systematic policy of offering high-payoff CMs according to that distribution, even if your own prior has a different distribution. In other words, the key to the CM isn’t your own distribution, it’s Omega’s. And it’s not possible to interpret UDT as epistemic advice, which leaves anthropic questions open. So I haven’t yet shifted to UDT outright.
(The reason I did not answer your question earlier was that it seemed to require a response at greater length than the above.)
Well, you’re right in the sense that I can’t understand the example you gave. (I waited a couple of days to see if it would become clear, but it didn’t.) But the rest of the response is helpful.
Did he ever get around to explaining this in more detail? I don’t remember reading a reply to this, but I think I’ve just figured out the idea: Suppose you get word that Omega is coming to the neighbourhood and going to offer counterfactual muggings. What sort of algorithm do you want to self-modify into? You don’t know what CMs Omega is going to offer; all you know is that it will offer odds according to its well-calibrated prior. Thus, it has higher expected utility to be a CM-accepter than a CM-rejecter, and even a CDT agent would want to self-modify.
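To make the self-modification argument concrete, here is a minimal sketch. It assumes a hypothetical Omega policy of offering only those CMs that are favourable under its own calibrated probability of heads; the candidate bets, payoffs and prices are invented for illustration and do not come from the thread.

```python
def omega_offers(candidates):
    """Hypothetical policy: Omega offers a CM only when it is favourable
    under its own (assumed well-calibrated) probability of heads."""
    return [
        (q_heads, payoff, price)
        for q_heads, payoff, price in candidates
        if q_heads * payoff > (1 - q_heads) * price
    ]


def accepter_expected_gain(offers):
    """Expected gain of an agent that always pays up when asked,
    evaluated under Omega's calibrated probabilities."""
    return sum(
        q_heads * payoff - (1 - q_heads) * price
        for q_heads, payoff, price in offers
    )


# Illustrative candidates: (Omega's P(heads), payoff on heads, price on tails).
candidates = [
    (0.5, 10000, 100),
    (0.2, 300, 100),
    (0.9, 50, 100),
    (0.1, 500, 100),
]

offers = omega_offers(candidates)
accepter_gain = accepter_expected_gain(offers)
rejecter_gain = 0.0  # a CM-rejecter never pays and is never rewarded

print(len(offers), accepter_gain > rejecter_gain)
```

Under these assumptions every offered CM has positive expected value by construction, so the accepter comes out ahead of the rejecter whatever the agent's own prior says about the individual coins.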
I don’t think that’s a problem for UDT, though. What UDT will compute when asked to pay is the expected utility under its prior of paying up when Omega asks it to; thus, the condition for UDT to pay up is NOT
prior probability of heads * Omega's offered payoff > prior of tails * Omega's price
but
prior of (heads and Omega offers a CM for this coin) * payoff > prior of (tails and CM) * price.
In other words, UDT takes the quality of Omega’s predictions into account and acts as if updating on them (the same way you would update if Omega told you who it expects to win the next election, at 98% probability).
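The two conditions above can come apart, which a small numerical sketch may help show. The function names and all numbers are assumptions made up for illustration: the agent's own prior gives heads only 0.1, but it also believes Omega is far more likely to offer a CM on a coin that will land heads.

```python
def naive_should_pay(p_heads, payoff, price):
    """The condition the comment says UDT does NOT use:
    it ignores the fact that Omega chose to offer this CM."""
    return p_heads * payoff > (1 - p_heads) * price


def udt_should_pay(p_heads_and_offer, p_tails_and_offer, payoff, price):
    """The condition the comment attributes to UDT: it weighs the joint
    prior probabilities of (outcome and Omega offers a CM for this coin)."""
    return p_heads_and_offer * payoff > p_tails_and_offer * price


# Illustrative numbers only.
p_heads = 0.1                # agent's prior that the coin lands heads
p_offer_given_heads = 0.9    # agent's prior that Omega offers a CM if heads
p_offer_given_tails = 0.05   # agent's prior that Omega offers a CM if tails
payoff, price = 100, 50

p_heads_and_offer = p_heads * p_offer_given_heads
p_tails_and_offer = (1 - p_heads) * p_offer_given_tails

print(naive_should_pay(p_heads, payoff, price))
print(udt_should_pay(p_heads_and_offer, p_tails_and_offer, payoff, price))
```

With these numbers the naive rule refuses to pay while the joint-probability rule pays, because the offer itself is evidence about the coin. That is the sense in which the agent acts as if updating on Omega's prediction.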
CDT agents, as usual, will actually want to self-modify into a UDT agent whose prior equals the CDT agent’s posterior [ETA: wait, sorry, no, they won’t act as if they can acausally control other instances of the same program, but they will self-modify so as to make future instances of themselves (which obviously they control causally) act in a way that maximizes EU according to the agent’s present posterior, and that’s what we need here], and will use the second formula above accordingly. They don’t want to be a general CM-rejecter, but they think they can do even better than being a general CM-accepter by refusing to pay up if, at the time of self-modification, they assigned low probability to tails, even conditional on Omega offering them a CM.
He never explained further, and actually I still don’t quite understand the example even given your explanation. Maybe you can reply directly to Eliezer’s comment so he can see it in his inbox, and let us know if he still thinks it’s a problem for UDT?
Hi, this is the 2-week reminder that you haven’t posted your longer response yet. :)