Thank you so much for this. I really don’t feel like I understand decision theories very well, so this was helpful. But I’d like to ask a couple of questions that I’ve had for a while and that this didn’t really answer.
Why does evidential decision theory necessarily fail the Smoking Lesion problem? That link is technically Solomon’s problem, not the Smoking Lesion problem, but it’s related. If p(Cancer | smoking, lesion) = p(Cancer | not smoking, lesion), why is evidential decision theory forbidden from using these probabilities? Evidential decision theory makes a lot of intuitive sense to me and I don’t really see why it’s demonstrably wrong.
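To pin down what I mean, here’s a rough sketch (with made-up probabilities and utilities, purely illustrative) of the calculation a naive EDT agent seems forced into: it conditions only on its own action, and choosing to smoke is evidence that it has the lesion. My question is why it can’t also condition on lesion status as above.

```python
# Minimal sketch of naive EDT on the Smoking Lesion problem.
# All numbers are hypothetical; the lesion causes both a taste for smoking and cancer.

P_LESION = 0.5
P_SMOKE_GIVEN_LESION = 0.9      # lesion makes smoking likely
P_SMOKE_GIVEN_NO_LESION = 0.1
P_CANCER_GIVEN_LESION = 0.8     # cancer depends only on the lesion, not on smoking
P_CANCER_GIVEN_NO_LESION = 0.1

U_SMOKE = 100        # utility of enjoying smoking
U_CANCER = -10_000   # disutility of cancer

def p_lesion_given_action(smoke: bool) -> float:
    """P(lesion | action) via Bayes -- the 'evidence' a naive EDT agent updates on."""
    p_a_l = P_SMOKE_GIVEN_LESION if smoke else 1 - P_SMOKE_GIVEN_LESION
    p_a_nl = P_SMOKE_GIVEN_NO_LESION if smoke else 1 - P_SMOKE_GIVEN_NO_LESION
    return p_a_l * P_LESION / (p_a_l * P_LESION + p_a_nl * (1 - P_LESION))

def edt_value(smoke: bool) -> float:
    """Naive EDT: expected utility conditioned only on the chosen action."""
    p_l = p_lesion_given_action(smoke)
    p_cancer = p_l * P_CANCER_GIVEN_LESION + (1 - p_l) * P_CANCER_GIVEN_NO_LESION
    return (U_SMOKE if smoke else 0) + p_cancer * U_CANCER

print(edt_value(True), edt_value(False))   # -7200.0 vs -1700.0: smoking 'predicts' cancer
# If the agent could instead condition on its lesion status directly, then
# p(cancer | smoke, lesion) == p(cancer | not smoke, lesion) and smoking wins.
```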
It seems to me like TDT can pretty easily fail a version of Newcomb’s problem. (Maybe everyone knows this already and I just haven’t seen it anywhere.) Suppose there is a CDT AI that, over the course of a year, self-modifies to become a TDT agent. Suppose also that this AI is presented with a variation of Newcomb’s problem. The twist is this: Omega’s placement of money into the opaque box is determined not by the decision theory you currently operate by, but by the one you operated by a year ago. As such, Omega will leave $0 in the box. But TDT, as I understand it, acts as though it controls all nodes of the decision-making process simultaneously, which makes it vulnerable to processes that take extended periods of time, during which the agent’s decision theory may well have changed. You can probably see where this is going: the TDT/former-CDT agent one-boxes and finds $0, which is inferior to the $1000 it could have taken by two-boxing. This isn’t really a rigorous critique of TDT, since it’s obviously predicated on the agent not having been a TDT at a prior point, but it was a question I thought about in the context of a self-modifying CDT AI.
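To make the scenario concrete, here’s a rough toy model of the delayed-prediction variant I’m describing. The function names and payoffs are just illustrative, not a formal model of TDT:

```python
# Omega fills the opaque box based on what the agent's *year-old* decision
# theory would have done; the agent has since self-modified from CDT to a
# (naive) TDT. Hypothetical payoffs throughout.

BOX_B_PRIZE = 1_000_000   # opaque box, filled only if the prediction is "one-box"
BOX_A_PRIZE = 1_000       # transparent box, always present

def cdt_choice() -> str:
    # CDT two-boxes: its action can't causally change already-filled boxes.
    return "two-box"

def naive_tdt_choice() -> str:
    # The naive TDT in this scenario treats the prediction as correlated
    # with its *current* algorithm, so it one-boxes.
    return "one-box"

def omega_fills_box_b(past_theory_choice: str) -> int:
    # Omega predicts from the year-old decision theory, not the current one.
    return BOX_B_PRIZE if past_theory_choice == "one-box" else 0

def payoff(current_choice: str, box_b_contents: int) -> int:
    return box_b_contents + (BOX_A_PRIZE if current_choice == "two-box" else 0)

box_b = omega_fills_box_b(cdt_choice())    # CDT ancestor => box B is empty
print(payoff(naive_tdt_choice(), box_b))   # 0     (the failure described above)
print(payoff("two-box", box_b))            # 1000  (what it could have taken)
```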
It’s actually tough to predict what EDT would do, since it depends on picking the right reference class for yourself, and we have no idea how to formalize that choice. But the explanations of why EDT would one-box on Newcomb’s Problem appear isomorphic to explanations of why EDT would forgo smoking in the Smoking Lesion problem, so it appears that a basic implementation would fail one or the other.
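As a quick illustration of that isomorphism (again with made-up numbers of my own, not anything from the post): the same “condition on your own action” rule produces both verdicts.

```python
# One generic EDT rule applied to both problems. Probabilities are hypothetical.

def edt_pick(actions, p_good_given_action, utility):
    """Generic EDT: pick the action with the best utility conditional on taking it."""
    return max(actions, key=lambda a: utility(a, p_good_given_action[a]))

# Newcomb: 'good' = box B is full. One-boxing is strong evidence that it is.
newcomb = edt_pick(
    ["one-box", "two-box"],
    {"one-box": 0.99, "two-box": 0.01},   # P(box B full | action)
    lambda a, p: p * 1_000_000 + (1_000 if a == "two-box" else 0),
)

# Smoking Lesion: 'good' = no cancer. Abstaining is evidence against the lesion.
lesion = edt_pick(
    ["smoke", "abstain"],
    {"smoke": 0.27, "abstain": 0.83},     # P(no cancer | action)
    lambda a, p: (100 if a == "smoke" else 0) + (1 - p) * -10_000,
)

print(newcomb, lesion)   # one-box, abstain -- same rule, both verdicts
```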
That shouldn’t happen. If TDT knows that the boxes are being filled by simulating a CDT algorithm (even if that algorithm was its ancestor), then it will two-box.
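Roughly: the box contents were fixed by simulating the old CDT algorithm, so they carry no logical dependence on TDT’s current choice, and two-boxing dominates. A toy sketch, with illustrative numbers only:

```python
# Why TDT two-boxes once it knows the box was filled by simulating its CDT ancestor.

BOX_A = 1_000

def cdt_sim_choice() -> str:
    return "two-box"   # the ancestor algorithm Omega actually simulated

box_b = 1_000_000 if cdt_sim_choice() == "one-box" else 0   # fixed at 0

def tdt_value(choice: str) -> int:
    # TDT's counterfactual varies only its own choice; box_b is unaffected
    # because it depends on the CDT simulation, not on TDT's algorithm.
    return box_b + (BOX_A if choice == "two-box" else 0)

print(max(["one-box", "two-box"], key=tdt_value))   # two-box
```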