No, I mean I think CDT can one-box within the regular Newcomb’s problem situation, if its reasoning capabilities are sufficiently strong. In detail: here and in the thread here.
This might not satisfactorily resolve your confusion, but: CDT is defined by the fact that it has incorrect causal graphs. If it has correct causal graphs, then it’s not CDT. Why bother talking about a “decision theory” that is arbitrarily limited to incorrect causal graphs? Because that’s the decision theory that academic decision theorists like to talk about and treat as the default. Why did academic decision theorists never realize that their causal graphs were wrong? No one has a very good model of that, but check out Wei Dai’s related speculation here. Note that if you define causality in a technical Markovian way and use Bayes nets, then there is no difference between CDT and TDT.
I used to get annoyed because CDT with a good enough world model should clearly one-box, yet people stipulated that it wouldn’t; only later did I realize that it’s mostly a rhetorical thing, and that no one thinks an AGI you actually implemented with “CDT” would be as dumb as academia’s/LessWrong’s version of CDT.
If I’m wrong about any of the above, then someone please correct me, as this is relevant to FAI strategy.
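To make the causal-graph point above concrete, a minimal sketch follows; nothing in it comes from the linked posts, the payoffs are the standard Newcomb amounts, and the function names are invented for illustration. It compares expected utilities under the graph usually ascribed to CDT, where box B’s contents are probabilistically independent of the current action, with a graph in which Omega’s prediction tracks the decision.

```python
BOX_A, BOX_B = 1_000, 1_000_000  # standard Newcomb payoffs (assumed)

def eu_cdt_graph(action, p_filled):
    """Graph ascribed to CDT: P(box B is full) = p_filled, causally independent of the action."""
    expected_b = p_filled * BOX_B
    return expected_b if action == "one-box" else expected_b + BOX_A

def eu_coupled_graph(action, accuracy=1.0):
    """Graph in which Omega's prediction tracks the agent's decision with the given accuracy."""
    p_filled = accuracy if action == "one-box" else 1.0 - accuracy
    expected_b = p_filled * BOX_B
    return expected_b if action == "one-box" else expected_b + BOX_A

# Whatever prior you plug in, the CDT graph says two-boxing gains exactly BOX_A...
for p in (0.0, 0.5, 1.0):
    assert eu_cdt_graph("two-box", p) - eu_cdt_graph("one-box", p) == BOX_A
# ...while the coupled graph favors one-boxing for any accuracy above roughly 50.05%.
assert eu_coupled_graph("one-box", 0.99) > eu_coupled_graph("two-box", 0.99)
```

Which of the two graphs is the legitimate causal model of Newcomb’s problem is exactly what the rest of the thread argues about.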
No, I mean I think CDT can one-box within the regular Newcomb’s problem situation, if its reasoning capabilities are sufficiently strong. In detail: here and in the thread here.
No, if you have an agent that is one-boxing, then either it is not a CDT agent or the game it is playing is not Newcomb’s problem. More specifically, in your first link you describe a game that is not Newcomb’s problem, and in the second link you describe an agent that does not implement CDT.
More specifically, in your first link you describe a game that is not Newcomb’s problem, and in the second link you describe an agent that does not implement CDT.
It would be a little more helpful, although probably not quite as cool-sounding, if you explained in what way the game is not Newcomb’s in the first link, and the agent not a CDT in the second. AFAIK, the two links describe exactly the same problem and exactly the same agent, and I wrote both comments.
It would be a little more helpful, although probably not quite as cool-sounding,
That doesn’t seem to make helping you appealing.
if you explained in what way the game is not Newcomb’s in the first link,
The agent believes that it has a 50% chance of being in an actual Newcomb’s problem and a 50% chance of being in a simulation which will be used to present another agent with a Newcomb’s problem some time in the future.
and the agent not a CDT in the second.
Orthonormal already explained this in the context.
Yes, I have this problem, working on it. I’m sorry, and thanks for your patience!
The agent believes that it has a 50% chance of being in an actual Newcomb’s problem and a 50% chance of being in a simulation which will be used to present another agent with a Newcomb’s problem some time in the future.
Yes, except for s/another agent/itself/. In what way is this not a correct description of a pure Newcomb’s problem from the agent’s point of view? This is my original, still-unanswered question.
Note: in the usual formulations of Newcomb’s problem for UDT, the agent knows exactly that—it is called twice, and when it is running it does not know which of the two calls is being evaluated.
Orthonormal already explained this in the context.
I answered his explanation in the context, and he appeared to agree. His other objection seems to be based on a mistaken understanding.
This is worth writing up in its own post: a CDT agent with a non-self-centered utility function (like a paperclip maximizer) and a certain model of anthropics (in which, if it knows it’s being simulated, it views itself as possibly within the simulation), when faced with a Predictor that predicts by simulating (which is not always the case), one-boxes on Newcomb’s Problem.
This is a novel and surprising result relative to the academic literature on CDT, not the prediction academic decision theorists expected. But it seems to me that if you violate any of the conditions above, one-boxing collapses back into two-boxing; and furthermore, such an agent won’t cooperate in the Prisoner’s Dilemma against a CDT agent with an orthogonal utility function. That much, at least, follows inescapably from the independence assumption.
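As a rough sanity check of the conditions listed above, the toy calculation below is not from the thread itself: the 50/50 anthropic split is taken from the earlier comments, the payoffs are the standard Newcomb amounts, and q stands for the CDT agent’s credence that the other instance of it one-boxes, which CDT treats as causally independent of its own choice. The second function is a crude stand-in for the “hedons in my personal future” utility raised in the reply that follows.

```python
BOX_A, BOX_B = 1_000, 1_000_000  # standard Newcomb payoffs (assumed)

def eu_paperclipper(action, q):
    """Utility = payout delivered in the real world, whichever copy of the agent earns it."""
    if action == "one-box":
        sim  = BOX_B + (1 - q) * BOX_A  # I'm the sim: my choice fills box B for the real copy
        real = q * BOX_B                # I'm real: box B is full only if the sim one-boxed
    else:
        sim  = (1 - q) * BOX_A          # I'm the sim: box B stays empty for the real copy
        real = q * BOX_B + BOX_A
    return 0.5 * sim + 0.5 * real

def eu_sim_indifferent(action, q):
    """Crude stand-in for 'hedons in my personal future': the simulation branch pays nothing."""
    real = q * BOX_B if action == "one-box" else q * BOX_B + BOX_A
    return 0.5 * 0 + 0.5 * real

for q in (0.0, 0.5, 1.0):
    # The paperclipper one-boxes no matter what it expects the other copy to do...
    assert eu_paperclipper("one-box", q) > eu_paperclipper("two-box", q)
    # ...while the simulation-indifferent agent goes back to two-boxing.
    assert eu_sim_indifferent("two-box", q) > eu_sim_indifferent("one-box", q)
```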
And as I replied there, this depends on its utility function being such that “filling the box for my non-simulated copy” has utility comparable to “taking the extra box when I’m not simulated”. There are utility functions for which this works (e.g. maximizing paperclips in the real world) and utility functions for which it doesn’t (e.g. maximizing hedons in my personal future, whether I’m being simulated or not), and Omega can slightly change the problem (simulate an agent with the same decision algorithm as X but a different utility function) in a way that makes CDT two-box again. (That trick wouldn’t stop TDT/UDT/ADT from one-boxing.)
Omega can slightly change the problem (simulate an agent with the same decision algorithm as X but a different utility function)
I think you missed my point.
This is irrelevant. The agent is actually outside, deciding what to do in Newcomb’s problem, but only we know this; the agent itself doesn’t. All the agent knows is that Omega always predicts correctly, which means the agent can model Omega as a perfect simulator. The actual method Omega uses to make its predictions does not matter; the world would look the same to the agent regardless.
Unless Omega predicts without simulating: for instance, this formulation of UDT can be formally proved to one-box without running a simulation.
Errrr. The agent does not simulate anything in my argument. The agent has a “mental model” of Omega in which Omega is a perfect simulator; it’s about the representation of the problem within the agent’s mind.
In your link, Omega (the function U()) is a perfect simulator. It calls the agent function A() twice: once to get its prediction, and once for the actual decision.
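For readers without the link handy, here is a minimal reconstruction of that formulation (a paraphrase in code, not the linked post’s actual source): Omega is the world program U(), and it predicts by calling the agent function A() itself.

```python
def A():
    """The agent's decision procedure: return 1 to one-box, 2 to two-box.
    From the inside, a call to A() cannot tell whether it is the prediction run
    or the run whose output actually takes the boxes."""
    return 1

def U():
    prediction = A()                              # first call: Omega's "simulation" of the agent
    box_b = 1_000_000 if prediction == 1 else 0   # box B is filled iff one-boxing was predicted
    action = A()                                  # second call: the agent's actual decision
    return box_b if action == 1 else box_b + 1_000

print(U())  # 1000000 for this hard-coded one-boxer; a hard-coded two-boxer would get 1000
```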
The problem would work just as well if the first call went not to A directly but to an oracle query of whether A()=1. There are ways of predicting that aren’t simulation, and if that’s how Omega predicts, then your idea falls apart.
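And a variant along the lines of the previous comment (again only a sketch; the predictor below is a bare stub standing in for a proof search or oracle, which is the point: nothing forces the prediction to be a call to A()). It reuses A() from the sketch above.

```python
def omega_predicts_one_boxing():
    """Stand-in for a non-simulating predictor, e.g. an oracle answering "is A() = 1?"
    by proving facts about A's source rather than by running the agent."""
    return True  # assumed settled by the oracle; no simulation of A takes place

def U_oracle():
    box_b = 1_000_000 if omega_predicts_one_boxing() else 0
    action = A()                                  # the only call to A(): the real decision
    return box_b if action == 1 else box_b + 1_000
```

In this version there is no prediction-time call to A(), so the “maybe I’m the simulation” reasoning that let the CDT agent one-box never gets started.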