In this new problem, a CDT agent will decide to adopt a strategy that causes it to one-box (i.e., it will precommit).
Similarly, if a CDT agent faces no immediate decision problem but has the capability to self-modify, it will modify itself into an agent that implements a new decision theory (call it, say, CDT++). The self-modified agent will then behave as if it implements a reflective decision theory (UDT, TDT, etc.) with respect to all influence over the universe after the time of self-modification, but like CDT with respect to all influence before that time. Roughly, it will behave as if it had made all the correct 'precommitments' at that moment. It will then cooperate with equivalent agents in prisoner's dilemmas and one-box on future Newcomb's problems, unless Omega says, "Oh, and I made the prediction and filled the boxes back before you self-modified away from CDT; I'm just showing them to you now."
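A minimal toy sketch of the behaviour described above, with hypothetical names (nothing here comes from an existing library): the self-modified agent one-boxes whenever Omega's prediction was made after the self-modification, and falls back to CDT's two-boxing when the prediction predates it.

```python
BOX_A = 1_000           # transparent box, always contains $1,000
BOX_B_FULL = 1_000_000  # opaque box, filled only if Omega predicts one-boxing

def cdt_choice(prediction_time, modification_time=None):
    """Plain CDT two-boxes: the boxes are already filled,
    so taking both causally dominates."""
    return "two-box"

def cdt_plus_plus_choice(prediction_time, modification_time):
    """The self-modified agent acts reflectively for all influence after
    the modification time, and like CDT for influence before it."""
    if prediction_time >= modification_time:
        return "one-box"
    return "two-box"

def payoff(choice, omega_predicts_one_box):
    b = BOX_B_FULL if omega_predicts_one_box else 0
    return b if choice == "one-box" else b + BOX_A

def run_newcomb(agent, prediction_time, modification_time):
    # Omega is modelled as a perfect predictor: it predicts by
    # running the agent's own decision rule.
    predicted_one_box = agent(prediction_time, modification_time) == "one-box"
    actual = agent(prediction_time, modification_time)
    return payoff(actual, predicted_one_box)

# Prediction made after self-modification: the agent one-boxes and wins big.
print(run_newcomb(cdt_plus_plus_choice, prediction_time=5, modification_time=0))   # 1000000
# Prediction made before self-modification: it behaves like CDT.
print(run_newcomb(cdt_plus_plus_choice, prediction_time=-5, modification_time=0))  # 1000
# Plain CDT always two-boxes and gets only the transparent box.
print(run_newcomb(cdt_choice, prediction_time=5, modification_time=0))             # 1000
```

The only moving part is the comparison of the prediction time against the modification time, which is exactly the "correct precommitments as of that moment" behaviour: the modified agent cannot do better than CDT on influence that was already fixed before it modified itself.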
A CDT agent will do this only if it can be proven that it cannot make worse decisions after the modification than it would have made without modifying itself. I actually tried to find literature on this a while back but couldn't find any, so I assigned a very low probability to the possibility that this could be proven. Since you seem to be familiar with the topic, do you know of any?
I am somewhat familiar with the topic, but note that I am most familiar with the work that has already moved past CDT (i.e., that considers CDT irrational and inferior to a reflective decision theory along the lines of TDT or UDT). As far as I'm aware, nobody has yet gotten around to formally writing up a "what CDT self-modifies to" paper (I wish they would!). It would be interesting to see what someone starting from the assumption that CDT is sane could come up with. Again, I'm unfamiliar with such attempts, but in this case that is far weaker evidence about whether they exist.
I wasn't asking for a concrete alternative to CDT. If anything, I'm interested in a proof that such a decision theory can possibly exist, because trying to find an alternative when you haven't proven this seems like a task with a very low chance of success.
I wasn't offering alternatives; I was looking specifically at what CDT will inevitably self-modify into (which is itself not optimal, just what CDT will do). I mentioned alternatives to convey that what I say on the subject, and what I refer to, would require inferential steps that you have indicated you are unlikely to make.
Incidentally, proving that CDT will (given the option) modify itself into something else is a very different thing from proving that there is a better alternative to CDT. Either could be true without implying the other.
That is true. And if you cannot prove that such a decision theory exists, then CDT modifying itself is not necessarily the correct answer to the meta-Newcomb problem, correct?