Thanks for the post! Your problems look a little similar to Wei’s 2TDT-1CDT, but much simpler. Not sure about the other decision theory folks, but I’m quite puzzled by these problems and don’t see any good answer yet.
I’ve looked a bit at that thread, and the related follow-ups, and my head is now really spinning. You are correct that my problems were simpler!
My immediate best guess on 2TDT-1CDT is that the human player would do better to submit a simple defect-bot (rather than either CDT or TDT), irrespective of whether the player themselves is running TDT or CDT. If the player has to submit their own decision algorithm (source code) instead of a bot, then we get into a colossal tangle about “who defects first” and “whose decision is logically prior to whose”. It is also unclear whether the TDT agents will threaten to defect if they detect that the submitted agent may defect, or has already self-modified into unconditionally defecting, or whether they will just defect unconditionally anyway to even the score (e.g. through some form of utility trading / long-term consequentialism principle that TDT has to beat CDT in the long run, therefore it had better just get on and beat CDT wherever possible...).
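To make the defect-bot guess concrete, here is a minimal sketch of a one-shot Prisoner's Dilemma. The payoff numbers (5, 3, 1, 0) are the usual textbook values and are my assumption, not something fixed by 2TDT-1CDT, and the names `PAYOFF`, `defect_bot` and `payoff` are purely illustrative. It does not model TDT or CDT at all; it only treats the opponent's move as a fixed input.

```python
# Assumed one-shot Prisoner's Dilemma payoffs (not taken from the thread):
# (my_move, their_move) -> my payoff, with 5 > 3 > 1 > 0.
PAYOFF = {
    ("C", "C"): 3,  # mutual cooperation
    ("C", "D"): 0,  # I cooperate, they defect
    ("D", "C"): 5,  # I defect, they cooperate
    ("D", "D"): 1,  # mutual defection
}


def defect_bot() -> str:
    """Hypothetical submitted bot: defects unconditionally."""
    return "D"


def payoff(my_move: str, their_move: str) -> int:
    """Look up my payoff for one round."""
    return PAYOFF[(my_move, their_move)]


# Compare the defect-bot with a cooperator, holding the opponent's move fixed.
for their_move in ("C", "D"):
    print(their_move, payoff(defect_bot(), their_move), payoff("C", their_move))
```

For either fixed opponent move the defect-bot scores higher than a cooperator. The tangle described above is exactly the question of whether the opponent's move really is fixed, or whether it depends on what gets submitted.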
In short, I observe I am confused.
With all this logical priority vs temporal priority, and long-term consequences feeding into short-term utilities, I’m reminded of the following from HPMOR Chapter 61:
There was a narrowly circulated proverb to the effect that only one Auror in thirty was qualified to investigate cases involving Time-Turners; and that of those few, the half who weren’t already insane, soon would be.
Thanks for this, and for the reference. I’ll have a look at 2TDT-1CDT to see if there are any insights there which could resolve these problems. I’ve got a couple of ideas myself, but will check up on the other work.
Here’s another similar problem; see also the solution.