If you ask a mathematician to find 0x + 1 for x = 3, they will answer 1. If you then ask the mathematician to find the 10th root of the factorial of the eighth Mersenne prime, multiplied by zero, plus one, they will answer 1. You may protest they didn’t actually calculate the eighth Mersenne prime, find its factorial, or calculate the tenth root of that, but you can’t deny they gave the right answer.
If you put CDT in a room with a million dollars in Box A and a thousand dollars in Box B (no Omega, just the boxes), and give it the choice of either A or both, it will take both, and walk away with one million and one thousand dollars. If you explain this whole Omega thing to CDT, then put it in the room, it will notice that it doesn’t actually need to calculate the eighth Mersenne prime, etc, because when Omega leaves you are effectively multiplying by zero—all the fancy simulating is irrelevant because the room is just two boxes that may contain money, and you can take both.
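As a rough illustration of the "multiplying by zero" point, here is a minimal sketch of the calculation CDT is described as making. The function names and the `prob_a_full` parameter are hypothetical; only the dollar amounts come from the thread. Because CDT treats the contents of Box A as fixed by the time it chooses, the probability that the box is full drops out of the comparison.

```python
BOX_A_FULL = 1_000_000  # amount Box A may contain (from the thread)
BOX_B = 1_000           # amount always in Box B (from the thread)

def payoff(action: str, box_a_contents: int) -> int:
    """Money collected for an action, given what is already in Box A."""
    if action == "one-box":
        return box_a_contents
    if action == "two-box":
        return box_a_contents + BOX_B
    raise ValueError(f"unknown action: {action}")

def cdt_choice(prob_a_full: float) -> str:
    """CDT as described above: the action cannot change prob_a_full."""
    def expected(action: str) -> float:
        return (prob_a_full * payoff(action, BOX_A_FULL)
                + (1 - prob_a_full) * payoff(action, 0))
    return max(["one-box", "two-box"], key=expected)
```

Under these assumptions, `cdt_choice` returns "two-box" for every value of `prob_a_full`, which is the dominance argument in code form.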
Yes, CDT doesn’t think it’s playing Newcomb’s Puzzle, it thinks it’s playing “enter a room with money”.
You’re completely right, except that (assuming I understand you correctly) you’re implying CDT only thinks it’s playing “room with money”, while in reality it would be playing Newcomb.
And that’s the issue; in reality Newcomb cannot exist, and if in theory you think you’re playing something, you are playing it.
Perfect sense. Theorising that CDT would lose because it’s playing a different game is uninteresting as a thought experiment; if I theorise that any decision theory is playing a different game it will also lose; this is not a property of CDT but of the hypothetical.
Let’s turn to the case of playing in reality, as it’s the interesting one.
If you grant that Newcomb paradoxes might exist in reality, then there is a real problem: CDT can’t distinguish between free money boxes and Newcomb paradoxes, so when it encounters a Newcomb situation it underperforms.
If you claim Newcomb cannot exist in reality, then this is not a problem with CDT. I (and hopefully others, though I shan’t speak for them) would accept that this is not a problem with CDT if it is shown that Newcomb’s is not possible in real life—but we are arguing against you here because we think Newcomb is possible. (Okay, I did speak for them).
I disagree on two points: one, I think a simulator is possible (that is, Omega’s impossibility comes from other powers we’ve given it; we can remove those powers and weaken Omega to a fits-in-reality definition without losing prediction), and two, I don’t think the priors-and-payoffs approach to an empirical predictor is correct (for game-theoretic reasons which I can explicate if you’d like, but if it’s not the point of contention it would only distract).
CDT can’t distinguish between free money boxes and Newcomb paradoxes
No, CDT can in fact distinguish very well. It always concludes that the money is there, and it is always right, because it never encounters Newcomb.
we think Newcomb is possible.
To clarify: You are talking about actual Newcomb with an omniscient being, yes? Because in that case, I think several posters have already stated they deem this impossible, and Nozick agrees.
If you’re talking about empirical Newcomb, that certainly is possible, but it is impossible to do better than CDT without choosing differently in other situations, because if you’ve acted like CDT in the past, Omega is going to assume you are CDT, even if you’re not.
I disagree on two points: one, I think a simulator is possible (that is, Omega’s impossibility comes from other powers we’ve given it; we can remove those powers and weaken Omega to a fits-in-reality definition without losing prediction)
I agree on the “we can remove those powers and weaken Omega to a fits-in-reality definition without losing prediction” part, but this will change what the “correct” answer is. For example, you could substitute Omega with a coin toss and repeat the game if Omega is wrong. This is still a one-time problem, because Omega is a coin and therefore has no memory, but CDT, which would two-box in empirical Newcomb, one-boxes in this case and takes the $1,000,000.
and two, I don’t think the priors-and-payoffs approach to an empirical predictor is correct (for game-theoretic reasons which I can explicate if you’d like, but if it’s not the point of contention it would only distract).
I don’t think this is the point of contention, but after we’ve settled that, I would be interested in hearing your line of thought on this.
To clarify: You are talking about actual Newcomb with an omniscient being, yes? Because in that case, I think several posters have already stated they deem this impossible, and Nozick agrees.
How about the version where agents are computer programs, and Omega runs a simulation of the agent facing the choice, observes its behavior, and fills the boxes accordingly?
If you are a computer program that can be simulated, then the problem also becomes trivial, because either the simulation can be incorrect, in which case Omega is not omniscient, or the simulation cannot be incorrect, in which case you don’t have a choice.
If the simulation is correct, a program that chooses to one-box will get $1,000,000, and a program that chooses to two-box will get $1,000. I wouldn’t call that “not having a choice”.
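A minimal sketch of that version, assuming an agent is just a deterministic program with no inputs (the `Agent` type and the function names are made up for illustration):

```python
from typing import Callable

# Hypothetical modelling choice: an agent is a zero-argument program
# that returns the action it takes.
Agent = Callable[[], str]

def one_boxer() -> str:
    return "one-box"

def two_boxer() -> str:
    return "two-box"

def play_newcomb(agent: Agent) -> int:
    # Omega first runs a simulation of the agent and fills Box A
    # according to what it observes.
    predicted = agent()
    box_a = 1_000_000 if predicted == "one-box" else 0
    box_b = 1_000
    # Only afterwards does the real agent face the (already filled) boxes.
    actual = agent()
    return box_a if actual == "one-box" else box_a + box_b
```

With a deterministic agent the simulation cannot be wrong, and `play_newcomb(one_boxer)` pays 1,000,000 while `play_newcomb(two_boxer)` pays 1,000. Nothing in the sketch runs backwards in time: the prediction is computed before the boxes are filled.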
So if a program is programmed to print zeroes on a screen, and another program is programmed to print ones, you would say both programs chose their number?
I hope you don’t, because that would be an insane statement. However if you disagree with this, I fail to see how you could be a computer program that can always be correctly simulated, but still has a choice.
You seem to be fighting the hypothetical, but I don’t know if you’re doing it out of mistrust or because some background would be helpful. I’ll assume helpful background would be helpful… :-)
A program could be designed to (1) search for relevant sensory data within a larger context, (2) derive a mixed strategy given the input data, (3) get more bits of salt from local thermal fluctuations than log2(number of possible actions), (4) drop the salt into a pseudo-random number generator over its derived mixed strategy, and (5) output whatever falls out as its action. This rough algorithm seems strongly deterministic in some ways, and yet also strongly reminiscent of “choice” in others.
This formulation reduces the “magic” of Omega to predicting the relatively fixed elements of the agent (i.e., steps 1, 2, and 4), which seems roughly plausible as a matter of psychology and input knowledge and so on, and also either (A) knowing from this that the strategy that will be derived isn’t actually mixed so the salt is irrelevant, or else (B) having access/control of the salt in step 3.
In AI design, steps 1 and 2 are under the programmer’s control to some degree. Some ways of writing the program might make the AI more or less tractable/benevolent/functional/wise and it seems like it would be good to know which ways are likely to produce better outcomes before any such AI is built and achieves takeoff rather than after. Hence the interest in this thought experiment as an extreme test case. The question is not whether step 3 is pragmatically possible for an imaginary Omega to hack in real life. The question is how to design steps 1 and 2 in toy scenarios where the program’s ability to decide how to pre-commit and self-edit are the central task, so that harder scenarios can be attacked as “similar to a simpler solved problem”.
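The five steps and Omega's two options can be sketched roughly as follows. This is a toy under loud assumptions: the rule inside `derive_mixed_strategy`, the observation keys, and the idea of passing the salt in as an integer are all invented for illustration, and the thermal-noise source of step 3 is not modelled at all.

```python
import random
from typing import Optional

def derive_mixed_strategy(observations: dict) -> dict:
    # Steps 1-2 (hypothetical rule): pick out the relevant inputs and
    # turn them into a probability for each action.
    if observations.get("coin_flip_allowed"):
        return {"one-box": 0.5, "two-box": 0.5}
    if observations.get("predictor_present"):
        return {"one-box": 1.0, "two-box": 0.0}
    return {"one-box": 0.0, "two-box": 1.0}

def act(observations: dict, salt: int) -> str:
    # Step 3 is assumed to have happened already: `salt` stands in for
    # the bits gathered from thermal fluctuations.
    strategy = derive_mixed_strategy(observations)
    rng = random.Random(salt)               # step 4: salt seeds the PRNG
    actions = list(strategy)
    weights = [strategy[a] for a in actions]
    return rng.choices(actions, weights=weights, k=1)[0]  # step 5

def omega_predicts(observations: dict, salt: Optional[int] = None) -> Optional[str]:
    strategy = derive_mixed_strategy(observations)
    pure = [a for a, p in strategy.items() if p == 1.0]
    if pure:
        return pure[0]                      # case (A): salt is irrelevant
    if salt is not None:
        return act(observations, salt)      # case (B): Omega knows the salt
    return None                             # neither case holds
```

In this sketch Omega only needs to model the fixed parts of the agent; the salt matters solely when the derived strategy is genuinely mixed.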
If you say “Your only choices are flipping a coin or saying a predetermined answer” you’re dodging the real question. You can be dragged back to the question by simply positing “Omega predicts the coin flip, what then?” If there’s time and room for lots and lots of words (rather than just seven words) then another way to bring attention back to the question is to explain about fighting the hypothetical, try to build rapport, see if you can learn to play along so that you can help advance a useful intellectual project.
If you still “don’t get it”, then please, at least don’t clog up the channel. If you do get it, please offer better criticism. Like, if you know of a different but better thought experiment where effectively-optimizing self-modifying pre-commitment is the central feature of study, that would be useful.
I don’t fight any hypothesis. If backwards causality is possible, one-boxing obviously wins.
But backwards causality cannot exist in reality, and therefore my decision cannot affect Omega’s prediction of that decision. I would be very surprised if the large majority of LW posters would disagree with that statement; most of them seem to just ignore this level of the problem.
A program could be designed to (1) search for relevant sensory data within a larger context, (2) derive a mixed strategy given the input data, (3) get more bits of salt from local thermal fluctuations than log2(number of possible actions), (4) drop the salt into a pseudo-random number generator over its derived mixed strategy, and (5) output whatever falls out as its action. This rough algorithm seems strongly deterministic in some ways, and yet also strongly reminiscent of “choice” in others.
This formulation reduces the “magic” of Omega to predicting the relatively fixed elements of the agent (i.e., steps 1, 2, and 4), which seems roughly plausible as a matter of psychology and input knowledge and so on, and also either (A) knowing from this that the strategy that will be derived isn’t actually mixed so the salt is irrelevant, or else (B) having access/control of the salt in step 3.
In this example, the correct solution would not be to “choose” to one-box, but to choose to adopt a strategy that causes you to one-box before Omega makes its prediction, and therefore before you know you’re playing Newcomb. This is not Newcomb anymore, this is a new problem. In this new problem, CDT will decide to adopt a strategy that causes it to one-box (it will precommit).
In this new problem, CDT will decide to adopt a strategy that causes it to one-box (it will precommit).
Similarly, if a CDT agent is facing no immediate decision problem but has the capability to self-modify, it will modify itself into an agent that implements a new decision theory (call it, for example, CDT++). The self-modified agent will then behave as if it implements a Reflective Decision Theory (UDT, TDT, etc.) for the purpose of all influence over the universe after the time of self-modification, but like CDT for the purpose of all influence before the time of self-modification. This means roughly that it will behave as if it had made all the correct ‘precommitments’ at that time. It’ll then cooperate against equivalent agents in prisoner’s dilemmas and one-box on future Newcomb’s problems unless Omega says “Oh, and I made the prediction and filled the boxes back before you self-modified away from CDT, I’m just showing them to you now”.
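A toy sketch of the behaviour described for "CDT++" on Newcomb's problem only. The class name, the use of plain numbers for times, and the treatment of a prediction made exactly at the moment of modification are assumptions for illustration, not a formal account of what CDT self-modifies into.

```python
from dataclasses import dataclass

@dataclass
class SelfModifiedCDT:
    """Hypothetical stand-in for the self-modified agent ("CDT++")."""
    modified_at: float  # time at which the agent rewrote itself

    def choose(self, prediction_made_at: float) -> str:
        # Predictions made after the rewrite can be influenced by the new
        # source code, so the agent acts as if it had precommitted.
        if prediction_made_at >= self.modified_at:
            return "one-box"
        # Predictions made earlier were based on the old CDT agent, so
        # the plain CDT answer is kept.
        return "two-box"
```

For an agent created with `SelfModifiedCDT(modified_at=10.0)`, a prediction made at time 12 leads to one-boxing, while a prediction made at time 5 (Omega's "I filled the boxes before you self-modified") leads to two-boxing.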
A CDT agent will do this, if it can be proven that it cannot make worse decisions after the modification than if it had not modified itself. I actually tried to find literature on this a while back, but couldn’t find any, so I assigned a very low probability to the possibility that this could be proven. Seeing how you seem to be familiar with the topic, do you know of any?
A CDT agent will do this, if it can be proven that it cannot make worse decisions after the modification than if it had not modified itself. I actually tried to find literature on this a while back, but couldn’t find any, so I assigned a very low probability to the possibility that this could be proven. Seeing how you seem to be familiar with the topic, do you know of any?
I am somewhat familiar with the topic, but note that I am most familiar with the work that has already moved past CDT (i.e., considers CDT irrational and inferior to a reflective decision theory along the lines of TDT or UDT). Thus far nobody has got around to formally writing up a “What CDT self-modifies to” paper that I’m aware of (I wish they would!). It would be interesting to see what someone coming from the assumption that CDT is sane could come up with. Again, I’m unfamiliar with such attempts, but in this case that is far less evidence about such things existing.
I wasn’t asking for a concrete alternative for CDT. If anything, I’m interested in a proof that such a decision theory can possibly exist. Because trying to find an alternative when you haven’t proven this seems like a task with a very low chance of success.
I wasn’t asking for a concrete alternative for CDT.
I wasn’t offering alternatives—I was looking specifically at what CDT will inevitably self modify into (which is itself not optimal—just what CDT will do). The mention of alternatives was to convey to you that what I say on the subject and what I refer to would require making inferential steps that you have indicated you aren’t likely to make.
Incidentally, proving that CDT will (given the option) modify into something else is a very different thing than proving that there is a better alternative to CDT. Either could be true without implying the other.
That is true, and if you cannot prove that such a decision theory exists, then CDT modifying itself is not the necessarily correct answer to meta-Newcomb, correct?
Consider programs that, given the description of a situation (possibly including a chain of events leading to it) and a list of possible actions, return one of the actions. It doesn’t seem to be a stretch of language to say that such programs are “choosing”, because the way those programs react to their situation can be very similar to the way humans react (consider: finding the shortest path between two points; playing a turn-based strategy game, etc.).
Whether programs that are hard-coded to always return a particular answer “choose” or not is a very boring question of semantics, like “does a tree falling in the forest make a sound if no-one is around to hear it”.
Given a description of Newcomb’s problem, a well-written program will one-box, and a badly-written one will two-box. The difference between the two is not trivial.
Given a description of Newcomb’s problem, a well-written program will one-box, and a badly-written one will two-box. The difference between the two is not trivial.
I see your point now, and I agree with the quoted statement. However, there’s a difference between Newcomb, where you make your decision after Omega made its prediction, and “meta-Newcomb”, where you’re allowed to precommit before Omega makes its prediction, for example by choosing your programming. In meta-Newcomb, I don’t even have to consider being a computer program that can be simulated; I can just give my good friend Epsilon, who always does exactly what he is told, a gun and tell him to shoot me if I lie, then tell Omega I’m going to one-box, and then Omega would make its prediction. I would one-box, get $1,000,000 and, more importantly, not get shot.
This is a decision that CDT would make, given the opportunity.
there’s a difference between Newcomb, where you make your decision after Omega made its prediction, and “meta-Newcomb”, where you’re allowed to precommit before Omega makes its prediction, for example by choosing your programming.
I agree that meta-Newcomb is not the same problem, and that in meta-Newcomb CDT would precommit to one-box.
However, even in normal Newcomb, it’s possible to have agents that behave as if they had precommitted when they realize precommitting would have been better for them. More specifically, in pseudocode:
function take_decision(information_about_world, actions):
    for each action:
        calculate the utility that an agent that always returns that action would have got
    return the action that got the highest utility
There are some subtleties, notably about how to take the information about the world into account, but an agent built along this model should one-box on problems like Newcomb’s, while two-boxing in cases where Omega decides by flipping a coin.
(Such an agent, however, doesn’t cooperate with itself in the prisoner’s dilemma; you need a better agent for that.)
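One way the pseudocode could be made runnable, as a sketch rather than a definitive implementation. The key assumption, which the pseudocode leaves open, is that "information about the world" is represented as a function that takes an agent and returns the utility that agent would have received; the two example worlds are hypothetical.

```python
ACTIONS = ["one-box", "two-box"]

def take_decision(world, actions):
    """Return the action whose 'always do this' agent fares best in `world`."""
    # For each action, build the constant agent that always returns it and
    # ask the world how well that agent would have done.
    return max(actions, key=lambda action: world(lambda: action))

def newcomb_world(agent) -> float:
    # Omega simulates the agent, fills Box A, then the agent chooses.
    box_a = 1_000_000 if agent() == "one-box" else 0
    return box_a if agent() == "one-box" else box_a + 1_000

def coin_world(agent) -> float:
    # Box A is filled by a fair coin, independently of the agent, so the
    # utility is the average over the two coin outcomes.
    total = 0.0
    for box_a in (1_000_000, 0):
        total += box_a if agent() == "one-box" else box_a + 1_000
    return total / 2
```

Under these assumptions `take_decision(newcomb_world, ACTIONS)` returns "one-box" and `take_decision(coin_world, ACTIONS)` returns "two-box", matching the behaviour claimed for this kind of agent.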
You are 100% correct. However, if you say “it’s possible to have agents that behave as if they had precommitted”, then you are not talking about what’s the best decision to make in this situation, but what’s the best decision theory to have in this situation, and that is, again, meta-Newcomb, because the decision about which decision theory you’re going to follow is a decision you have to make before Omega makes its prediction. Switching to this decision theory after Omega makes its prediction doesn’t work, obviously, so this is not a solution for Newcomb.
I can just give my good friend Epsilon, who always does exactly what he is told, a gun and tell him to shoot me if I lie, then tell Omega I’m going to one-box, and then Omega would make its prediction. I would one-box, get $1,000,000 and, more importantly, not get shot.
When I first read this I took it literally, as using Epsilon directly as a lie detector. That had some interesting potential side effects (like death) for a CDT agent. On second reading I take it to mean “Stay around with the gun until after everything is resolved and if I forswear myself kill me”. As a CDT agent you need to be sure that Epsilon will stay with the gun until you have abandoned the second box. If Epsilon just scans your thoughts, detects whether you are lying and then leaves then CDT will go ahead and take both boxes anyway. (It’s mind-boggling to think of agents that couldn’t even manage cooperation with themselves with $1m on the line and a truth oracle right there to help them!)
Yeah, I meant that Epsilon would shoot if you two-box after having said you would one-box. In the end, “Epsilon with a gun” is just a metaphor for / specific instance of precommitting, as is “computer program that can choose its programming”.
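The Epsilon arrangement can be sketched from the CDT agent's point of view at the moment it stands in front of the boxes, having already promised to one-box. The `SHOT` utility and the function name are made-up placeholders; the only point is that the penalty changes the causal consequences of two-boxing, and only while Epsilon is still there to enforce it.

```python
BOX_B = 1_000
SHOT = -10**9  # hypothetical utility of being shot by Epsilon

def cdt_after_promise(epsilon_still_watching: bool, box_a: int) -> str:
    """CDT's choice after telling Omega it will one-box."""
    one_box = box_a
    # Two-boxing breaks the promise; it costs SHOT only if Epsilon is
    # still around with the gun when the boxes are taken.
    two_box = box_a + BOX_B + (SHOT if epsilon_still_watching else 0)
    return "one-box" if one_box > two_box else "two-box"
```

Whatever Box A contains, the sketch one-boxes while Epsilon is watching and two-boxes once Epsilon has left, which is why the agent needs Epsilon to stay until the second box has been abandoned.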
Does that make sense?
I see no violation of causality in that version.
Have you also already thought about free will?