I’m not the one addressed here, but the term “rational” can be replaced with “useful”. The argument is that there is a difference between the questions “what is the most useful thing to do in this given situation?” and “what is the most generally useful decision algorithm, and what would it recommend in this situation?” Usefulness is a causal concept, but it is applied to different things in the two questions (actions versus decision algorithms that cause actions). CDT answers the first question; MIRI-style decision theories answer something similar to the second.
What people like me claim is that the answer to the first question may differ from the answer to the second. Newcomb’s problem is an example: there, it can be useful to have a decision algorithm that, in certain situations, picks non-useful actions. This happens when an entity can read your decision algorithm, predict what action you would pick, and set its rewards based on that prediction.
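To make the divergence concrete, here is a minimal sketch of Newcomb’s problem in Python. The payoff amounts, the perfectly accurate predictor, and all function names are my assumptions for illustration; they are not part of the argument above. It evaluates actions with the box contents held fixed (the first question) and then evaluates whole decision algorithms that the predictor can read (the second question).

```python
# Assumed standard Newcomb payoffs; the amounts are illustrative.
BIG = 1_000_000   # placed in the opaque box only if one-boxing was predicted
SMALL = 1_000     # always in the transparent box

ACTIONS = ["one-box", "two-box"]

def payoff(action, prediction):
    """Money received, given the action taken and what the predictor predicted."""
    opaque = BIG if prediction == "one-box" else 0
    return opaque + (SMALL if action == "two-box" else 0)

def best_action_given_boxes(prediction):
    """Question 1: the boxes are already filled; which action is most useful now?"""
    return max(ACTIONS, key=lambda a: payoff(a, prediction))

def best_algorithm():
    """Question 2: which decision algorithm is most useful to have?

    Assumption: the predictor reads the algorithm perfectly, so the
    prediction always equals the action the algorithm would output.
    Each "algorithm" here is just a constant policy.
    """
    return max(ACTIONS, key=lambda a: payoff(a, prediction=a))

if __name__ == "__main__":
    for prediction in ACTIONS:
        print(prediction, "predicted, best action:", best_action_given_boxes(prediction))
    print("best algorithm:", best_algorithm())
```

With the contents fixed, two-boxing comes out ahead whatever was predicted, yet the algorithm that always one-boxes ends up with more money, because the prediction depends on the algorithm rather than on the action taken afterwards.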
In fact, Thomas Schelling and Derek Parfit already discussed the possibility that the answers to these two questions diverge. See Parfit in Reasons and Persons:
Consider Schelling’s Answer to Armed Robbery. A man breaks into my house.
He hears me calling the police. But, since the nearest town is far away, the
police cannot arrive in less than fifteen minutes. The man orders me to open the safe in which I hoard my gold. He threatens that, unless he
gets the gold in the next five minutes, he will start shooting my
children, one by one.
What is it rational for me to do? I need the answer fast. I realize that
it would not be rational to give this man the gold. The man knows that,
if he simply takes the gold, either I or my children could tell the police
the make and number of the car in which he drives away. So there is a
great risk that, if he gets the gold, he will kill me and my children
before he drives away.
Since it would be irrational to give this man the gold, should I ignore
his threat? This would also be irrational. There is a great risk that he
will kill one of my children, to make me believe his threat that, unless
he gets the gold, he will kill my other children.
What should I do? It is very likely that, whether or not I give this
man the gold, he will kill us all. I am in a desperate position.
Fortunately, I remember reading Schelling’s The Strategy of Conflict.
I also have a special drug, conveniently at hand. This drug causes one to be, for a brief period, very irrational. I reach for the bottle and drink
a mouthful before the man can stop me. Within a few seconds, it
becomes apparent that I am crazy. Reeling about the room, I say to the
man: ‘Go ahead. I love my children. So please kill them.’ The man tries
to get the gold by torturing me. I cry out: ‘This is agony. So please go
on.’
Given the state that I am in, the man is now powerless. He can do
nothing that will induce me to open the safe. Threats and torture
cannot force concessions from someone who is so irrational. The man
can only flee, hoping to escape the police. And, since I am in this state,
the man is less likely to believe that I would record the number on his
car. He therefore has less reason to kill me.
While I am in this state, I shall act in ways that are very irrational.
There is a risk that, before the police arrive, I may harm myself or my
children. But, since I have no gun, this risk is small. And making myself
irrational is the best way to reduce the great risk that this man will kill
us all.
On any plausible theory about rationality, it would be rational for me, in
this case, to cause myself to become for a period very irrational.