Thanks. A few points, mostly for clarification.
I’m not assuming that the relevant predictors in my scenarios are infallible. In the Blackmail scenario, for example, I’m assuming that the blackmailer is fairly good but not perfect at predicting your reaction. So it’s perfectly possible for an FDT agent to find themselves in that scenario. If they do, they will clearly do worse than a CDT agent.
You’re right that I shouldn’t have called FDT’s recommendation in the Twin case “insane”. I do think FDT’s recommendation is insane for the other cases I discuss, but the Twin case is tricky. It’s a Newcomb Problem. I’d still say that FDT gives the wrong advice here, and CDT gives the right advice. I’m a two-boxer.
Of course making agents care about others (and about their integrity etc.) changes the utility function and therefore the decision problem. That’s exactly the point. The idea is that in many realistic scenarios such agents will tend to do better for themselves than purely egoistical agents. So if I were to build an agent with the goal that they do well for themselves, I’d give them this kind of utility function, rather than implement FDT.
“What if someone decides to punish agents for using CDT?”—Sure, this can happen. It’s what happens in Newcomb’s Problem.
“Schwarz goes on to list a number of points of questions he has/unclarities he found in Yudkowsky and Soares’ paper, which I don’t find relevant”—Their relevance is that FDT isn’t actually a theory, unlike CDT and EDT. In its present form it is only an underdeveloped sketch, and I have doubts that it can be spelled out properly.
You say that CDT “fails” the original problems. You don’t give any argument for this. My intuition is that FDT gets all the problems I discuss wrong and CDT gets them right. For what it’s worth, I’d bet that most people’s intuitions about cases like Blackmail, Procreation, and Newcomb’s Problem with Transparent Boxes are on my side. Of course intuitions can be wrong. But as a general rule, you need better arguments in support of a counter-intuitive hypothesis than in support of an intuitive hypothesis. I’m not aware of any good arguments in support of the FDT verdict.
Thanks for your reply. And I apologize: I should have checked whether you have an account on LessWrong and tagged you in the post.
Alright, then it depends on the accuracy of Stormy’s prediction. Call this p, where 0 ≤ p ≤ 1. Let’s assume paying upon being blackmailed gives −1 utility, not paying upon being blackmailed gives −9 utility, and not being blackmailed at all gives 0 utility. Then, if Donald’s decision theory says to blow the gaff, Stormy predicts this with probability p and thus blackmails Donald with probability 1 − p. This gives Donald an expected utility of p × 0 + (1 − p) × (−9) = 9p − 9 utils for blowing the gaff. If instead Donald’s decision theory says to pay, then Stormy blackmails with probability p. This gives Donald an expected utility of p × (−1) + (1 − p) × 0 = −p utils for paying. Setting 9p − 9 = −p gives 10p = 9, or p = 0.9. This means FDT would recommend blowing the gaff for p > 0.9. For p < 0.9 FDT recommends paying.
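To make the break-even point concrete, here is a minimal sketch in Python of the same calculation (the function names are my own; the utilities are the ones assumed above):

```python
# A minimal sketch of the break-even calculation above, using the assumed
# utilities: paying = -1, blowing the gaff under blackmail = -9, no blackmail = 0.

def eu_blow_gaff(p):
    """Expected utility if Donald's theory says to blow the gaff:
    Stormy predicts this with probability p and then refrains from blackmailing."""
    return p * 0 + (1 - p) * -9   # = 9p - 9

def eu_pay(p):
    """Expected utility if Donald's theory says to pay:
    Stormy predicts this with probability p and then blackmails."""
    return p * -1 + (1 - p) * 0   # = -p

# Break-even point: 9p - 9 = -p  =>  p = 0.9
for p in (0.85, 0.9, 0.95):
    better = "blow the gaff" if eu_blow_gaff(p) > eu_pay(p) else "pay"
    print(f"p = {p}: blow = {eu_blow_gaff(p):.2f}, pay = {eu_pay(p):.2f} -> {better}")
```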
Confessing ignores the logical connection between the clones, and two-boxing ignores the logical connection between the player and the demon. It’s worth noting that (given perfect prediction accuracy for the demon) two-boxers always walk away with only $1000. Given imperfect prediction, we can do an expected value calculation again, but you get my point, which applies similarly to the Twin case.
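For the Newcomb side, a parallel sketch under the usual payoffs of $1,000,000 in the opaque box and $1,000 in the transparent one (the function names and exact figures are my assumptions, not taken from the comment above):

```python
# A parallel sketch for Newcomb's Problem with an imperfect predictor (accuracy p),
# assuming the usual payoffs: $1,000,000 in the opaque box, $1,000 in the transparent one.

def eu_one_box(p):
    """The predictor foresees one-boxing with probability p and fills the opaque box."""
    return p * 1_000_000 + (1 - p) * 0

def eu_two_box(p):
    """The predictor foresees two-boxing with probability p and leaves the opaque box empty."""
    return p * 1_000 + (1 - p) * 1_001_000

for p in (0.6, 0.9, 1.0):
    print(f"p = {p}: one-box = {eu_one_box(p):,.0f}, two-box = {eu_two_box(p):,.0f}")
# With p = 1.0 (a perfect demon), two-boxers always walk away with only $1,000.
```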
I know that’s your point; I said it’s your point. My point is that changing the utility function of a problem ignores the original problem, which your theory still doesn’t solve. If I build an algorithm for playing games that doesn’t play chess well, the right thing to do is to improve the algorithm so it does play chess well, not to redefine what a winning position in chess is.
Your agent may do better in some of these modified scenarios, but FDT does well in both the modified and the original scenarios.
My point here was that you can directly punish agents for having any decision theory, so this is no relative disadvantage of FDT. By the way, I disagree that Newcomb’s Problem punishes CDT agents: it punishes two-boxers. Two-boxing is CDT’s own choice, and therefore its own problem. Not so for your original example of an environment giving FDT’ers worse options than CDT’ers: FDT’ers simply don’t get the better options there, whereas CDT’ers in Newcomb’s Problem do.
Note that I said “relevant for the purpose of this post”. I didn’t say they aren’t relevant in general. The point of this post was to react to points I found to be clearly wrong/unfair.
I agree I could have made a clearer argument here, even though I gave some argumentation throughout my post. I maintain that CDT fails the examples because, in all three problems, adhering to CDT leaves me worse off than adhering to FDT would. CDT’ers do get blackmailed by Stormy; FDT’ers don’t. CDT’ers don’t end up in Newcomb’s Problem with Transparent Boxes as you described it: they end up with only the $1000 available. FDT’ers do end up in that scenario and get a million.
As for Procreation, note that my point was about the problems whose utility function you wanted to change, and Procreation wasn’t one of them. CDT does better on Procreation, as I said; I further explained why Procreation* is a better problem for comparing CDT and FDT.
The fundamental problem with your arguments is that the scenarios in which you imagine FDT agents “losing” are logically impossible. You’re missing the broader perspective: the FDT agents’ policy of not negotiating with terrorists prevents them from being blackmailed in the first place.