In order to better understand the differences between decision theories, I have been going through every Newcomblike problem I can find and keeping track of how each decision theory answers it. However, I'm coming up short on answers to the Psychopath Button:
Paul is debating whether to press the “kill all psychopaths” button. It would, he thinks, be much better to live in a world with no psychopaths. Unfortunately, Paul is quite confident that only a psychopath would press such a button. Paul very strongly prefers living in a world with psychopaths to dying. Should Paul press the button?
The FAQ I read only gave the answers from CDT and EDT: CDT says “yes” (because pressing the button isn’t causally linked to whether Paul is already a psychopath), while EDT says “no” (because pressing the button increases the probability that Paul is a psychopath).
So I wonder how the Logical Decision Theories (TDT, FDT, and UDT) would address the problem. Unlike Newcomb’s Problem, there is technically only one agent in play, and in the other one-agent problem (the Smoking Lesion Problem) the LDT answers all agree with CDT. But in this case, CDT doesn’t win.
From Cheating Death in Damascus (bold emphasis mine):
I’d give the psychopath button question a similar answer that I would to the smoking lesion question: what’s the mechanism that correlates being a psychopath with willingness to press the button?
If being a psychopath (or not being a psychopath) affects your answer because it affects your ability to reason, then based on your psychopath status, you may not even have the ability to reason correctly and choose an outcome. The problem is ill-defined, because it asks you to do something that you may be incapable, by stipulation, of doing.
If it affects your answer in another manner, then pressing the button because of the outcome of a reasoning process won’t be correlated with psychopathy even though pressing the button in general is. (Unless the button uses your decision as a criterion of psychopathy, in which case we get into halting problem considerations.)
Also, note that in everyday language, “only a psychopath would press the button” strongly implies that psychopathy affects your decision by affecting your values about killing people, which is the second scenario. But that reading makes the problem inconsistent: the problem statement implies that both psychopaths and non-psychopaths would consider pressing the button and would reject it only after working through the logic, whereas if a non-psychopath would always refuse because of his values, that implication is wrong.
(Edit: Edited this lots of times. Phrasing my objection correctly is actually quite hard.)
Ah… but it’s the meta-you (the reader), not the story-you (the arguable psychopath), who is tasked with saying whether the story-you should press the button. Maybe the story-you is incapable of reasoning. But given his values and the setup of the story, it should be either better for him to press, or not to press, regardless of whether he can choose or not (and it’s that answer we’re tasked with giving).
I’m not convinced that “it’s impossible for him to press the button, but it’s better for him to press the button” is a meaningful concept.
It’s tempting to think “pressing the button results in X and X is good/bad”, but that cuts off the chain of reasoning early. Continuing the chain of reasoning past that will lead you to further conclusions that result in not-X after all, and you just got a contradiction.
Nothing suggests it’s impossible for him to press the button, even if we grant that it’s possible he can’t reason. Maybe he can stumble into it.
If you need to consider the possibility of pressing the button involuntarily, that affects the meaning of the original problem statement. Does “only a psychopath will press the button” include involuntary presses? If yes, then it’s still impossible for a non-psychopath to press the button. If no, then whether it’s better to involuntarily press the button may have a different answer from whether it’s better to voluntarily press the button.
I’d interpret it that way.
The intended interpretation is that if the person presses the button, they’re a psychopath.
If I press the button, I have always been a psychopath, and I die along with all other psychopaths.
If I don’t press the button, I may or may not be a psychopath, and I live, along with all the psychopaths.
All the details you’re writing seem to me to go against the Occam’s razor’s interpretation of the problem.
In my view, FDT handles the problem as follows:
The main controversial piece is from the problem specification: “Paul is quite confident that only a psychopath would press such a button.” I think this mixes up P(button | psychopath) and P(psychopath | button), but since the problem specification is our only source of how the button determines who is or isn’t a psychopath, it seems fine to trust it on that point.

Another related problem is one where there’s a button that kills everyone who would, given the option, press it. You might expect that such people are bad neighbors and prefer a world without them without having any way to act on that belief (and if you come to believe that FDT presses that button, what it really means is that you shouldn’t be so confident that people who would press the button are bad neighbors!).
[In general, your decision theory should save you from claims in the problem specification of the form “and then you make a bad decision”, but it can’t be expected to save you from having incorrect empirical beliefs.]
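The distinction between P(button | psychopath) and P(psychopath | button) can be made concrete with Bayes’ rule. A minimal sketch, with purely hypothetical numbers (the base rate and conditional probabilities below are illustrative assumptions, not from the problem statement):

```python
# Hypothetical numbers illustrating how P(button | psychopath) and
# P(psychopath | button) can differ sharply when the base rate is low.
base_rate = 0.01              # assumed fraction of psychopaths in the population
p_press_given_psycho = 0.90   # assumed: psychopaths usually press
p_press_given_normal = 0.05   # assumed: non-psychopaths occasionally press too

# Bayes' rule: P(psychopath | press) = P(press | psychopath) * P(psychopath) / P(press)
p_press = (p_press_given_psycho * base_rate
           + p_press_given_normal * (1 - base_rate))
p_psycho_given_press = p_press_given_psycho * base_rate / p_press

print(f"P(press | psychopath) = {p_press_given_psycho:.2f}")
print(f"P(psychopath | press) = {p_psycho_given_press:.2f}")
```

Under these assumptions the two quantities come apart by a factor of several: even though psychopaths almost always press, most pressers are still non-psychopaths, because non-psychopaths vastly outnumber them.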
Psychopathy is strongly associated with poor impulse control and low self-reflection. If Paul is considering logical decision theories, rational choice, and their possible ramifications given his own mental makeup, then he is substantially less likely than baseline to be a psychopath; psychopaths generally make up on the order of 1% of the population.
Does he have some prior evidence that he is a psychopath? If not, then his prior should be on the order of 0.2% or so. Willingness to press the button would otherwise be his only evidence, which he is “quite confident” about. What numerical value should he put for “quite confident”? Let’s say 90% (much more than that should be described more like “very” or “extremely” confident). So that would bring a baseline prior up to the order of 2%.
Now he “very strongly” prefers living in a world with psychopaths to dying. Is that 5x in utility? 100x? 10,000x? Well, dying is a pretty bad thing but I’d use some stronger term than just “very strongly” for 10,000x so let’s go with something on the order of 100x.
Well, this is awkward. For the outcomes of pressing the button we’ve got a credence of 2% for being a psychopath at −100 utility, versus 98% for not being a psychopath at +1 utility. That nets out to about −1 utility, but the inputs are only order-of-magnitude estimates, so the true expected value could easily be much more positive or negative. It doesn’t really matter which decision theory he uses: Paul just doesn’t have enough information.
Yeah, to keep the problem statement clean I do think one should specify that Paul has no access to autobiographical memory or other self-knowledge for the duration of his time with the button. If he did, he could use that self-knowledge to determine whether he is a psychopath, supplementing the one piece of information he otherwise has (“would I choose to push the button?”) in predicting whether he is indeed a psychopath and thus would be killed by pushing it.