We were explicitly told to assume our knowledge is certain. Some of the circumstances assumed long chains of improbable-seeming events between sacrificing someone and saving five others.
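To make that concrete, here is a minimal sketch of how chained uncertainty eats into a stipulated trade-off; the per-link probabilities below are invented for illustration, not taken from the post or the survey:

```python
# Hypothetical illustration: the sacrifice saves five people only if every link
# in a causal chain holds. With four links at 80% each, the "certain" rescue
# is closer to a coin flip than to a sure thing.
link_probabilities = [0.8, 0.8, 0.8, 0.8]   # made-up numbers

p_chain_holds = 1.0
for p in link_probabilities:
    p_chain_holds *= p                       # conjunction of independent links

expected_lives_saved = 5 * p_chain_holds     # expected benefit of the sacrifice
certain_cost = 1                             # the one person killed

print(f"P(chain holds) = {p_chain_holds:.2f}")                            # ~0.41
print(f"Expected net lives = {expected_lives_saved - certain_cost:.2f}")  # ~1.05
```

Even with generous per-link odds, the stipulated "five saved for one" shrinks quickly, which is part of why being told to treat the whole chain as certain feels so strained.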
Eliezer commented on this back in “Ends Don’t Justify Means (Among Humans)”, attempting to reconcile consequentialism with the possibility (observed in human politics) that humans may be running on hardware incapable of consequentialism accurate enough for extreme cases:
And now the philosopher comes and presents their “thought experiment”—setting up a scenario in which, by stipulation, the only possible way to save five innocent lives is to murder one innocent person, and this murder is certain to save the five lives. “There’s a train heading to run over five innocent people, who you can’t possibly warn to jump out of the way, but you can push one innocent person into the path of the train, which will stop the train. These are your only options; what do you do?”
An altruistic human, who has accepted certain deontological prohibitions—which seem well justified by some historical statistics on the results of reasoning in certain ways on untrustworthy hardware—may experience some mental distress, on encountering this thought experiment.
So here’s a reply to that philosopher’s scenario, which I have yet to hear any philosopher’s victim give:
“You stipulate that the only possible way to save five innocent lives is to murder one innocent person, and this murder will definitely save the five lives, and that these facts are known to me with effective certainty. But since I am running on corrupted hardware, I can’t occupy the epistemic state you want me to imagine. Therefore I reply that, in a society of Artificial Intelligences worthy of personhood and lacking any inbuilt tendency to be corrupted by power, it would be right for the AI to murder the one innocent person to save five, and moreover all its peers would agree. However, I refuse to extend this reply to myself, because the epistemic state you ask me to imagine, can only exist among other kinds of people than human beings.”
Now, to me this seems like a dodge. I think the universe is sufficiently unkind that we can justly be forced to consider situations of this sort. The sort of person who goes around proposing that sort of thought experiment, might well deserve that sort of answer. But any human legal system does embody some answer to the question “How many innocent people can we put in jail to get the guilty ones?”, even if the number isn’t written down.
As a human, I try to abide by the deontological prohibitions that humans have made to live in peace with one another. But I don’t think that our deontological prohibitions are literally inherently nonconsequentially terminally right. I endorse “the end doesn’t justify the means” as a principle to guide humans running on corrupted hardware, but I wouldn’t endorse it as a principle for a society of AIs that make well-calibrated estimates. (If you have one AI in a society of humans, that does bring in other considerations, like whether the humans learn from your example.)
I have to admit, though, this does seem uncomfortably like the old aphorism quod licet Jovi, non licet bovi — “what is permitted to Jupiter is not permitted to a cow.”
But since I am running on corrupted hardware, I can’t occupy the epistemic state you want me to imagine.
It occurs to me that many (maybe even most) hypotheticals require you to accept an unreasonable epistemic state. Even something as simple as trusting that Omega is telling the truth asks for one [as does trusting that his “fair coin” was a quantum random number generator rather than, say, a metal disc he flipped with a deterministic amount of force, though that is easier to grant as simple sloppy wording].
In general, thought experiments that depend on an achievable epistemic state can actually be performed and don’t need to remain thought experiments.
They can depend on an achievable epistemic state, but be horribly impractical or immoral to set up (hello trolley problems).