Psy-Kosh: Given the current sequence, perhaps it’s time to revisit the whole Torture vs Dust Specks thing?
I can think of two positions on torture to which I am sympathetic:
1) No legal system or society should ever refrain from punishing those who torture—anything important enough that torture would even be on the table, like a nuclear bomb in New York, is important enough that everyone involved should be willing to go to prison for the crime of torture.
2) The chance of actually encountering a “nuke in New York” situation that can be effectively resolved by torture is so low, and the knock-on effects of having the policy in place so awful, that a blanket injunction against torture makes sense.
In case 1, you would choose TORTURE over SPECKS, and then go to jail for it, even though it was the right thing to do.
In case 2, you would simultaneously say “TORTURE over SPECKS is the right alternative of the two, but a human can never be in an epistemic state where they have justified belief that this is the case”, which would tie in well to the Hansonian argument that you have an O(3^^^3) probability penalty from the unlikelihood of finding yourself in such a unique position.
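To sketch roughly how that penalty would cash out in the multiplication (this is my own illustrative gloss, writing 3↑↑↑3 for 3^^^3 and using u_speck and u_torture as stand-in disutilities, not anything Hanson wrote down):

\[
\mathbb{E}[\Delta U \mid \text{TORTURE}] \;\approx\; \underbrace{P(\text{the situation is as it appears})}_{\sim\, 1/(3\uparrow\uparrow\uparrow 3)} \times \underbrace{(3\uparrow\uparrow\uparrow 3)\, u_{\text{speck}}}_{\text{specks averted}} \;-\; u_{\text{torture}} \;\approx\; u_{\text{speck}} - u_{\text{torture}},
\]

so the astronomical factor cancels out of the expected-utility comparison, and TORTURE no longer obviously wins.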
So I am sympathetic to the argument that people should never torture, but I certainly can’t back the position that SPECKS over TORTURE is inherently the right thing to do—this seems to me to mix up an epistemic precaution with morality. There are certainly worse things than torturing one person—torturing two people, for example. But if you adopt position 2, then you would refuse to torture one person with your own hands even to save a thousand people from torture, while simultaneously not saying that it is better for a thousand people than one person to be tortured.
The moral questions are over the territory (or, hopefully equivalently, over epistemic states of absolute certainty). The ethical questions are over epistemic states that humans are likely to be in.
The problem here of course is how selective to be about rules to let into this protected level of “rules almost no one should think themselves clever enough to know when to violate.” After all, your social training may well want you to include “Never question our noble leader” in that set. Many a Christian has been told the mysteries of God are so subtle that they shouldn’t think themselves clever enough to know when they’ve found evidence that God isn’t following a grand plan to make this the best of all possible worlds.
I think it deserves to be noted that while some of the flaws in Christian theology are in what they think their supposed facts would imply (e.g., that because God did miracles you can know that God is good), other problems come more from the falsity of the premises than the falsity of the deductions. Which is to say, if God did exist and were good, then you would be justified in being cautious around parts of God’s plan that didn’t seem to make sense at the moment. But this would be best backed up with a long history of people saying, “Look how stupid God’s plan is, we need to do X” and then X blowing up on them. Rather than, as is actually the case, people saying “God’s plan is X” and then X blowing up on them.
Or if you’d found with some historical regularity that, when you challenged God’s subtle plans, you seemed to be right 90% of the time, but the other 10% of the time you got black-swan blowups that caused a hundred times as much damage, that would also be cause to suspect subtlety.
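To make that arithmetic explicit (with stakes I’m inventing purely for illustration, normalizing an ordinary win from challenging the plan to +1):

\[
0.9 \times (+1) \;+\; 0.1 \times (-100) \;=\; 0.9 - 10 \;=\; -9.1 \;<\; 0,
\]

so even a 90% hit rate leaves you with negative expected value from challenging the subtle plan.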
Nominull: So… do you not actually believe in your injunction to “shut up and multiply”? Because for some time now you seem to have been arguing that we should do what feels right rather than trying to figure out what is right.
Certainly I’m not saying “just do what feels right”. There’s no safe defense, not even ethics. There’s also no safe defense, not even shut up and multiply.
I probably should have been clearer about this before, but I was trying to discuss things in an order, and didn’t want to wade into ethics without specialized posts:
People often object to the sort of scenarios that illustrate “shut up and multiply” by saying, “But if the experimenter tells you X, what if they might be lying?” Well, in a lot of real-world cases, yes: there are various probability updates you perform based on other people being willing to make bets against you, and just because you get certain experimental instructions doesn’t imply the real world is that way.
But the base case—the center—has to be the moral comparisons between worlds, or even comparisons of expected utility between given probability distributions. If you can’t ask about this, then what good will ethics do you?
So let’s be very clear that I don’t think that one small act of self-deception is an inherently morally worse event than, say, getting your left foot chopped off with a chainsaw. I’m asking, rather, how one should best avoid the chainsaw, and arguing that in reasonable states of knowledge a human can attain, the answer is, “Don’t deceive yourself, it’s a black-swan bet at best.”
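For concreteness, the base-case comparison I mean is just standard expected utility over worlds, nothing exotic:

\[
\mathbb{E}_{p}[U] \;=\; \sum_{w} p(w)\, U(w),
\]

and the question is whether \(\mathbb{E}_{p_1}[U]\) or \(\mathbb{E}_{p_2}[U]\) is larger for the two probability distributions on offer.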
Vassar: For such a reason, I would be very wary of using such rules in an AGI, but of course, perhaps the actual mathematical formulation of the rule in question within the AGI would be less problematic, though a few seconds of thought doesn’t give me much reason to think this.
Are we talking about self-deception still? Because I would give odds around as extreme as the odds I would give of anything, that, conditioning on any AI I build trying to deceive itself, some kind of really epic error has occurred. Controlled shutdown, immediately.
Vassar: In a very general sense though, I see a logical problem with this whole line of thought. How can any of these injunctions survive except as self-protecting beliefs? Isn’t this whole approach just the sort of “fighting bias with bias” that you and Robin usually argue against?
Maybe I’m not being clear about how this would work in an AI! The ethical injunction isn’t self-protecting, it’s justified within the structural framework of the system as a whole. You might even find ethical injunctions starting to emerge without programmer intervention, in some cases, depending on how well the AI understood its own situation. But the kind of injunctions I have in mind wouldn’t be reflective—they wouldn’t modify the utility function or kick in at the reflective level to ensure their own propagation. That sounds really scary, to me—there ought to be an injunction against it! You might have a rule that would controlledly shut down the (non-mature) AI if it tried to execute a certain kind of source code change, but that wouldn’t be the same as having an injunction that exerts direct control over the source code.
To the extent the injunction sticks around in the AI, it should be as the result of ordinary reasoning, not reasoning taking the injunction into account! My ethical injunctions do not come with an extra clause that says, “Do not reconsider this injunction, including not reconsidering this clause.” That would be going way too far. It would violate the injunction against self-protecting closed belief systems.
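For the sake of concreteness, here is a minimal sketch of the shape I have in mind, in Python. The names (PROTECTED_TARGETS, review_self_modification, ControlledShutdown) and the change format are purely hypothetical illustrations of the tripwire idea, not an actual design:

```python
# Minimal sketch of a non-reflective injunction: a tripwire that can only
# allow a proposed change or trigger controlled shutdown for human review.
# It never edits the utility function, the change itself, or its own rule.
# Module names and the change format here are hypothetical illustrations.

PROTECTED_TARGETS = {"utility_function"}  # hypothetical module name


class ControlledShutdown(Exception):
    """Raised to halt the (non-mature) AI so the programmers can inspect it."""


def review_self_modification(proposed_change: dict) -> dict:
    """Inspect a proposed source-code change before it executes."""
    if proposed_change["target"] in PROTECTED_TARGETS:
        # The injunction's only power: stop everything, don't rewrite anything.
        raise ControlledShutdown(
            f"Attempted modification of {proposed_change['target']!r}; halting."
        )
    return proposed_change  # ordinary changes pass through untouched


# Usage sketch:
#   review_self_modification({"target": "planner", "diff": "..."})           -> allowed
#   review_self_modification({"target": "utility_function", "diff": "..."})  -> shutdown
```

The only point of the sketch is that the injunction’s output is binary, allow or halt; it has no channel by which to rewrite the utility function or entrench itself.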
Toby Ord: As written, both these statements are conceptually confused. I understand that you didn’t actually mean either of them literally, but I would advise against trading on such deep-sounding conceptual confusions.
I can’t weaken them and make them come out as the right advice to give people. Even after “Shut up and do the impossible”, there was that commenter who posted on their failed attempt at the AI-Box Experiment by saying that they thought they gave it a good try—which shows how hard it is to convey the sentiment of “Shut up and do the impossible!” Readers can work out on their own how to distinguish the map and the territory here, but if you say “Shut up and do what seems impossible!” that, to me, sounds like dispelling part of the essential message—that what seems impossible doesn’t look like “seems impossible”; it just looks impossible.
Likewise with “things you shouldn’t do even if they’re the right thing to do”; only this conveys the danger and tension of ethics, the genuine opportunities you might be passing up. “Don’t do it even if it seems right” sounds merely clever by comparison, like you’re going to reliably divine the difference between what seems right and what is right, and happily ride off into the sunset.
This seems closely related to inside view versus outside view. The think-lobe of the brain comes up with a cunning plan. The plan breaks an ethical rule, but calculation shows it is for the greater good. The executive-lobe of the brain then ponders the outside view. Everyone who has executed an evil cunning plan has run a calculation of the greater good and had their plan endorsed. So the calculation lacks outside-view credibility.
nod
(But with the proviso that some people who execute evil cunning plans may just be evil, that history may be written by the victors to emphasize the transgressions of the losers while overlooking the moral compromises of those who achieved “good” results, etc.)
What’s to prohibit the meta-reasoning from taking place before the shutdown triggers? It would seem that either you can hard-code an ethical inhibition or you can’t. Along those lines, is it fair to presume that the inhibitions are always negative, so that non-action is the safe alternative? Why not just revert to a known state?
If a self-modifying AI with the right structure will write ethical injunctions at all, it will also inspect the code to guarantee that no race condition exists with any deliberative-level supervisory systems that might have gone wrong in the condition where the code executes. Otherwise you might as well not have the code.
Inaction isn’t safe, but it’s safer than running an AI whose moral system has gone awry.
Finney: Which is better: conscious self-deception (assuming that’s even meaningful), or unconscious?
Once you deliberately choose self-deception, you may have to protect it by adopting other Dark Side Epistemology. I would, of course, say “neither” (as otherwise I would be swapping to the Dark Side) but if you ask me which is worse—well, hell, even I’m still undoubtedly unconsciously self-deceiving, but that’s not the same as going over to the Dark Side by allowing it!