[SEQ RERUN] A Case Study of Motivated Continuation
Today’s post, A Case Study of Motivated Continuation, was originally published on 31 October 2007. A summary (taken from the LW wiki):
When you find yourself considering a problem in which all visible options are uncomfortable, making a choice is difficult. Grit your teeth and choose anyway.
Discuss the post here (rather than in the comments to the original post).
This post is part of the Rerunning the Sequences series, where we’ll be going through Eliezer Yudkowsky’s old posts in order so that people who are interested can (re-)read and discuss them. The previous post was Torture vs. Dust Specks, and you can use the sequence_reruns tag or rss feed to follow the rest of the series.
Sequence reruns are a community-driven effort. You can participate by re-reading the sequence post, discussing it here, posting the next day’s sequence reruns post, or summarizing forthcoming articles on the wiki. Go here for more details, or to have meta discussions about the Rerunning the Sequences series.
On-the-job training for professionals usually entails some ethics training. What the company calls “ethics” is not ethics in the sense that LessWrong thinks about it. Still, corporate ethics push the same buttons in hominid brains as “real” ethics.
One key feature of “corporate” ethics always concerns the APPEARANCE of an act. Typically, several examples of borderline nepotism and embezzlement are given. The trainee is meant to learn to disapprove of the cases that “look bad” and let slide the cases that look acceptable. In other words, there is a direct appeal to the built-in human bullshit detector.
I think the torture-versus-specks dilemma invokes the same wiring. Our snap intuition doesn’t judge okay-ness on a scale; it makes a binary classification of “okay” versus “bad.” Torturing somebody looks bad: its badness rating is “bad.” A vast number of dust specks doesn’t look bad. It looks “okay.”
I would hope that, if an FAI were making the decision to torture someone for fifty years in exchange for saving us all from dust specks, the FAI would think for a moment about how it looks.
How an act looks is likely to affect how others respond to the actor. If you need people’s cheerful cooperation, it’s probably not a good idea to do things that make them think you are a horrible person.
I’d be surprised if humans can judge okay-ness on a continuous spectrum, but I think we do have a scale with more points than just “okay” and “bad.” “Dubious,” at least, seems to be an existing point in human judgment.
I think it would, but I suspect the impulse would be overridden. An AI can prevent people from finding out what it did, or simply make its plans on the assumption that people wouldn’t like them. That’s not too big an impediment to a superintelligence.
It certainly happened to me. And it took considerable inner (and outer) digging to end up confirming the no-torture choice, as opposed to switching to the “logical” one. In the end, I could not adequately resolve for myself the question “How confident am I that EY’s logic of torture is airtight? Confident enough to inflict torture?” So I flinched again and decided that, for this particular problem, I am not willing to stake my decision on something to which I could not assign a satisfactorily high probability.
Interestingly, I found the other classic dilemma, personally killing one person to save several, perfectly clear-cut, since there are no uncertainties there, only uncomfortable but necessary actions.
I wonder if this is a standard way of thinking.
Assuming this is a reference to the trolley problem—is the least convenient possible world the one where you yourself are large enough to stop the trolley, or the one where you are not? Because there definitely is uncertainty—you could sacrifice yourself instead.
Also, if someone else tried to toss you in front of a trolley to save 10 other people, would you resist?
I don’t know about shminux, but I sure as heck would. This is definitely a case of -wanting/-liking/+approving.
I would resist, too. I assign a very large positive utility to myself, don’t know about you guys. My immediate family would be the only ones in the ballpark.
What’s the LW public’s opinion on having a null-action marker to avoid this: pretending you have the option “do not allow time to pass,” which then maps to, for example, “fall over and sprout error messages and links to this comment,” in order to avoid outputting “stand around and lie to yourself for a long time”?
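Read as a decision-procedure sketch, the suggestion amounts to making “keep deliberating” an explicit option that fails loudly instead of letting the chooser quietly stall. Here is a minimal sketch of that reading in Python, assuming a simple scored-options chooser; the names NULL_ACTION, DeliberationStalled, and choose are hypothetical illustrations, not from the post or the comment.

    # Hypothetical sketch of the "null action marker" idea: make
    # "do not allow time to pass" an explicit option that fails loudly
    # rather than letting deliberation stall silently.
    NULL_ACTION = object()  # sentinel standing in for "do not allow time to pass"

    class DeliberationStalled(Exception):
        """Raised when the chooser picks the null action instead of a real option."""

    def choose(options, score):
        """Return the highest-scoring option; fail loudly on NULL_ACTION.

        options -- iterable of candidate actions, including NULL_ACTION
        score   -- function mapping an action to a numeric utility estimate
        """
        best = max(options, key=score)
        if best is NULL_ACTION:
            # "Fall over and sprout error messages" instead of
            # "stand around and lie to yourself for a long time".
            raise DeliberationStalled("no real option was judged acceptable")
        return best

    # Usage: every real option looks uncomfortable, so the scorer prefers stalling.
    if __name__ == "__main__":
        options = ["torture", "dust specks", NULL_ACTION]
        score = lambda a: 0.0 if a is NULL_ACTION else -1.0
        try:
            choose(options, score)
        except DeliberationStalled as err:
            print("Deliberation stalled:", err)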