Roko may have been thinking of [I just called him; he was thinking of it] a conversation we had when he and I were roommates in Oxford while I was visiting the Future of Humanity Institute, and we frequently discussed philosophical problems and thought experiments. Here’s the (redeeming?) context:
As those who know me can attest, I often make the point that radical self-sacrificing utilitarianism isn’t found in humans and isn’t a good target to aim for. Almost no one would actually take on serious harm with certainty for a small chance of helping distant others. Robin Hanson often presents evidence for this, e.g. this presentation on “why doesn’t anyone create investment funds for future people?” However, sometimes people caught up in thoughts of the good they can do, or in a self-image of making a big difference in the world, convince themselves that they really are motivated primarily by helping others as such. Sometimes they go on to an excessive “smart sincere syndrome,” and try (at the conscious/explicit level) to favor altruism at the severe expense of their other motivations: self-concern, relationships, warm fuzzy feelings.
Usually this doesn’t work out well, as the explicit reasoning about principles and ideals is gradually overridden by other mental processes, leading to exhaustion, burnout, or disillusionment. The situation winds up worse according to all of the person’s motivations, even altruism. Burnout means less good gets done than would have been achieved by leading a more balanced life that paid due respect to all one’s values. Even more self-defeatingly, if one actually does make severe sacrifices, it will tend to repel bystanders.
Instead, I typically advocate careful introspection and the use of something like Nick Bostrom’s parliamentary model:
The idea here is that moral theories get more influence the more probable they are; yet even a relatively weak theory can still get its way on some issues that the theory thinks are extremely important, by sacrificing its influence on other issues that other theories deem more important. For example, suppose you assign 10% probability to total utilitarianism and 90% to moral egoism (just to illustrate the principle). Then the Parliament would mostly take actions that maximize egoistic satisfaction; however, it would make some concessions to utilitarianism on issues that utilitarianism thinks are especially important. In this example, the person might donate some portion of their income to existential risks research and otherwise live completely selfishly.
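Since the quoted passage describes a decision procedure with a worked 90/10 example, here is a minimal, hypothetical sketch of the flavor of calculation involved. It is not Bostrom’s actual model (which has delegates negotiating and trading votes); it simply lets each theory’s pull on an issue scale with both its credence and how much it cares about that issue, which is enough to reproduce the example above. All names and numbers below are invented for illustration.

```python
# Toy sketch of a parliamentary-model-style choice (illustrative numbers only).
# Each theory's say on an issue scales with credence * how much it cares,
# a crude stand-in for vote trading among delegates in the real model.

CREDENCES = {"egoism": 0.90, "total_utilitarianism": 0.10}

# For each issue: the two options, and each theory's stakes for
# (first option, second option). Positive numbers favor that option.
ISSUES = {
    "career": {
        "options": ("lucrative, comfortable job", "monastic devotion to charity"),
        "stakes": {"egoism": (10, -50), "total_utilitarianism": (2, 5)},
    },
    "donate a slice of income to existential-risk research": {
        "options": ("donate", "keep it"),
        "stakes": {"egoism": (-1, 1), "total_utilitarianism": (100, -100)},
    },
}

def parliament_choice(issue: str) -> str:
    """Pick the option with the highest credence-weighted sum of stakes."""
    options = ISSUES[issue]["options"]
    scores = [0.0, 0.0]
    for theory, credence in CREDENCES.items():
        for i, stake in enumerate(ISSUES[issue]["stakes"][theory]):
            scores[i] += credence * stake
    return options[scores.index(max(scores))]

for issue in ISSUES:
    print(issue, "->", parliament_choice(issue))
# The egoistic option wins the career issue, but the parliament concedes the
# donation issue, where utilitarianism's stakes are far larger than egoism's.
```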
In the conversation with Roko, we were discussing philosophical thought experiments (trolley-problem style, which may indeed be foolish) to get at ‘real’ preferences and values for such an exercise. To do that, one often does best to adopt the device of the True Prisoner’s Dilemma and select positive and negative payoffs that actually have emotional valence (as opposed to abstract tokens). For positive payoffs, we used indefinite lifespans of steady “peak experiences” involving discovery, health, status, and elite mates. For negative payoffs we used probabilities of personal risk of death (which comes along with almost any effort, e.g. driving to places) and harms that involved pain and/or a decline in status (since these are separate drives). Since we were friends and roommates without excessive squeamishness, hanging out at home, we used less euphemistic language.
Neither of us was keen on huge sacrifices in Pascal’s-Mugging-like situations, viewing altruism as only one part of our respective motivational coalitions, or one term in bounded utility functions. I criticized his past “cheap talk” of world-saving as a primary motivation, given that in less convenient possible worlds, it was more easily overcome than his phrasing signaled. I said he should scale back his claims of altruism to match the reality, in the way that I explicitly note my bounded do-gooding impulses.
We also differed in our personal views on the relative badness of torture, humiliation, and death. For me, risk of death was the worst, and the one I was least willing to trade off in trolley-problem-type cases to save others. Roko placed relatively more weight on the other two, which I jokingly ribbed and teased him about.
In retrospect, I was probably a bit of a jerk in pushing (normative) Hansonian transparency. I wish I had been more careful to distinguish between critiquing a gap between talk and values and critiquing the underlying values, and I probably should just take wedrifid’s advice on trolley-problem-type scenarios generally.
First off, great comment—interesting, and complex.
But, some things still don’t make sense to me...
Assuming that what you described led to:
I was once criticized by a senior singinst member for not being prepared to be tortured or raped for the cause. I mean not actually, but, you know, in theory. Precommiting to being prepared to make a sacrifice that big. shrugs
How did precommitting enter in to it?
Are you prepared to be tortured or raped for the cause? Have you precommitted to it?
Have other SIAI people you know of talked about this with you, have other SIAI people precommitted to it?
What do you think of others who do not want to be tortured or raped for the cause?
Thanks, wfg
I find this whole line of conversation fairly ludicrous, but here goes:
Number 1. Time-inconsistency: we react differently to an immediate certainty of some harm than to a future probability of it. So, many people might be willing to go be a health worker in a poor country where aid workers are commonly (1 in 10,000) raped or killed, even though they would not be willing to be certainly attacked in exchange for 10,000 times the benefits to others. In the actual instant of being tortured anyone would break, but people do choose courses of action that carry risk (every action does, to some extent), so the latter is more meaningful for such hypotheticals.
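To make the implied arithmetic explicit: the two options are constructed so that the ratio of benefit to expected harm is identical, so a risk-neutral expected-value calculation treats them as equivalent; the gap in people’s willingness is what illustrates the time-inconsistency. A toy sketch, with units invented for illustration:

```python
# Toy expected-value comparison of the two options above (arbitrary units).
P_ATTACK = 1 / 10_000      # risk accepted by the aid worker
BENEFIT = 1.0              # benefit to others from one stint of aid work
HARM = 1.0                 # personal harm if attacked

# Option A: accept a 1-in-10,000 risk for one stint's worth of benefit.
expected_harm_a, benefit_a = P_ATTACK * HARM, BENEFIT

# Option B: accept the harm with certainty for 10,000x the benefit.
expected_harm_b, benefit_b = HARM, 10_000 * BENEFIT

print(benefit_a / expected_harm_a)  # 10000.0
print(benefit_b / expected_harm_b)  # 10000.0 -> same benefit per expected harm
```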
Number 2. I have driven and flown thousands of kilometers in relation to existential risk, increasing my chance of untimely death in a car accident or plane crash, so obviously I am willing to take some increased probability of death. I think I would prefer a given chance of being tortured to the same chance of death, so it follows from the above that I am willing to take at least some tiny risk of torture as well. As I also said above, I’m not willing to make very big sacrifices (big probabilities of such nasty personal outcomes) for tiny shifts in probabilities of big impersonal payoffs (like existential risk reduction). In realistic scenarios, that’s what “the cause” would refer to. I haven’t made any verbal or explicit “precommitment” or promises or anything like that.
In sufficiently extreme (and ludicrously improbable) trolley-problem style examples, e.g. “if you push this button you’ll be tortured for a week, but if you don’t then the Earth will be destroyed (including all your loved ones) if this fair coin comes up heads, and you have incredibly (impossibly?) good evidence that this really is the setup” I hope I would push the button, but in a real world of profound uncertainty, limited evidence, limited personal power (I am not Barack Obama or Bill Gates), and cognitive biases, I don’t expect that to ever happen. I also haven’t made any promises or oaths about that.
I am willing to give of my time and effort, and forgo the financial rewards of a more lucrative career, in exchange for a chance for efficient do-gooding, interaction with interesting people who share my values, and a meaningful project. Given diminishing returns to money in rich countries today, and the ease of obtaining money for folk with high human capital, those aren’t big sacrifices, if they are sacrifices at all.
Number 3. SIAIers love to be precise and analytical and to consider philosophical thought experiments, including ethical ones. I think most have views pretty similar to mine, with somewhat varying margins. Certainly Michael Vassar, the head of the organization, is also keen on recognizing one’s various motives, living a balanced life, and avoiding fanaticism. Like me, he actively advocates Bostrom-like parliamentary-model approaches to combining self-concern with parochial and universalist altruistic feelings.
I have never heard anyone making oaths or promises to make severe sacrifices.
Number 4. This is a pretty ridiculous question. I think that’s fine and normal, and I feel more comfortable with such folk than with the alternative. I think people should not exaggerate how central do-gooding is to their lives, lest they deceive themselves and others about their willingness to make such choices (which is what I criticized Roko for).
This sounds very sane, and makes me feel a lot better about the context. Thank you very much.
I very much like the idea that top SIAI people believe that there is such a thing as too much devotion to the cause (and, I’m assuming, actively talk down people who are above that level, as you describe doing for Roko).
As someone who has demonstrated impressive sanity around these topics, you seem to be in a unique position to answer these questions with above-average level-headedness:
Do you understand the math behind the Roko post deletion?
Yes, his post was based on (garbled versions of) some work I had been doing at FHI, which I had talked about with him while trying to figure out some knotty sub-problems.
What do you think about the Roko post deletion?
I think the intent behind it was benign, at least in that Eliezer had held his views about the issue (which is more general, and not specifically about screwed-up FAI attempts) beforehand, and that he was motivated to prevent harm to people hearing the idea, and to others generally. Indeed, he was explicitly motivated enough to take a PR hit for SIAI.
Regarding the substance, I think there are some pretty good reasons for thinking that the expected value (with a small probability of a high impact) of the info for the overwhelming majority of people exposed to it would be negative, although that estimate is unstable in the face of new info.
It’s obvious that the deletion caused more freak-out and uncertainty than anticipated, leading to a net increase in people reading and thinking about the content compared to the counterfactual with no deletion. So regardless of the substance of the info, it was clearly a mistake to delete (which Eliezer also recognizes).
What do you think about future deletions?
Obviously, Eliezer is continuing to delete comments that repost material on the topic of the deleted post. It seems fairly futile to me, but not entirely. I don’t think that Less Wrong is made worse by the absence of that content as such, although the fear and uncertainty about it do seem to be harmful. You said you were worried because it leaves you uncertain about whether future deletions will occur, and of what.
After about half an hour of trying, I can’t think of another topic with the same sorts of features. There may be cases involving things like stalkers or bank PINs or 4chan attacks or planning illegal activities. Eliezer called on people not to discuss AI at the beginning of Less Wrong to help establish its rationality focus, and to back off from the gender warfare, but hasn’t used deletion powers for such things.
Less Wrong has been around for 20 months. If we can rigorously carve out the stalker/PIN/illegality/spam/threats cases I would be happy to bet $500 against $50 that we won’t see another topic banned over the next 20 months.
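For readers unused to betting phrasing: staking $500 against $50 implies odds of 10 to 1, so the offer only has positive expected value for the person making it if they think the chance of no further topic ban is above roughly 91%. A quick check of that break-even point:

```python
# Break-even probability for a $500-vs-$50 bet that no further topic is banned:
# lose $500 if another topic is banned, win $50 if none is.
stake, winnings = 500, 50

# EV(p) = p * winnings - (1 - p) * stake >= 0  when  p >= stake / (stake + winnings)
break_even = stake / (stake + winnings)
print(break_even)  # ~0.909, so the offer signals better than 90% confidence
```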
That sounds like it’d generate some perverse incentives to me.
Urk.
Just to be clear: he recognizes this by comparison with the alternative of privately having the poster delete it themselves, rather than by comparison to not-deleting.
Or at least that was my understanding.
Regardless, thanks for a breath of clarity in this thread. As a mostly disinterested newcomer, I very much appreciated it.
Well, if counterfactually Roko hadn’t wanted to take it down, I think it would have been even more of a mistake to delete it, because then the author would have been peeved, not just the audience/commenters.
Which is fine.
But Eliezer’s comments on the subject suggest to me that he doesn’t think that.
More specifically, they suggest that he thinks the most important thing is that the post not be viewable, and if we can achieve that by quietly convincing the author to take it down, great, and if we can achieve it by quietly deleting it without anybody noticing, great, and if we can’t do either of those then we achieve it without being quiet, which is less great but still better than leaving it up.
And it seemed to me your parenthetical could be taken to mean that he agrees with you that deleting it would be a mistake in all of those cases, so I figured I would clarify (or let myself be corrected, if I’m misunderstanding).
So, many people might be willing to go be a health worker in a poor country where aid workers are commonly (1 in 10,000) raped or killed, even though they would not be willing to be certainly attacked in exchange for 10,000 times the benefits to others.
I agree with your main point, but the thought experiment seems to be based on the false assumption that the risk of being raped or murdered is smaller than 1 in 10K if you stay at home. Wikipedia guesstimates that 1 in 6 women in the US are on the receiving end of attempted rape at some point, so someone who goes to a place with a 1 in 10K chance of being raped or murdered has probably improved their personal safety. To make a better thought experiment, I suppose you have to talk about the marginal increase in the rape or murder rate from working in the poor country compared to staying home, and perhaps you should stick to murder, since the rape rate is so high.
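For a rough sense of why the marginal comparison matters, here is a back-of-the-envelope annualization. The 1-in-6 lifetime figure is the one quoted above; the 50-year exposure window and constant-hazard assumption are invented purely for illustration:

```python
# Back-of-the-envelope: turn an assumed 1-in-6 lifetime risk into a rough
# per-year baseline rate (assuming, arbitrarily, 50 years at constant hazard),
# then compare with a 1-in-10,000 per-stint risk abroad.
lifetime_risk = 1 / 6   # lifetime figure quoted in the comment above
years = 50              # assumed exposure window, purely illustrative

annual_baseline = 1 - (1 - lifetime_risk) ** (1 / years)
per_stint_abroad = 1 / 10_000

print(f"annual baseline ~ {annual_baseline:.4f}")    # ~0.0036, about 1 in 275
print(f"per-stint abroad = {per_stint_abroad:.4f}")  # 0.0001, i.e. 1 in 10,000
# A 1-in-10,000 figure only means something once it is compared with, or added
# to, whatever baseline risk the person would face anyway.
```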
I should have taken this bet
Your post has been moved to the Discussion section, not deleted.
Looking at your recent post, I think Alicorn had a good point.
You lost me at ‘ludicrous’. :)
but he won me back by answering anyway <3
How so?
Thanks!
Great comment Carl!