“You could call it heroic responsibility, maybe,” Harry Potter said. “Not like the usual sort. It means that whatever happens, no matter what, it’s always your fault. Even if you tell Professor McGonagall, she’s not responsible for what happens, you are. Following the school rules isn’t an excuse, someone else being in charge isn’t an excuse, even trying your best isn’t an excuse. There just aren’t any excuses, you’ve got to get the job done no matter what.” –HPMOR, chapter 75.
I think a typical-ish person actually doing this doesn’t look like them rising to the challenge. I think someone actually doing this looks like them thinking they have advanced mind control powers (since even things done by other people are their fault) and that since there continue to be horrible things happening in the world, they must have evil intentions and be a partly-demonic entity. It looks like them making themselves a scapegoat. This isn’t speculative, I’ve experienced this and I think it was connected to trying to take seriously heroic responsibility and that I could personally be responsible for the destruction of the world (e.g. by starting conversations about AI that cause AI to be developed sooner), which my social environment encouraged.
I think this goes against normal therapy advice e.g. the idea that you having been abused isn’t your fault, that you need to forgive yourself for having acted suboptimally given the confusions you previously had, that you shouldn’t depend on controlling others’ behavior, that you should respect others’ boundaries and their ability to make their own decisions, etc. There are certainly problems with normal therapy advice, but this is something people have already thought a lot about and have clinical experience with.
Maybe some people get something out of this, either because they do a pretend version of it or have an abnormal psychology where they don’t connect everything bad being their fault with normal emotions a typical person would have as a consequence. But it seems out of place in a compilation about how to have good mental health.
My other comment notwithstanding, I do think the HPMOR quote is not very helpful for someone’s mental health when they’re in pain and seems a bit odd placed atop a section on advice, and I think the advice at the wrong time can feel oppressive. The hero-licensing post feels much less like it risks making the reader feel oppressed by every bad thing that happens in the world. And personally I found Anna’s post linked earlier to be much more helpful advice that is related to and partially upstream of the sorts of changes in my life that have reduced a lot of anxiety. If it were me I’d probably put that at the top of the list there, perhaps along with Come to Your Terms by Nate which also resonates strongly with me.
(Looking further) I see, the point of that section isn’t to be “the advice section”, it’s to be “the advice posts that don’t talk about AI”. I still think something about that is confusing. My first-guess is that I’d structure a post like this like an FAQ, “Are you feeling X because Y? Then here’s two posts that address this” and so on, so that people can find the bit that is relevant to their problem. But not sure.
I can understand thinking of yourself as having evil intentions, but I don’t understand believing you’re a partly-demonic entity.
I think the way that the global market and culture can respond to ideas is strange and surprising, with people you don’t know taking major undertakings based on your ideas, with lots of copying and imitation and whole organizations or people changing their lives around something you did without them ever knowing you. Like the way that Elon Musk met a girlfriend of his via a Roko’s Basilisk meme, or one time someone on reddit I don’t know believed that an action I’d taken was literally “the AGI” acting in their life (which was weird for me). I think that one can make straightforward mistakes in earnestly reasoning about strange things (as is argued in this Astral Codex Ten post that IIRC argues that conspiracy theories often have surprisingly good arguments for them that a typical person would find persuasive on their own merits). So I’m not saying that really trying to act on a global scale on a difficult problem couldn’t cause you to have supernatural beliefs.
But you said it’s what would happen to a ‘typical-ish person’. If you believe a ‘typical-ish person’ trying to have an epistemology will reliably fail in ways that lead to them believing in conspiracies, then I guess yes, they may also come to have supernatural beliefs if they try to take action that has massive consequences in the world. But I think a person with just a little more perspective can be self-aware about conspiracy theories and similarly be self-aware about whatever other hypotheses they form, and try to stick to fairly grounded ones. It turns out that when you poke civilization the right way, it does a lot of really outsized and overpowered things sometimes.
I imagine it was a trip for Doug Engelbart to watch everyone in the world get a personal computer, with a computer mouse and a graphical user-interface that he had invented. But I think it would have been a mistake for him to think anything supernatural was going on, even if he were trying to personally take responsibility for directing the world as best he could, and I expect most people would be able to see that (from the outside).
If you think you’re responsible for everything, that means you’re responsible for everything bad that happens. That’s a lot of very bad stuff, some of which is motivated by bad intentions. An entity who’s responsible for that much bad stuff couldn’t be like a typical person, who is responsible for a modest amount of bad stuff. It’s hard to conceptualize just how much bad stuff this hypothetical person is responsible for without supernatural metaphors; it’s far beyond what a mere genocidal dictator like Hitler or Stalin is responsible for (at least, if you aren’t attributing heroic responsibility to them). At that point, “well, I’m responsible for more bad stuff than I previously thought Hitler was responsible for” doesn’t come close to grasping the sheer magnitude, and supernatural metaphors like God or Satan come closer. The conclusion is insane and supernatural because the premise, that you are personally responsible for everything that happens, is insane and supernatural.
I’m not really sure how typical this particular response would be. But I think it’s incredibly rare to actually take heroic responsibility literally and seriously. So even if I only rarely see evidence of people thinking they’re demonic (which is surprisingly common, even if rare in absolute terms), that doesn’t say much about the conditional likelihood of that response on taking heroic responsibility seriously.
I have a version of heroic responsibility in my head that I don’t think causes one to have false beliefs about supernatural phenomena, so I’m interested in engaging on whether the version in my head makes sense, though I don’t mean to invalidate your strongly negative personal experiences with the idea.
I think there’s a difference between causing something and taking responsibility for it. There’s a notion of “I didn’t cause this mess but I am going to clean it up.” In my team often a problem arises that we didn’t cause and weren’t expecting. A few months ago there were heavy rains in Berkeley and someone had to step up and make sure they didn’t cause serious water damage to our property. Further beyond the organization’s remit, one time Scott Aaronson’s computational complexity wiki was set to go down, and a team member said they’d step forward to fix it and take responsibility for keeping it up in the future. These were situations where the person who took them on didn’t cause them and hadn’t said that they were responsible for the class of things ahead of time, but increasingly took on more responsibility because they could and because it was good.
When Harry is speaking to McGonagall in that quote, I believe he’s saying “No, I’m actually taking responsibility for what happened to my friend. I’m asking myself what it would’ve looked like for me to actually take responsibility for it earlier, rather than the default state of nature where we’re all just bumbling around. Where the standard is ‘this terrible thing doesn’t happen’ as opposed to ‘well I’m deontologically in the clear and nobody blames me but the thing still happens’.”
I don’t think this gives Harry false magical beliefs that he personally caused a horrendous thing to happen to his friend (though I think that magical beliefs of that sort do have a higher prior in his universe).
I think you can “take responsibility” for civilization not going extinct in this manner, without believing you personally caused the extinction. (It will suck a bit for you because it’s very hard and you will probably fail in your responsibilities.) I think there’s reasons to give up responsibility if you’ve done a poor job, but I think failure is not deontologically bad especially in a world where few others are going to take responsibility for it.
If I try to imagine what happened with jessicata, what I get is this: taking responsibility means that you’re trying to apply your agency to everything; you’re clamping the variable of “do I consider this event as being within the domain of things I try to optimize” to “yes”. Even if you didn’t think about X until after X had already happened, it doesn’t matter; you clamped the variable to yes. If you consider X as being within the domain of things you try to optimize, then it starts to make sense to ask whether you caused X. If you add in this “no excuses” thing, you’re saying: even if supposedly there was no way you could have possibly stopped X, it’s still your responsibility. This is just another instance of the variable being clamped; the fact that you supposedly couldn’t do anything doesn’t stop you from considering X as something that you’re applying your agency to. (This can be extremely helpful, which is why heroic responsibility has good features; it makes you broaden your search, go meta, look harder, think outside the box, etc., without excuses like “oh but it’s impossible, there’s nothing I can do”; and it makes you look back at what you could have done, so that you can pre-retrospect in the future.)
If you’re applying your agency to X “as though you could affect it”, then you’re basically thinking of X as being determined in part by your actions. Yes, other stuff makes X happen, but one of the necessary conditions for X to happen is that you don’t personally prevent it. So every X is partly causally/agentially dependent on you, and so is partly your fault. You could have done more sooner.
A few months ago there were heavy rains in Berkeley and someone had to step up and make sure they didn’t cause serious water damage to our property. Further beyond the organization’s remit, one time Scott Aaronson’s computational complexity wiki was set to go down, and a team member said they’d step forward to fix it and take responsibility for keeping it up in the future.
This sounds like a positive form of ‘take responsibility’ I can agree with.
However, I’m not sure about this whole discussion in regards to ‘the world’, ‘civilization’, etc.
What does ‘take responsibility’ mean for an individual across the span of the entire Earth?
For a very specific sub-sub-sub area, such as imparting some useful knowledge to a fraction of online fan-fiction readers of a specific fandom, it’s certainly possible to make a tangible, measurable, difference, even without some special super-genius.
But beyond that I think it gets exponentially more difficult.
Even a modestly larger goal of imparting some useful knowledge to a majority of online fan-fiction readers would practically be a life’s effort, assuming the individual already has moderately above average talents in writing and so on.
There’s nothing special about taking responsibility for something big or small. It’s the same meaning.
Within teams I’ve worked in it has meant:
You can be confident that someone is personally optimizing to achieve the goal
Both the shame of failing and the glory of succeeding will primarily accrue to them
There is a single point of contact for checking in about any aspect of the problem.
For instance, if you have an issue with how a problem is being solved, there is a single person you can go to to complain
Or if you want to make sure that something you’re doing does not obstruct this other problem from being solved, you can go to them and ask their opinion.
And more things.
I think this applies straightforwardly beyond single organizations.
Various public utilities like water and electricity have government departments who are attempting to actually take responsibility for the problem of everyone having reliable and cheap access to these products. These are the people responsible when the national grid goes out in the UK, which is different from countries with no such government department.
NASA was broadly working on space rockets, but now Elon Musk has stepped forward to make sure our civilization actually becomes multi-planetary in this century. If I was considering some course of action (e.g. taxing imports from India) but wanted to know if it could somehow prevent us from becoming multi-planetary, he is basically the top person on my list of people to go to to ask whether it would prevent him from succeeding. (Other people and organizations are also trying to take responsibility for this problem and get nonzero credit allocation. In general it’s great if there’s a problem domain where multiple people can attempt to take responsibility for the problem being solved.)
I think there are quite a lot of people trying to take responsibility for improving the public discourse, or preventing it from deteriorating in certain ways, e.g. defending freedom of speech against particular attack vectors. I think Sam Harris thinks of part of his career as defending the freedom to openly criticize religions like Islam and Christianity, and if I were concerned that such freedoms would be lost, he’d be one of the first people I’d want to read or reach out to, to ask how to help and what the attack vectors are.
You can apply this to particular extinction threats (e.g. asteroids, pandemics, AGI, etc) or to the overall class of such threats. (For instance I’ve historically thought of MIRI as focused on AI and the FHI as interested in the whole class.)
Extinction-level threats seem like a perfectly natural kind of problem someone could try to take responsibility for, thinking about how the entire civilization would respond to a particular attack vector, asking what that person could do in order to prevent extinction (or similar) in that situation, and then implementing such an improvement.
I share your concern and insight, yet I also strongly identify with what Eliezer calls heroic responsibility, and have found it an empowering concept.
For me, it resonates with two groups of fundamental values and assumptions:
Group 1:
1. If something evil is happening, do not assume someone else has already stepped forward and is competently handling it unless proven otherwise. If everyone thinks someone is handling it, likely no one is; step up, and verify. (Bystander effect: if you hear someone screaming faintly in the distance, and think, there are a hundred people between me and the screaming one, surely someone has alerted the authorities… stop assuming this, right now, and verify.) In these scenarios, I will happily hand over to someone more qualified who will handle the thing better. But this often involves handling it while alerting the people who should, and pushing them repeatedly until they actually show up, and staying on site and doing what you can until they do and you are sure they will actually take over.
2. New forms of evil often have no one who was assigned responsibility yet; someone needs to choose to take it, and on this point, see 1. (Relevant for relatively novel problems like AI alignment.)
3. Enormous forms of evil are too big for any one person to handle, so assume you need to chip in, even if responsible people exist. (E.g. politicians ought to handle the climate crisis; but they can’t, so each of us needs to help.)
4. Existential evil is the responsibility of everyone, no matter how weak, yourself included. If you lived in Nazi Germany while the Jews were being exterminated, you had the responsibility to help, no matter who you were and what you did. There is no “this is not my job”. If you are human, it is. There is something each of us can do, always. Start small (something is better than nothing), but do not stop building. Recognise contemporary parallels.
Group 2:
Your goal is not to give a plausible report of how you tried that makes you look good and makes your failure comprehensible. Your goal is to succeed. For in things that truly matter, that report makes no difference whatsoever, even if you can make yourself look golden. I keep listening to politicians who say “So anyhow, we did not meet the climate targets… but I mean, the public did not want restrictions, and industry did not comply, and the war led to an energy crisis, and anyway, China was not complying either...” as though the Earth gave a flying fuck. As though you could make the ocean stop rising by explaining that really, seriously, quitting fossil fuels was really very difficult during your term. As though the ocean would give you an extension if only your report had sufficient arguments. The report is helpful if you can learn from it and do better, an analysis of what went wrong to plot a path to right—which is very different from an excuse. At the point where the learning opportunities are over because you are drowning, it becomes a worthless piece of paper. It is not the goal.
You’ll note this does not proceed from the assumption that I am special, or chosen, or brave, or the best at things, or stronger than others. I genuinely do not think I am. I know I can fail badly, because I have failed badly, bitterly so. I know how scared and confused I often feel. But this duty does not arise from what I already am, but what I want all of us to be, believe we all can be. It is a standard universally applied, in which I strive to lead by example, but where I want to live in a world where this is how everyone thinks, because I believe this is something humans can do—take responsibility, be proactive, show agency, look for what needs to be done and do it, forge free paths.
But notably, I see this as a call; a productive, constructive call to do better. It is pointed at the future, and it is pointed outwards.
Reminders of instances where I failed burn in me, and haunt me, but as a reminder to not fail again. Mistakes learned. Knowing of my weakness, so I can avoid it next time. The horror of knowing I failed, as a way to stop me from doing so again. Ever tried, ever failed. Try again, fail again, fail better.
Not to stew in the past. I do not think guilt, or shame, or blame, or fault, are helpful emotions at all.
In instances where I did not manage to protect myself from evil, I want to learn how to protect myself better in the future, but hating myself for getting hurt does not help, it just adds more pain to a heap of pain. Me getting hurt having been avoidable does not make it fair, or okay. I can have compassion for myself having remained in situations that were terrible, while also having the belief that an escape would have been possible, and that if this scenario came again, I would find it this time, with the skills and knowledge I have now. I can think of who I am now with care and kindness, and still want to become something much more.
I can simultaneously think that there is a way to really change our lives and communities for each and every one of us; and that it is fucking hard, and that I cannot look into the minds of others to know how hard it is for them, that we are each haunted by demons invisible to others, dragging baggage others do not see. That I did not know how hard many things I believed to be easy were, until I was on the wrong end of them. To know that I do not want to belittle what they are up against and have been through, because that would be cruel and ignorant and pointless, but want to empower them to get over it regardless, not because of how small their issues are, for they are vast, but because of what they can become to counter them, something vaster still. I can simultaneously forgive, and burn to undo the damage.
To believe that I, and all those around us, are ultimately helpless, that no one is really responsible for anything… it would not be a kindness or healing. Nor true. But I want to see the opportunities in that truth, not the guilt and shame. For one gets us out of a terrible world; the other keeps us in.
and that since there continue to be horrible things happening in the world, they must have evil intentions and be a partly-demonic entity.
Did you conclude this entirely because there continue to be horrible things happening in the world, or was this based on other reflective information that was consistent with horrible things happening in the world too?
I imagine that this conclusion must at least be partly based on latent personality factors as well. But if so, I’m very curious as to how these things jibe with your desire to be heroically responsible at the same time. E.g., how do evil intentions predict your other actions and intentions regarding AI-risk and wanting to avert the destruction of the world?
It wasn’t just that, it was also based on thinking I had more control over other people than I realistically had. Probably it is partly latent personality factors. But a heroic responsibility mindset will tend to cause people to think other people’s actions are their fault if they could, potentially, have affected them through any sort of psychological manipulation (see also, Against Responsibility).
I think I thought I was working on AI risk but wasn’t taking heroic responsibility because I wasn’t owning the whole problem. People around me encouraged me to take on more responsibility and actually optimize on the world as a consequentialist agent. I subsequently felt very bad that I had taken on responsibilities for solving AI safety that I could not deliver on. I also felt bad that my having written some blog posts online criticizing “rationalists” might lead to the destruction of the world, and that this would be my fault.
This is cool because what you’re saying has useful information pertinent to model updates regardless of how I choose to model your internal state.
Here’s why it’s really important:
You seem to have been motivated to classify your own intentions as “evil” at some point, based entirely on things that were not entirely under your own control.
That points to your social surroundings as having pressured you to come to that conclusion (I am not sure it is very likely that you would have come to that conclusion on your own, without any social pressure).
So that brings us to the next question: Is it more likely that you are evil, or rather, that your social surroundings were / are?
I think those are hard to separate. Bad social circumstances can make people act badly. There’s the “hurt people hurt people” truism and numerous examples of people being caused to act morally worse by their circumstances e.g. in war. I do think I have gone to extraordinary lengths to understand the ways in which I act badly (often in response to social cues) and to act more intentionally well.
Yes, but the point is that we’re trying to determine if you are under “bad” social circumstances or not. Those circumstances will not be independent from other aspects of the social group, e.g. the ideology it espouses externally and things it tells its members internally.
What I’m trying to figure out is to what extent you came to believe you were “evil” on your own versus you were compelled to think that about yourself. You were and are compelled to think about ways in which you act “badly”—nearby or adjacent to a community that encourages its members to think about how to act “goodly.” It’s not a given, per se, that a community devoted explicitly to doing good in the world thinks that it should label actions as “bad” if they fall short of arbitrary standards. It could, rather, decide to label actions people take as “good” or “gooder” or “really really good” if it decides that most functional people are normally inclined to behave in ways that aren’t necessarily un-altruistic or harmful to other people.
I’m working on a theory of social-group-dynamics which posits that your situation is caused by “negative-selection groups” or “credential-groups” which are characterized by their tendency to label only their activities as actually successfully accomplishing whatever it is they claim to do—e.g., “rationality” or “effective altruism.” If it seems like the group’s ideology or behavior implies that non-membership is tantamount to either not caring about doing well or being incompetent in that regard, then it is a credential-group.
Credential-groups are bad social circumstances, and in a nutshell, they act badly by telling members who they know not to be intentionally causing harm that they are harmful or bad people (or mentally ill).
I think a typical-ish person actually doing this doesn’t look like them rising to the challenge. I think someone actually doing this looks like them thinking they have advanced mind control powers (since even things done by other people are their fault) and that since there continue to be horrible things happening in the world, they must have evil intentions and be a partly-demonic entity. It looks like them making themselves a scapegoat. This isn’t speculative, I’ve experienced this and I think it was connected to trying to take seriously heroic responsibility and that I could personally be responsible for the destruction of the world (e.g. by starting conversations about AI that cause AI to be developed sooner), which my social environment encouraged.
I think this goes against normal therapy advice e.g. the idea that you having been abused isn’t your fault, that you need to forgive yourself for having acted suboptimally given the confusions you previously had, that you shouldn’t depend on controlling others’ behavior, that you should respect others’ boundaries and their ability to make their own decisions, etc. There are certainly problems with normal therapy advice, but this is something people have already thought a lot about and have clinical experience with.
Maybe some people get something out of this, either because they do a pretend version of it or have an abnormal psychology where they don’t connect everything bad being their fault with normal emotions a typical person would have as a consequence. But it seems out of place in a compilation about how to have good mental health.
My other comment notwithstanding, I do think the HPMOR quote is not very helpful for someone’s mental health when they’re in pain and seems a bit odd placed atop a section on advice, and I think the advice at the wrong time can feel oppressive. The hero-licensing post feels much less like it risks feeling oppressed by every bad thing that happens in the world. And personally I found Anna’s post linked earlier to be much more helpful advice that is related to and partially upstream of the sorts of changes in my life that have reduced a lot of anxiety. If it were me I’d probably put that at the top of the list there, perhaps along with Come to Your Terms by Nate which also resonates strongly with me.
(Looking further) I see, the point of that section isn’t to be “the advice section”, it’s to be “the advice posts that don’t talk about AI”. I still think something about that is confusing. My first-guess is that I’d structure a post like this like an FAQ, “Are you feeling X because Y? Then here’s two posts that address this” and so on, so that people can find the bit that is relevant to their problem. But not sure.
I can understand thinking of yourself as having evil intentions, but I don’t understand believing you’re a partly-demonic entity.
I think the way that the global market and culture can respond to ideas is strange and surprising, with people you don’t know taking major undertakings based on your ideas, with lots of copying and imitation and whole organizations or people changing their lives around something you did without them ever knowing you. Like the way that Elon Musk met a girlfriend of his via a Roko’s Basilisk meme, or one time someone on reddit I don’t know believed that an action I’d taken was literally “the AGI” acting in their life (which was weird for me). I think that one can make straightforward mistakes in earnestly reasoning about strange things (as is argued in this Astral Codex Ten post that IIRC argues that conspiracy theories often have surprisingly good arguments for them that a typical person would find persuasive on their own merits). So I’m not saying that really trying to act on a global scale on a difficult problem couldn’t cause you to have supernatural beliefs.
But you said it’s what would happen to a ‘typical-ish person’. If you believe a ‘typical-ish person’ trying to have an epistemology will reliably fail in ways that lead to them believing in conspiracies, then I guess yes, they may also come to have supernatural beliefs if they try to take action that has massive consequences in the world. But I think a person with just a little more perspective can be self-aware about conspiracy theories and similarly be self-aware about whatever other hypotheses they form, and try to stick to fairly grounded ones. It turns out that when you poke civilization the right way does a lot of really outsized and overpowered things sometimes.
I imagine it was a trip for Doug Engelbart to watch everyone in the world get a personal computer, with a computer mouse and a graphical user-interface that he had invented. But I think it would have been a mistake for him to think anything supernatural was going on, even if he were trying to personally take responsibility for directing the world in as best he could, and I expect most people would be able to see that (from the outside).
If you think you’re responsible for everything, that means you’re responsible for everything bad that happens. That’s a lot of very bad stuff, some of which is motivated by bad intentions. An entity who’s responsible for that much bad stuff couldn’t be like a typical person, who is responsible for a modest amount of bad stuff. It’s hard to conceptualize just how much bad stuff this hypothetical person is responsible for without supernatural metaphors; it’s far beyond what a mere genocidal dictator like Hitler or Stalin is responsible for (at least, if you aren’t attributing heroic responsibility to them). At that point, “well, I’m responsible for more bad stuff than I previously thought Hitler was responsible for” doesn’t come close to grasping the sheer magnitude, and supernatural metaphors like God or Satan come closer. The conclusion is insane and supernatural because the premise, that you are personally responsible for everything that happens, is insane and supernatural.
I’m not really sure how typical this particular response would be. But I think it’s incredibly rare to actually take heroic responsibility literally and seriously. So even if I only rarely see evidence of people thinking they’re demonic (which is surprisingly common, even if rare in absolute terms), that doesn’t say much about the conditional likelihood of that response on taking heroic responsibility seriously.
I have a version of heroic responsibility in my head that I don’t think causes one to have false beliefs about supernatural phenomena, so I’m interested in engaging on whether the version in my head makes sense, though I don’t mean to invalidate your strongly negative personal experiences with the idea.
I think there’s a difference between causing something and taking responsibility for it. There’s a notion of “I didn’t cause this mess but I am going to clean it up.” In my team often a problem arises that we didn’t cause and weren’t expecting. A few months ago there were heavy rains in Berkeley and someone had to step up and make sure they didn’t cause serious water damage to our property. Further beyond the organization’s remit, one time Scott Aaronson’s computational complexity wiki was set to go down, and a team member said they’d step forward to fix it and take responsibility for keeping it up in the future. These were situations where the person who took them on didn’t cause them and hadn’t said that they were responsible for the class of things ahead of time, but increasingly took on more responsibility because they could and because it was good.
When Harry is speaking to McGonagall in that quote, I believe he’s saying “No, I’m actually taking responsibility for what happened to my friend. I’m asking myself what it would’ve looked like for me to actually take responsibility for it earlier, rather than the default state of nature where we’re all just bumbling around. Where the standard is ‘this terrible thing doesn’t happen’ as opposed to ‘well I’m deontologically in the clear and nobody blames me but the thing still happens’.”
I don’t think this gives Harry false magical beliefs that he personally caused a horrendous thing to happen to his friend (though I think that magical beliefs of that sort do have a higher prior in his universe).
I think you can “take responsibility” for civilization not going extinct in this manner, without believing you personally caused the extinction. (It will suck a bit for you because it’s very hard and you will probably fail in your responsibilities.) I think there are reasons to give up responsibility if you’ve done a poor job, but I think failure is not deontologically bad, especially in a world where few others are going to take responsibility for it.
If I try to imagine what happened with jessicata, what I get is this: taking responsibility means that you’re trying to apply your agency to everything; you’re clamping the variable of “do I consider this event as being within the domain of things I try to optimize” to “yes”. Even if you didn’t even think about X before X has already happened, doesn’t matter; you clamped the variable to yes. If you consider X as being within the domain of things you try to optimize, then it starts to make sense to ask whether you caused X. If you add in this “no excuses” thing, you’re saying: even if supposedly there was no way you could have possibly stopped X, it’s still your responsibility. This is just another instance of the variable being clamped; just because you supposedly couldn’t do anything, doesn’t make you not consider X as something that you’re applying your agency to. (This can be extremely helpful, which is why heroic responsibility has good features; it makes you broaden your search, go meta, look harder, think outside the box, etc., without excuses like “oh but it’s impossible, there’s nothing I can do”; and it makes you look in retrospect at what, in retrospect, you could have done, so that you can pre-retrospect in the future.)
If you’re applying your agency to X “as though you could affect it”, then you’re basically thinking of X as being determined in part by your actions. Yes, other stuff makes X happen, but one of the necessary conditions for X to happen is that you don’t personally prevent it. So every X is partly causally/agentially dependent on you, and so is partly your fault. You could have done more sooner.
This sounds like a positive form of ‘take responsibility’ I can agree with.
However, I’m not sure about this whole discussion with regard to ‘the world’, ‘civilization’, etc.
What does ‘take responsibility’ mean for an individual across the span of the entire Earth?
For a very specific sub-sub-sub area, such as imparting some useful knowledge to a fraction of online fan-fiction readers of a specific fandom, it’s certainly possible to make a tangible, measurable, difference, even without some special super-genius.
But beyond that I think it gets exponentially more difficult.
Even a modestly larger goal of imparting some useful knowledge to a majority of online fan-fiction readers would practically be a life’s effort, assuming the individual already has moderately above average talents in writing and so on.
There’s nothing special about taking responsibility for something big or small. It’s the same meaning.
Within teams I’ve worked in it has meant:
You can be confident that someone is personally optimizing to achieve the goal
Both the shame of failing and the glory of succeeding will primarily accrue to them
There is a single point of contact for checking in about any aspect of the problem.
For instance, if you have an issue with how a problem is being solved, there is a single person you can go to to complain
Or if you want to make sure that something you’re doing does not obstruct this other problem from being solved, you can go to them and ask their opinion.
And more things.
I think this applies straightforwardly beyond single organizations.
Various public utilities like water and electricity have government departments who are attempting to actually take responsibility for the problem of everyone having reliable and cheap access to these products. These are the people responsible when the national grid goes out in the UK, which is different from countries with no such government department.
NASA was broadly working on space rockets, but now Elon Musk has stepped forward to make sure our civilization actually becomes multi-planetary in this century. If I was considering some course of action (e.g. taxing imports from India) but wanted to know if it could somehow prevent us from becoming multi-planetary, he is basically the top person on my list of people to go to to ask whether it would prevent him from succeeding. (Other people and organizations are also trying to take responsibility for this problem and get nonzero credit allocation. In general it’s great if there’s a problem domain where multiple people can attempt to take responsibility for the problem being solved.)
I think there are quite a lot of people trying to take responsibility for improving the public discourse, or preventing it from deteriorating in certain ways, e.g. defending freedom of speech against particular attack vectors. I think Sam Harris thinks of part of his career as defending the freedom to openly criticize religions like Islam and Christianity, and if I were concerned that such freedoms would be lost, he’d be one of the first people I’d want to read or reach out to, to ask how to help and what the attack vectors are.
You can apply this to particular extinction threats (e.g. asteroids, pandemics, AGI, etc) or to the overall class of such threats. (For instance I’ve historically thought of MIRI as focused on AI and the FHI as interested in the whole class.)
Extinction-level threats seem like a perfectly natural kind of problem someone could try to take responsibility for, thinking about how the entire civilization would respond to a particular attack vector, asking what that person could do in order to prevent extinction (or similar) in that situation, and then implementing such an improvement.
I share your concern and insight, yet I also strongly identify with what Eliezer calls heroic responsibility, and have found it an empowering concept.
It resonates with two groups of fundamental values and assumptions for me:
Group 1:
1. If something evil is happening, do not assume someone else has already stepped forward and is competently handling it unless proven otherwise. If everyone thinks someone is handling it, likely, no one is; step up, and verify. (Bystander effect: if you hear someone screaming faintly in the distance, and think, there are a hundred people between me and the screaming one, surely someone has alerted the authorities… stop assuming this, right now, verify.) In these scenarios, I will happily hand over to someone more qualified who will handle the thing better. But this often involves handling it while alerting the people who should, and pushing them repeatedly until they actually show up, and staying on site and doing what you can until they do and you are sure they will actually take over.
2. New forms of evil often have no one who was assigned responsibility yet; someone needs to choose to take it, and on this point, see 1. (Relevant for relatively novel problems like AI alignment.)
3. Enormous forms of evil are too big for any one person to handle, so assume you need to chip in, even if responsible people exist. (E.g. Politicians ought to handle the climate crisis; but they can’t, so each of us needs to help.)
4. Existential evil is the responsibility of everyone, no matter how weak, yourself included. If you lived in Nazi Germany while the Jews were being exterminated, you had the responsibility to help, no matter who you were and what you did. There is no “this is not my job”. If you are human, it is. There is something each of us can do, always. Start small (something is better than nothing), but do not stop building. Recognise contemporary parallels.
Group 2:
Your goal is not to give a plausible report of how you tried that makes you look good and makes your failure comprehensible. Your goal is to succeed. For in things that truly matter, that report makes no difference whatsoever, even if you can make yourself look golden. I keep listening to politicians who say “So anyhow, we did not meet the climate targets… but I mean, the public did not want restrictions, and industry did not comply, and the war led to an energy crisis, and anyway, China was not complying either...” as though the Earth gave a flying fuck. As though you could make the ocean stop rising by explaining that really, seriously, quitting fossil fuels was really very difficult during your term. As though the ocean would give you an extension if only your report had sufficient arguments. The report is helpful if you can learn from it and do better, an analysis of what went wrong to plot a path to right—which is very different from an excuse. At the point where the learning opportunities are over because you are drowning, it becomes a worthless piece of paper. It is not the goal.
You’ll note this does not proceed from the assumption that I am special, or chosen, or brave, or the best at things, or stronger than others. I genuinely do not think I am. I know I can fail badly, because I have failed badly, bitterly so. I know how scared and confused I often feel. But this duty does not arise from what I already am, but what I want all of us to be, believe we all can be. It is a standard universally applied, in which I strive to lead by example, but where I want to live in a world where this is how everyone thinks, because I believe this is something humans can do—take responsibility, be proactive, show agency, look for what needs to be done and do it, forge free paths.
But notably, I see this as a call; a productive, constructive call to do better. It is pointed at the future, and it is pointed outwards.
Reminders of instances where I failed burn in me, and haunt me, but as a reminder to not fail again. Mistakes learned. Knowing of my weakness, so I can avoid it next time. The horror of knowing I failed, as a way to stop me from doing so again. Ever tried, ever failed. Try again, fail again, fail better.
Not to stew in the past. I do not think guilt, or shame, or blame, or fault, are helpful emotions at all.
In instances where I did not manage to protect myself from evil, I want to learn how to protect myself better in the future, but hating myself for getting hurt does not help, it just adds more pain to a heap of pain. Me getting hurt having been avoidable does not make it fair, or okay. I can have compassion for myself having remained in situations that were terrible, while also having the belief that an escape would have been possible, and that if this scenario came again, I would find it this time, with the skills and knowledge I have now. I can think of who I am now with care and kindness, and still want to become something much more.
I can simultaneously think that there is a way to really change our lives and communities for each and every one of us; and that it is fucking hard, and that I cannot look into the minds of others to know how hard it is for them, that we are each haunted by demons invisible to others, dragging baggage others do not see. That I did not know how hard many things I believed to be easy were, until I was on the wrong end of them. To know that I do not want to belittle what they are up against and have been through, because that would be cruel and ignorant and pointless, but want to empower them to get over it regardless, not because of how small their issues are, for they are vast, but because of what they can become to counter them, something vaster still. I can simultaneously forgive, and burn to undo the damage.
To believe that I, and all those around us, are ultimately helpless, that no one is really responsible for anything… it would not be a kindness or healing. Nor true. But I want to see the opportunities in that truth, not the guilt and shame. For one gets us out of a terrible world; the other keeps us in.
Did you conclude this entirely because there continue to be horrible things happening in the world, or was this based on other reflective information that was consistent with horrible things happening in the world too?
I imagine that this conclusion must at least be partly based on latent personality factors as well. But if so, I’m very curious as to how these things jibe with your desire to be heroically responsible at the same time. E.g., how do evil intentions predict your other actions and intentions regarding AI-risk and wanting to avert the destruction of the world?
It wasn’t just that, it was also based on thinking I had more control over other people than I realistically had. Probably it is partly latent personality factors. But a heroic responsibility mindset will tend to cause people to think other people’s actions are their fault if they could, potentially, have affected them through any sort of psychological manipulation (see also, Against Responsibility).
I think I thought I was working on AI risk but wasn’t taking heroic responsibility because I wasn’t owning the whole problem. People around me encouraged me to take on more responsibility and actually optimize on the world as a consequentialist agent. I subsequently felt very bad that I had taken on responsibilities for solving AI safety that I could not deliver on. I also felt bad that maybe because I wrote some blog posts online criticizing “rationalists” that that would lead to the destruction of the world and that would be my fault.
This is cool because what you’re saying has useful information pertinent to model updates regardless of how I choose to model your internal state.
Here’s why it’s really important:
You seem to have been motivated to classify your own intentions as “evil” at some point, based entirely on things that were not entirely under your own control.
That points to your social surroundings as having pressured you to come to that conclusion (I am not sure it is very likely that you would have come to that conclusion on your own, without any social pressure).
So that brings us to the next question: Is it more likely that you are evil, or rather, that your social surroundings were / are?
I think those are hard to separate. Bad social circumstances can make people act badly. There’s the “hurt people hurt people” truism and numerous examples of people being caused to act morally worse by their circumstances e.g. in war. I do think I have gone through extraordinary measures to understand the ways in which I act badly (often in response to social cues) and to act more intentionally well.
Yes, but the point is that we’re trying to determine if you are under “bad” social circumstances or not. Those circumstances will not be independent from other aspects of the social group, e.g. the ideology it espouses externally and things it tells its members internally.
What I’m trying to figure out is to what extent you came to believe you were “evil” on your own versus you were compelled to think that about yourself. You were and are compelled to think about ways in which you act “badly”—nearby or adjacent to a community that encourages its members to think about how to act “goodly.” It’s not a given, per se, that a community devoted explicitly to doing good in the world thinks that it should label actions as “bad” if they fall short of arbitrary standards. It could, rather, decide to label actions people take as “good” or “gooder” or “really really good” if it decides that most functional people are normally inclined to behave in ways that aren’t necessarily un-altruistic or harmful to other people.
I’m working on a theory of social-group-dynamics which posits that your situation is caused by “negative-selection groups” or “credential-groups” which are characterized by their tendency to label only their activities as actually successfully accomplishing whatever it is they claim to do—e.g., “rationality” or “effective altruism.” If it seems like the group’s ideology or behavior implies that non-membership is tantamount to either not caring about doing well or being incompetent in that regard, then it is a credential-group.
Credential-groups are bad social circumstances, and in a nutshell, they act badly by telling members who they know not to be intentionally causing harm that they are harmful or bad people (or mentally ill).
I agree with this, thanks for the feedback! Edited.