I would be up for having a dialogue with Nate. Quintin, myself, and the others in the Optimist community are working on posts which will more directly critique the arguments for pessimism.
I am appreciative of folks like yourself, Nora, and Quintin building detailed models of the alignment problem and presenting thoughtful counterarguments to existing arguments about the difficulty. I think anyone would consider it a worthwhile endeavor regardless of their perspective on how hard the problem is, and I wish you good luck in your efforts to do so.
In my culture, people understand and respect that humans can easily trick themselves into making terrible collective decisions because of tribal dynamics. They respond to this in many ways, such as by working to avoid making it a primary part of people’s work or of people’s attention, and also by making sure to not accidentally trigger tribal dynamics by inventing tribal distinctions that didn’t formerly exist but get picked up by the brain and thunk into being part of our shared mapmaking [edit: and also by keeping their identity small]. It is generally considered healthy to spend most of our attention on understanding the world, solving problems, and sharing arguments, rather than making political evaluations about which group one is a member of. People are also extra hesitant about creating groups that exist fundamentally in opposition to other groups.
My current belief is that the vast majority of the people who have thought about the impacts and alignment of advanced AI (academics like Geoffrey Hinton, forecasters like Phil Tetlock, rationalists like Scott Garrabrant, and so forth) don’t think of themselves as participating in ‘optimist’ or ‘pessimist’ communities, and would not use the term to describe their community. So my sense is that this is a false description of the world. I have a spidey-sense that language like this often tries to make itself become true by saying it is true, and is good at getting itself into people’s monkey brains and inventing tribal lines between friends where formerly there were none.
I think that the existing so-called communities (e.g. “Effective Altruism” or “Rationality” or “Academia”) are each in their own ways bereft of some essential qualities for functioning and ethical people and projects. This does not mean that if you or I create new ones quickly they will be good or even better. I do ask that you take care to not recklessly invent new tribes that have even worse characteristics than those that already exist.
From my culture to yours, I would like to make a request that you exercise restraint on the dimension of reifying tribal distinctions that did not formerly exist. It is possible that there are two natural tribes here that will exist in healthy opposition to one another, but personally I doubt it, and I hope you will take time to genuinely consider the costs of greater tribalism.
I agree that it would be terrible for people to form tribal identities around “optimism” or “pessimism” (and have criticized Belrose and Pope’s “AI optimism” brand name on those grounds). However, when you say
don’t think of themselves as participating in ‘optimist’ or ‘pessimist’ communities, and would not use the term to describe their community. So my sense is that this is a false description of the world
I think you’re playing dumb. Descriptively, the existing “EA”/”rationalist” so-called communities are pessimistic. That’s what the “AI optimists” brand is a reaction to! We shouldn’t reify pessimism as an identity (because it’s supposed to be a reflection of reality that responds to the evidence), but we also shouldn’t imagine that declining to reify a description as a tribal identity makes it “a false description of the world”.
I think the words “optimism” and “pessimism” are really confusing, because they conflate the probability, utility, and steam of things:
Probability: you can be “optimistic” because you believe a good event is likely (or a bad one unlikely).
Utility: you can be optimistic because you believe a future event (maybe even an unlikely one) would be good.
Steam: you can be optimistic because you have a plan, idea, or stance for which you have a high recursive self-trust/reflectively stable prediction that you will engage in it.
So you could be “pessimistic” even while believing extinction due to AI is unlikely (say, <1%), because you find it super bad and you currently don’t have anything concrete that you can latch onto to decrease it.
Or (in the case of e.g. MIRI) you might have (“indefinitely optimistic”?) steam for reducing AI risk, find it moderately to extremely likely, and think it’s going to be super bad.
Or you might think that extinction would be super bad, believe it’s unlikely (as Belrose and Pope do), and have steam for both AI and AI alignment.
But the terms are apparently confusing to many people, and I think using this terminology can “leak” optimism or pessimism from one category into another, which can lead to worse decisions and incorrect beliefs.
It’s correct that there’s a distinction between whether people identify as pessimists and whether they are pessimistic in their outlook. I think the claim that they identify that way is false, and I actually also think the claim that they are pessimistic in outlook is false, though I am less confident in the latter.
Interview with Rohin Shah in Dec ’19:
Rohin reported an unusually large (90%) chance that AI systems will be safe without additional intervention. His optimism was largely based on his belief that AI development will be relatively gradual and AI researchers will correct safety issues that come up.
Paul Christiano in Dec ’22:
...without AI alignment, AI systems are reasonably likely to cause an irreversible catastrophe like human extinction. I think most people can agree that this would be bad, though there’s a lot of reasonable debate about whether it’s likely. I believe the total risk is around 10–20%, which is high enough to obsess over.
Scott Alexander, in Why I Am Not (As Much Of) A Doomer (As Some People) in March ’23:
I go back and forth more than I can really justify, but if you force me to give an estimate it’s probably around 33%; I think it’s very plausible that we die, but more likely that we survive (at least for a little while).
John Wentworth in Dec ’21 (also see his to-me-inspiring stump speech from a month later):
Step 1: sort out our fundamental confusions about agency
Step 2: ambitious value learning (i.e. build an AI which correctly learns human values and optimizes for them)
Step 3: …
Step 4: profit!
… and do all that before AGI kills us all.
That sounds… awfully optimistic. Do you actually think that’s viable?
Better than a 50⁄50 chance of working in time.
Davidad also feels to me like an optimist about the world — someone who is excited about solving the problems and finding ways to win, and is excited about other people and ready to back major projects to set things on a good course. I don’t know his probability of an AI takeover, but I stand by my sense that he doesn’t seem pessimistic in personality.
On occasion when talking to researchers, I meet someone who is optimistic that their research path will actually work. I won’t name names, but I recently spoke with a long-time researcher who believes that they have a major breakthrough and will be able to solve alignment. I think researchers can trick themselves into thinking they have a breakthrough when they don’t, and this field is unusually lacking in feedback, so I’m not saying I straightforwardly buy their claims, but I think it’s inaccurate to describe them all as pessimistic.
A few related thoughts:
One story we could tell is that the thing these people have in common is that they take alignment seriously, not that they are generally pessimists.
I think alignment is unsolved in the general case, which makes it harder to strongly argue that it will get solved for future systems; but I don’t buy that people would fail to update on seeing a solution or strong arguments for that conclusion, and I think that some of Quintin’s and Nora’s arguments have caused people I know to rethink their positions and update some in that direction.
I think the rationalist and EA spaces have been healthy enough for people to express quite extreme positions of expecting an AI-takeover-slash-extinction. I think it would be a strongly negative sign for everyone in these spaces to have identical views or for everyone to give up all hope on civilization’s prospects; but in the absence of that I think it’s a sign of health that people are able to be open about having very strong views. I also think the people who most confidently anticipate an AI takeover sometimes feel and express hope.
I don’t think everyone is starting with pessimism as their bottom line, and I think it’s inaccurate to describe the majority of people in these ecosystems as temperamentally pessimistic or epistemically pessimistic.
I think there are at least two definitions of optimistic/pessimistic that are often conflated:
Epistemic: an optimist is someone who thinks doom is unlikely, a pessimist someone who thinks doom is likely
Dispositional: an optimist is someone who is hopeful and glass-half-full, a pessimist is someone who is despondent and fatalistic
Certainly these are correlated to some extent: if you believe there’s a high chance of everyone dying, that is probably not great for your mental health. And people who are depressed are probably more likely to have negatively distorted epistemics. This would explain why it’s tempting to use the same term to refer to both.
However, I think using the same term to refer to both leads to some problems:
Being cheerful and hopeful is generally a good trait to have. However, this often bleeds into also believing that it is desirable to hold the epistemic belief that doom is unlikely, rather than trying to figure out whether doom is actually likely.
Because “optimism” feels morally superior to “pessimism” (due to the dispositional definition), using the terms as markers of tribal affiliation, even in the epistemic sense, inevitably causes tension.
I personally strive to be someone with an optimistic disposition and also to try my best to have my beliefs track the truth. I also try my best to notice and avoid the tribal pressures.
I think Nora is reacting to the tribal line strategy in use by AI nihilists (e/acc). I also think your comment could be a third of its length without losing any meaning, and would be clearer for it.
I think that short, critical comments can sometimes read as snarky/rude, and I don’t want to speak that way to Nora. I also wanted to take some space to lay out the general approach to thinking about tribalism and show how I was applying it here, so that my point doesn’t read as only an argument against this particular tribal line that Nora is reifying, but as an encouragement of restraint in general. You’re probably right that I could make it substantially shorter; writing concisely is a skill I want to work on.
I don’t know who the “AI nihilists” are supposed to be. My sense is that you could’ve figured out, from my comment objecting to playing fast and loose with group names, that I wouldn’t think that phrase carved reality and that I wasn’t sure who you had in mind!
The nihilists would be folks who don’t even care to try to align AI because they don’t value humans. e/accs, in other words. I’m just being descriptive.
Feedback: I had formed a guess as to who you meant to which I assigned >50% probability, and my guess was incorrect.
I’d be happy to have a dialogue with you too (I think my view is maybe not so different from Nate’s?)