AI scares and changing public beliefs

The opportunity

I, for one, am looking forward to the next public AI scares.

There’s a curious up-side to the reckless deployment of powerful, complex systems: they scare people. This has the potential to shift the landscape of the public debate. It’s an opportunity the AGI safety community should be prepared for the next time it happens. I’m not saying that we should become fear-mongers, but that we need to engage wisely when we’re given those opportunities. We could easily fumble the ball badly, and we need to not do that.

We seem to be in an interesting wild-west era of AI deployment. I hope we leave this era before it’s too late. Before we do, I expect to see more scary behavior from AI systems. The early behavior of Bing Chat, documented in Bing Chat is blatantly, aggressively misaligned, caused a furor of media responses, and it seemed to result in some AI x-risk skeptics changing their opinions to take those dangers more seriously. I’d rather not concern myself with public opinion, and I’ve hoped that we could just leave this to the relative experts (the machine learning and safety communities), but it looks more and more like the public is going to want to weigh in. I’m hopeful that those changes in expert opinion can continue, and that they’ll trickle down into public opinion with enough fidelity to help the situation[1].

I’ll mention two particular cases. Gary Marcus said he’s shifting his opinions on the dangers in a recent podcast (paywalled) with Sam Harris and Stuart Russell. He maintained his skepticism about LLMs constituting real intelligence, but said he’d become more concerned about the dangers, particularly because of how OpenAI and Microsoft deployed rapidly, pressuring Google to respond with rushed deployments of its own. I think he and similar skeptics will shift further as LangChain, AutoGPT, and other automated chain-of-thought adaptations add goals and executive function to LLMs.

Another previous skeptic is Russ Roberts of EconTalk, who reaches a very different, older, less tech-savvy, and more conservative audience. He hosted Eric Hoel on a recent episode (not paywalled). Hoel does a very nice job of explaining the risks in a sensible, gentle way, and he focuses on x-risk rather than falling back on lesser near-term risks. Roberts appears to be pretty much won over, having been highly skeptical in past interviews on the topic. I haven’t attempted anything like a systematic review of public responses, but I’ve noted not just increased interest but increased credence among skeptics.

I think we’re going to see a shift in public opinion about AI development. That will in part be powered by new scares. Those scares also create opportunities for more people in the AGI safety community to engage with the public. We should think about those public presentations before the next scare. The AI safety community is full of highly intelligent, logical people, but on average, public relations is not our strength. It is time for those of us who want to engage the public to get better.

The challenge

We should work out our approach now, because we need to get this right. Presenting detailed arguments forcefully is not the way to shift public opinion toward your cause. The course of the climate debate is one important object lesson: as the evidence and expert opinion converged, half of the public actually became substantially more skeptical of human-caused climate change.

From: A Cooling Climate for Change? Party Polarization and the Politics of Global Warming, Deborah Lynn Guber, 2013 (sagepub.com)

My main argument is this: we need to not let the same polarization happen with AGI x-risk. And where it’s already happening, among intellectuals, we need to reverse it. This is a much larger topic than I’m prepared to cover comprehensively. My argument here is that this is something we should learn, debate, and practice.

The reasons for this polarizing shift are unclear. It’s probably not the backfire effect, which appears to happen only in some situations, perhaps mainly among the cognitively inclined, who internally generate new counterarguments. This is good news: presenting data and arguments can work.

I’ve done academic research on cognitive biases, so I feel like I have a pretty good idea what went wrong in that climate debate. Probably many fellow rationalists do too. I’d say that this paradoxical effect seems to result from motivated reasoning and confirmation bias, combined with social rewards. It is socially advantageous to share the important beliefs of those around us. Motivated reasoning pushes us subtly to think more in directions that are likely to lead to that sort of social reward. Confirming arguments and evidence both come to mind more easily, and fulfill that subtle drive to believe what will benefit us. And ugh fields drive us away from disconfirming evidence and arguments.

Approaches to convincing the public

It’s probably useful to have decent theories of the problem. But I don’t have a clear idea of how to be persuasive in the public sphere. I’ve recently looked a little at practical theories of changing beliefs in real-world settings, but only a little, because learning to be persuasive has always seemed dishonest. I think that’s largely an inappropriate emotional hangup, though, because the advice that rings true has nothing to do with lying or even being manipulative. It’s basically about having your audience like and trust you, and not being an asshole. It’s about showing respect and empathy for the person you’re talking to, not pressuring them to make a quick decision, and letting them go off to think it through themselves. These techniques probably don’t work if the truth isn’t on your side, but that seems fine.

I wanted to get my very vague and tentative conclusions out there, but I don’t really have good ideas about how to be persuasive. I want to do more research on that, holding to methods that preserve honesty, and I hope that more folks in this community will share their research and thoughts. We may not want to be in the business of public relations, but at least some of us sort of need to.

I do have ideas about how to fuck up persuading people and turn them against your beliefs. Some of these I’ve acquired the hard way, some through research. Accidentally implying you think your audience are idiots is one easy way to do it. That’s tough to avoid when you’re talking about something you’ve thought about a thousand times as much as the people you’re addressing. For one thing, they are idiots; we all are. Rationalism is an ideal, not an achievable goal, even for those of us who aspire to it. In addition, your audience are particularly idiots in the domain of AI safety, relative to you. But people intuitively pick up on a lack of respect, and they’re not likely to work out where it’s coming from.

Overstating your certainty on various points is one great way both to imply that your audience is an idiot and to give them an easy way to rationalize that you must really be the idiot: nobody could really be that certain. Communicating effectively under Knightian norms helps clarify how different ways of communicating probabilistic guesses can make one person’s rational best guess sound to another like sheer hubris. Eliezer may be rational in saying we’re doomed with 99% certainty under certain assumptions, but his audience isn’t going to listen and think carefully while he explains all of those many assumptions. They’ll just move on with their busy day and assume he’s crazy.

Another factor here is that not everyone makes a hobby of understanding complex new ideas quickly, and very few people belong to a social group where changing one’s mind in the face of arguments and evidence is highly valued. Rationalists do have a bit of an edge in this particular area, but we’re hardly immune to cognitive biases. Thinking hard about how motivated reasoning and confirmation bias are at play in various areas of your own mind is one way to develop empathy for someone who’s being asked to re-evaluate an important belief on the spot. (This deserves a post titled “I am an idiot”. We’ll see if I have the gumption to write it.)

I think there’s a big advantage here in that the public’s existing beliefs around AI x-risk often aren’t that strong, and don’t fall across existing tribal lines. The AGI safety community seems to have done a great job so far of not making enemies of the ML community, for instance. I think we need to carefully widen this particular circle, and use the time before the public’s beliefs solidify to persuade them, probably through honesty, humility, and empathy.

Approaches to convincing experts

I’m repeatedly amazed by how easy it is to convince laypeople that self-aware, agentic AI presents an x-risk we should worry about, and equally amazed at how difficult it is to convince experts in ML or cognitive science. And those experts are the people we most need to convince, because the public (rationally) takes their cues from the people they trust who are more expert than they are.

In discussions, I often hear experts deploy absolutely ridiculous arguments, or worse yet, no real argument beyond “that’s ridiculous.” I think of their motivation to defend their own field, or to protect their comfortable worldview. And I get frustrated, because that attitude might well get us all killed. This has often caused me to get more forceful, and to talk as though I think they’re an idiot. This has predictably terrible results.

I recently had a discussion with a close friend who works in ML, including having worked at DeepMind. I’d mostly avoided the topic with him until starting to work directly in AGI safety, because I knew it would be a dangerous friction point, and I highly value his friendship, since he’s intelligent, creative, generous, and kind. The first real exchange went almost as badly as expected. So I steeled myself to practice the virtues I mentioned above: listening, staying calm and empathetic, and presenting arguments gently instead of forcefully. We then managed a much better exchange that established a central crux, one I think is probably common between the average ML view and the rationalist view: timelines.

His timeline was around 30 years, and maybe never, while mine is much shorter. His opinion was coherent: AI x-risk is a potential problem, but it’s not worth talking about, because we’re so far from needing to solve it, and the solutions will become clearer as we get closer. Those going on about x-risk seem like a cult that has talked itself into something ridiculous and at odds with the actual experts, probably through the same social and cognitive biases that caused the climate-change polarization among conservatives. Right now we should be focusing on near-term AI risks, and the many other ills and dangers of the modern world.

I think this is a common crux of disagreement, but I’m sure it’s not the only one. AI scares are reducing that disagreement. We can take advantage of that situation, but only if we get better at persuasion. I intend to do more research and practice on this skill, and I hope some of you will join me. Our efforts to date are not encouraging, and look like they may produce polarization rather than steadily shifting opinions. But we can do better.

  1. ^

    How public opinion will affect outcomes is a complex discussion, and I’m not ready to offer even a guess, except for the background assumption I use here: the public is going to believe something about AI safety, and it’s likely better if they believe something like the truth.