oh man. so, I agree this is a thing, and up to this point, it’s great -
To address capabilities denial, we can draw on insights from the literature on science denial more broadly. Here are some specific lessons that may be helpful:
I’m really hesitant about the core suggestions in GPT-4’s version; they feel like bland RLHF’d PR suggestions. Dropping the intro into Claude+, [citation needed] added by me:
Focus on facts over values. Debates about AI capabilities are often polarized along ideological lines[citation needed], with different sides arguing for opposing visions of the future. While values and preferences are important to consider, we must ground these discussions in the facts about what AI can and cannot currently do. Present the evidence for AI capabilities in a clear, measured way.
Explain why capabilities matter. People may deny AI capabilities because they do not fully understand why they are concerning or how they relate to existential risks[citation needed]. We need to do a better job of articulating why capabilities like general reasoning, transfer learning, and unsupervised learning could be dangerous if misaligned. Help people connect the dots.
Address the root causes. Do not attack the individuals but instead focus on the root causes of their beliefs. Gently question assumptions, cognitive biases, and ideological commitments to open minds and build understanding. Show empathy for their perspectives while also stating the facts.
Watch your own biases. It is important to acknowledge our own biases and avoid overstating the case for AI capabilities. While we should avoid capabilities denial, we must also be realistic and precise in how we discuss AI. Do not contribute to hype and doomsday scenarios, which can be counterproductive.
Emphasize responsibility. Science denial is often motivated by a desire to absolve responsibility or avoid difficult problems. Highlight how accurately assessing AI capabilities is crucial to taking responsibility and solving issues like alignment proactively. We all must do our part to ensure a good future with AI.
Build consensus. Find common ground and shared concerns to bring people together, rather than emphasizing disagreements[citation needed]. A unified front will be needed to address the existential risks of advanced AI. While we may have different views on capabilities and timelines, we can work together toward a future of beneficial AI.
Conclusion
Capabilities denial poses a real threat to AI safety, as it can lead to a lack of awareness, inadequate preparation, and inability to proactively address risks. However, by applying lessons from other forms of science denial and communicating about AI with empathy and facts, we can work to counter capabilities denial and build a future where humans and AI systems thrive together. The time for action is now. We must get ahead of existential catastrophe, whatever capabilities current and future AI may hold.
I actually ran it a couple of times (which was hard to keep track of due to the current tech issues). There were more complex versions (like versions that went over analogies involving specific climate change organizations), but I liked this version better. “bland RLHF’d PR suggestions” are useful when the problem involves PR and humans.
I would probably have gone into more detail about the “call these people science deniers” thing. It frustrates me that the public thinks those denying capabilities are the experts on capabilities. But GPT-4’s suggestions are probably more actionable than mine. It also seemed to have a higher signal-to-noise ratio than something I would write.
Hmm. To clarify, I mean that the suggestions from GPT-4 feel low on substance about how to clarify while maintaining reputation, and are focused on PR instead.
I think capabilities denial is basically a PR problem. This is different from denying the importance of the alignment problem; people are peddling pseudo-scientific explanations about why the AIs “seem” capable.
By contrast, I think alignment is still fuzzy enough that there is no scientific consensus, so techniques for dealing with science denial are less applicable.
PR and communication are not the same thing. It seems to me to be a communication problem; maintaining a positive affect for a brand is not the goal, which it would need to be in order for the term “PR” to be appropriate. The difference between reputation and PR is that, if communicating well in order to better explain a situation also happens to reduce the positive affect for the folks doing the communicating, then that’s still a success; honesty and accurate affect must be the goal for a communication to be reputation-maintenance seeking.
This is really just scientific communication anyhow: the variable we want people to have more accurate models of is “what can AI do now, and what might it be able to do soon?”, not anything about any human’s intent or honor.