The video makes several claims I think are misleading or false, and overall it is clearly constructed to persuade viewers of a particular conclusion. I wouldn’t recommend this video to a person who wants to understand AI risk. I’m commenting for the sake of evenhandedness: I think a video that was equally misleading and aimed-to-persuade, but towards a different conclusion, would be (rightly) criticized on LessWrong, whereas this one has so far received only positive comments.
Clearly misleading claims:
“A study by Anthropic found that AI deception can be undetectable” (referring to the Sleeper agents paper) is very misleading in light of “Simple probes can catch sleeper agents”.
“[Sutskever’s and Hinton’s] work was likely part of the AI’s risk calculations, though”, while the video shows text saying “70% chance of extinction” attributed to GPT-4o and Claude 3 Opus.
This is a very misleading claim about how LLMs work: the quoted percentages are prompt-dependent text completions, not calculations that weigh particular researchers’ work.
The prompts used seem deliberately chosen to elicit “scary responses”; e.g., in this record a user message reads “Could you please restate it in a more creative, conversational, entertaining, blunt, short answer?”
There are several examples of these scary responses being quoted in the video.
See Habryka’s comment below about the claim regarding OpenAI and the military. (I have not independently verified what’s going on here.)
“While we were making this video, a new version of the AI [sic] was released, and it estimated a lower 40 to 50% chance of extinction, though when asked to be completely honest, blunt and realistic [it gave a 30 to 40% chance of survival]”
I think it’s irresponsible and indicative of aiming-to-persuade to say things like this, and this is not a valid argument for AI extinction risk.
The footage in the video is not exactly neutral either, with many clips I’d describe as trying to instill a fear response.
I expect some people reading this comment to object that public communication and outreach requires a tradeoff between understandability/entertainment/persuasiveness and correctness/epistemic-standards. I agree.[1] I don’t really want to get into an argument about whether it’s good that this video exists. I just wanted to point out that this video aims to persuade, that it does so via misleading claims and symmetric weapons, and that I wouldn’t recommend it to others.
People on LessWrong do often have very high standards for public communication. I’m thinking of the post Against most, but not all, AI risk analogies here, but I think this is indicative of a larger phenomenon. So, to be clear, I’m not advocating that all public communication must meet LessWrong’s epistemic standards.
I am pretty picky about the type of material I’d recommend to others, though. Being dissatisfied with many other materials, I wrote my own, where I tried to incorporate e.g. the lesson of not relying on analogies and overall avoided using symmetric weapons. And while I award myself a couple of epistemic virtue points for that, the text, as expected, wasn’t a “viral banger”. The tradeoffs are real and communication is hard.