Matthew Barnett comments on Against most, but not all, AI risk analogies

Matthew Barnett 14 Jan 2024 23:41 UTC
5 points
−4
I agree with the broad message of what I interpret you to be saying, and I do agree there’s some value in analogies, as long as they are used carefully (as I conceded in the post). That said, I have some nitpicks with the way you frame the issue:
In particular,
- future AI is like aliens in some ways but very unlike aliens in other ways;
- future AI is like domesticated animals in some ways but very unlike them in other ways;
- future AI is like today’s LLMs in some ways but very unlike them in other ways;
etc. All these analogies can be helpful or misleading, depending on what is being argued.
I think it’s literally true that these analogies can be helpful or misleading depending on what’s being argued. However, my own personal experience of these things, based on my admittedly unreliable memory, is that when I use the “AIs will be like domesticated animals” analogy I generally get way more pushback, at least around spaces like LessWrong, than I think I’d get if I used the “AIs will be like aliens” analogy.
And this, I feel, is pretty irrational. The pushback isn’t necessarily irrational. Don’t misunderstand me: there are disanalogous elements here. It’s the selective pushback that I’m mainly complaining about. Are AIs really so similar to aliens — something we have literally no actual experience with — but aren’t similar to real physical objects that we are familiar with like LLMs and domesticated animals? For crying out loud, LLMs are already considered “AIs” by most people! How could they be a worse analogy for AI, across the board, than extraterrestrial beings that we have never come in contract with?
- Steven Byrnes 15 Jan 2024 1:52 UTC
  20 points
  16
  Parent
  Ideally, people invoke analogies in order to make a point. And then readers / listeners will argue about whether the point is valid or invalid, and (relatedly) whether the analogy is illuminating or misleading. I think it’s really bad to focus discussion on, and police, the analogy target, i.e. to treat certain targets as better or worse, in and of themselves, separate from the point that’s being made.
  For example, Nora was just comparing LLMs to mattresses. And I opened my favorite physics textbook to a random page and there was an prominent analogy between electromagnetic fields and shaking strings. And, heck, Shakespeare compared a woman to a summer’s day!
  So when you ask whether AIs are overall more similar to aliens, versus more similar to LLMs, then I reject the question! It’s off-topic. Overall, mattresses & LLMs are very different, and electric fields & strings are very different, and women & summer’s days are very different. But there’s absolutely nothing wrong with analogizing them!
  And if someone complained “um, excuse me, but I have to correct you here, actually, LLMs and mattresses are very different, you see, for example, you can sleep on mattresses but you can’t sleep on LLMs, and therefore, Nora, you should not be saying that LLMs are like mattresses”, then I would be very annoyed at that person, and I would think much less of him. (We’ve all talked to people like that, right?)
  …And I was correspondingly unhappy to see this post, because I imagine it backing up the annoying-person-who-is-totally-missing-the-point from the previous paragraph. I imagine him saying “You see? I told you! LLMs really are quite different from mattresses, and you shouldn’t analogize them. Check this out, here’s a 2000-word blog post backing me up.”
  Of course, people policing the target of analogies (separately from the point being made in context) is a thing that happens all the time, on all sides. I don’t like it, and I want it to stop, and I see this post as pushing things in the wrong direction. For example, this thread is an example where I was defending myself against analogy-target-policing last month. I stand by my analogizing as being appropriate and helpful in context. I’m happy to argue details if you’re interested—it’s a nuclear analogy :-P
  I can’t speak to your experience, but some of my reactions to your account are:
  - if people are policing your analogies between AI and domestic animals, the wrong response is to say that we should instead police analogies between AI and aliens; the right response is to say that analogy-target-policing is the wrong move and we should stop it altogether: we should not police the target of an analogy independently from the point being made in context.
  - I wonder if what you perceive to be analogy-target-policing is (at least sometimes) actually people just disagreeing with the point that you’re making, i.e. saying that the analogy is misleading in context
  - Yes lesswrong has some partisans who will downvote anything with insufficiently doomy vibes without thinking too hard about it, sorry, I’m not happy about that either :-P (And vice-versa to some extent on EAForum… or maybe EAF has unthinking partisans on both sides, not sure. And Twitter definitely has an infinite number of annoying unthinking partisans on both sides of every issue.)
  For crying out loud, LLMs are already considered “AIs” by most people!
  FWIW, you and me and everyone is normally trying to talk about “future AI that might pose an x-risk”, which everyone agrees does not yet exist. A different category is “AI that does not pose an x-risk”, and this is a very big tent, containing everything from Cyc and GOFAI to MuZero and (today’s) LLMs. So the fact that people call some algorithm X by the term “AI” doesn’t in and of itself imply that X is similar to “future AI that might pose an x-risk”, in any nontrivial way—it only implies that in the (trivial) ways that LLMs and MuZero and Cyc are all similar to each other (e.g. they all run on computers).
  Now, there is a hypothesis that “AI that might pose an x-risk” is especially similar to LLMs in particular—much more than it is similar to Cyc, or to MuZero. I believe that you put a lot of stock in that hypothesis. And that’s fine—it’s not a crazy hypothesis, even if I happen personally to doubt it. My main complaint is when people forget that it’s a hypothesis, but rather treat it as self-evident truth. (One variant of this is people who understand how LLMs work but don’t understand how MuZero or any other kind of ML works, and instead they just assume that everything in ML is pretty similar to LLMs. I am not accusing you of that.)
  What links here?
  - Steven Byrnes's comment on On the Dwarkesh/Chollet Podcast, and the cruxes of scaling to AGI by JWS 🔸 (EA Forum; 17 Jun 2024 3:16 UTC; 18 points)
  - Steven Byrnes's comment on Biorisk is an Unhelpful Analogy for AI Risk by Davidmanheim (6 May 2024 11:35 UTC; 3 points)
- Joe Collman 15 Jan 2024 1:15 UTC
  7 points
  2
  Parent
  For crying out loud, LLMs are already considered “AIs” by most people! How could they be a worse analogy for AI, across the board, than extraterrestrial beings that we have never come in contract with?
  By tending to lead to overconfidence.
  An aliens analogy is explicitly relying on [we have no idea what this will do]. It’s easy to imagine friendly aliens, just as it’s easy to imagine unfriendly ones, or entirely disinterested ones. The analogy is unlikely to lead to a highly specific, incorrect model.
  This is not true for LLMs. It’s easy to assume that particular patterns will continue to hold—e.g. that it’ll be reasonably safe to train systems with something like our current degree of understanding.
  To be clear, I’m not saying they’re worse in terms of information content: I’m saying they can be worse in the terms you’re using to object to analogies: “routinely conveying the false impression of a specific, credible model of AI”.
  I think it’s correct that we should be very wary of the use of analogies (though they’re likely unavoidable).
  However, the cases where we need to be the most wary are those that seem most naturally applicable—these are the cases that are most likely to lead to overconfidence. LLMs, [current NNs], or [current AI systems generally] are central examples here.
  On asymmetric pushback, I think you’re correct, but that you’ll tend to get an asymmetry everywhere between [bad argument for conclusion most people agree with] and [bad argument for conclusion most people disagree with].
  People have limited time. They’ll tend to put a higher value on critiquing invalid-in-their-opinion arguments when those lead to incorrect-in-their-opinion conclusions (at least unless they’re deeply involved in the discussion).
  There’s also an asymmetry in terms of consequences-of-mistakes here: if we think that AI will be catastrophic, and are wrong, this causes a delay, a large loss of value, and a small-but-significant increase in x-risk; if we think that AI will be non-catastrophic, and are wrong, we’re dead.
  Lack of pushback shouldn’t be taken as a strong indication that people agree with the argumentation used.
  Clearly this isn’t ideal.
  I do think it’s worth thinking about mechanisms to increase the quality of argument.
  E.g. I think the ability to emoji react to particular comment sections is helpful here—though I don’t think there’s one that’s great for [analogy seems misleading] as things stand. Perhaps there should be a [seems misleading] react?? (I don’t think “locally invalid” covers this)
  - Matthew Barnett 15 Jan 2024 1:59 UTC
    2 points
    −3
    Parent
    An aliens analogy is explicitly relying on [we have no idea what this will do]. It’s easy to imagine friendly aliens, just as it’s easy to imagine unfriendly ones, or entirely disinterested ones. The analogy is unlikely to lead to a highly specific, incorrect model.
    
    As a matter of fact I think the word “alien” often evokes a fairly specific caricature that is separate from “something that’s generically different and hard to predict”. But it’s obviously hard for me to prove what’s going on in people’s minds, so I’ll just say what tends to flashes in my mind when I think of aliens:
    
    Beings who have no shared history with us, as they were created under a completely separate evolutionary history
    
    A Hollywood stock image of an alien species that is bent on some goal, such as extermination (i.e. rapacious creatures who will stop at nothing to achieve something)
    
    A being that does not share our social and cultural concepts
    
    I think these things are often actually kind of being jammed into the analogy, intended or not, and the question of how much future AIs will share these properties is still an open question. I think we should not merely assume these things.
- Vladimir_Nesov 15 Jan 2024 10:07 UTC
  2 points
  1
  Parent
  
  Are AIs really so similar to aliens — something we have literally no actual experience with — but aren’t similar to real physical objects that we are familiar with like LLMs and domesticated animals?
  
  Being real or familiar has nothing to do with being similar to a given thing.