My point is that we should stop relying on analogies in the first place. Use detailed object-level arguments instead!
And yet you immediately use an analogy to make your model of AI progress more intuitively digestible and convincing:
I expect AIs will be born directly into our society, deliberately shaped by us, for the purpose of filling largely human-shaped holes in our world
That evokes the image of entities not unlike human children. The language following this line only reinforces that image, and thereby sneaks in an entire cluster of children-based associations. Of course the progress will be incremental! It’ll be like the change of human generations. And they will be “socially integrated with us”, so of course they won’t grow up to be alien and omnicidal! Just like our children don’t all grow up to be omnicidal. Plus, they...
… will be numerous and everywhere, interacting with us constantly, assisting us, working with us, and even providing friendship to hundreds of millions of people.
That sentence only sounds reassuring because the reader is primed with the model of AIs-as-children. Having lots of social-bonding time with your child, and having them interact with the community, is good for raising happy children who grow up how you want them to. The text already implicitly establishes that AIs are going to be just like human children. Thus, having lots of social-bonding time with AIs and integrating them into the community is going to lead to aligned AIs. QED.
Stripped of this analogizing, none of what this sentence says is a technical argument for why AIs will be safe or controllable or steerable. Nay, the opposite: if the paragraph I’m quoting from started by talking about incomprehensible alien intelligences with opaque goals tenuously inspired by a snapshot of the Internet containing lots of data on manipulating humans, the idea that they’d be “numerous” and “everywhere” and “interacting with us constantly” and “providing friendship” (something notably distinct from “being friends”, eh?) would have sounded starkly worrying.
The way the argument is shaped here is subtler than most cases of argument-by-analogy, in that you don’t literally say “AIs will be like human children”. But the association is very much invoked, and has a strong effect on your message.
And I would argue this is actually worse than if you came out and made a direct argument-by-analogy, because it might fool somebody into thinking you’re actually making an object-level technical argument. At least if the analogizing is direct and overt, someone can quickly see what your model is based on, and swiftly move onto picking at the ways in which the analogy may be invalid.
The alternative being demonstrated here is that we essentially have to have all the same debates, but through a secondary layer of metaphor, in which we pretend that these analogy-rooted arguments are actually Respectably Technical, meaning we're only allowed to refute them with (likely much more verbose and hard-to-parse) Respectably Technical counter-arguments.
And I think AI Risk debates are already as tedious as they need to be.
The broader point I’m making here is that, unless you can communicate purely via strict provable mathematical expressions, you ain’t getting rid of analogies.
I do very much agree that there are some issues with the way analogies are used in the AI-risk discourse. But I don’t think “minimize the use of analogies” is good advice. If anything, I think analogies improve the clarity and the bandwidth of communication, by letting people more easily understand each other’s positions and what reference classes others are drawing on when making their points.
You’re talking about sneaking-in assumptions – well, as I’d outlined above, analogies are actually relatively good about that. When you’re directly invoking an analogy, you come right out and say what assumptions you’re invoking!
And yet you immediately use an analogy to make your model of AI progress more intuitively digestible and convincing
I was upfront about my intention with my language in that section. Portraying me as contradicting myself is misleading because I was deliberately being evocative in the section you critique, rather than trying to present an argument. That was the whole point. The language you criticized was marked as a separate section in my post in which I wrote:
Part of this is that I don’t share other people’s picture about what AIs will actually look like in the future. This is only a small part of my argument, because my main point is that we should use analogies much less frequently, rather than switch to different analogies that convey different pictures. But this difference in how I view the future still plays a significant role in my frustration at the usage of AI risk analogies.
Maybe you think, for example, that the alien and animal analogies are great for reasons that I’m totally missing. But it’s still hard for me to see that. At least, let me compare my picture, and maybe you can see where I’m coming from.
I have now added this paragraph to the post to make my intention clearer:
“Again: The next section is not an argument. It is a deliberately evocative picture, to help compare my expectations of the future against the analogies I cited above. My main point in this post is that we should move away from a dependence on analogies, but if you need a “picture” of what I expect from AI, to compare it to your own, here is mine.”
More generally, my point is not that analogies should never be used, or that we should never try to use evocative language to describe things. My point is that we should be much more rigorous about when we’re presenting analogies, and use them much less frequently when presenting specific positions. When making a precise claim, we should generally try to reason through it using concrete evidence and models instead of relying heavily on analogies.
In the same way you find plenty of reasons why my evocative picture is misleading, I think there are plenty of ways the “AIs are alien” picture is misleading. We should not be selective about our use of analogies. Instead, we should try to reason through these things more carefully.
Fair enough. But in this case, what specifically are you proposing, then? Can you provide an example of the sort of object-level argument for your model of AI risk, that is simultaneously (1) entirely free of analogies and (2) is sufficiently evocative plus short plus legible, such that it can be used for effective messaging to people unfamiliar with the field (including the general public)?
When making a precise claim, we should generally try to reason through it using concrete evidence and models instead of relying heavily on analogies.
Because I’m pretty sure that as far as actual technical discussions and comprehensive arguments go, people are already doing that. Like, for every short-and-snappy Eliezer tweet about shoggoth actresses, there’s a text-wall-sized Eliezer tweet outlining his detailed mental model of misalignment.
Fair enough. But in this case, what specifically are you proposing, then?
In this post, I’m not proposing a detailed model. I hope in the near future I can provide such a detailed model. But I hope you’d agree that it shouldn’t be a requirement that, to make this narrow point about analogies, I should need to present an entire detailed model of the alignment problem. Of course, such a model would definitely help, and I hope I can provide something like it at some point soon (time and other priorities permitting), but I’d still like to separately make my point about analogies as an isolated thesis regardless.
My counter-point was meant to express skepticism that it is actually realistically possible for people to switch to non-analogy-based evocative public messaging. I think inventing messages like this is a very tightly constrained optimization problem, potentially an over-constrained one, such that the set of satisfactory messages is empty. I think I’m considerably better at reframing games than most people, and I know I would struggle with that.
I agree that you don’t necessarily need to accompany any criticism you make with a ready-made example of doing better. Simply pointing out stuff you think is going wrong is completely valid! But a ready-made example of doing better certainly greatly enhances your point: an existence proof that you’re not demanding the impossible.
That’s why I jumped at that interpretation regarding your AI-Risk model in the post (I’d assumed you were doing it), and that’s why I’m asking whether you could generate such a message now.
I hope in the near future I can provide such a detailed model
To be clear, I would be quite happy to see that! I’m always in the market for rhetorical innovations, and “succinct and evocative gears-level public-oriented messaging about AI Risk” would be a very powerful tool for the arsenal. But I’m a priori skeptical.
Analogies can be bad argumentation, while simultaneously being good illustration.
That the analogy is a good illustration is what has to be argued.
An analogy can be a good illustration of a concept that is entirely wrong.