Fair enough. But in this case, what specifically are you proposing, then? Can you provide an example of the sort of object-level argument for your model of AI risk that is simultaneously (1) entirely free of analogies and (2) sufficiently evocative, short, and legible, such that it can be used for effective messaging to people unfamiliar with the field (including the general public)?
> When making a precise claim, we should generally try to reason through it using concrete evidence and models instead of relying heavily on analogies.
Because I’m pretty sure that as far as actual technical discussions and comprehensive arguments go, people are already doing that. Like, for every short-and-snappy Eliezer tweet about shoggoth actresses, there’s a text-wall-sized Eliezer tweet outlining his detailed mental model of misalignment.
> Fair enough. But in this case, what specifically are you proposing, then?
In this post, I’m not proposing a detailed model. I hope in the near future I can provide such a detailed model. But I hope you’d agree that making this narrow point about analogies shouldn’t require presenting an entire detailed model of the alignment problem. Of course, such a model would definitely help, and I hope to provide something like it soon (time and other priorities permitting), but I’d still like to make my point about analogies as an isolated thesis regardless.
My counterpoint was meant to express skepticism that it is realistically possible for people to switch to non-analogy-based evocative public messaging. I think inventing messages like this is a very tightly constrained optimization problem, perhaps an over-constrained one in which the set of satisfactory messages is empty. I think I’m considerably better at reframing games than most people, and I know I would struggle with it.
I agree that you don’t necessarily need to accompany every criticism with a ready-made example of doing better. Simply pointing out what you think is going wrong is completely valid! But a ready-made example certainly strengthens your point: it’s an existence proof that you’re not demanding the impossible.
That’s why I jumped at that interpretation regarding your AI Risk model in the post (I’d assumed that was what you were doing), and that’s why I’m asking whether you could generate such a message now.
> I hope in the near future I can provide such a detailed model
To be clear, I would be quite happy to see that! I’m always in the market for rhetorical innovations, and “succinct and evocative gears-level public-oriented messaging about AI Risk” would be a very powerful tool for the arsenal. But I’m a priori skeptical.