And yet you immediately use an analogy to make your model of AI progress more intuitively digestible and convincing
I was upfront about my intention with my language in that section. Portraying me as contradicting myself is misleading, because I was deliberately being evocative in the section you critique rather than trying to present an argument. That was the whole point. The language you criticized appeared in a clearly marked, separate section of my post, in which I wrote:
Part of this is that I don’t share other people’s picture of what AIs will actually look like in the future. This is only a small part of my argument, because my main point is that we should use analogies much less frequently, rather than switch to different analogies that convey different pictures. But this difference in how I view the future still plays a significant role in my frustration with the use of AI risk analogies.
Maybe you think, for example, that the alien and animal analogies are great for reasons that I’m totally missing. But it’s still hard for me to see that. At least, let me compare my picture, and maybe you can see where I’m coming from.
I have now added this paragraph to the post to make my intention clearer:
“Again: The next section is not an argument. It is a deliberately evocative picture, to help compare my expectations of the future against the analogies I cited above. My main point in this post is that we should move away from a dependence on analogies, but if you need a “picture” of what I expect from AI, to compare it to your own, here is mine.”
More generally, my point is not that analogies should never be used, or that we should never try to use evocative language to describe things. My point is that we should be much more rigorous about when we present analogies, and use them much less frequently when presenting specific positions. When making a precise claim, we should generally try to reason through it using concrete evidence and models instead of relying heavily on analogies.
Just as you find plenty of reasons why my evocative picture is misleading, I think there are plenty of ways the “AIs are alien” picture is misleading. We should not be selective in our use of analogies; instead, we should try to reason through these things more carefully.
Fair enough. But in this case, what specifically are you proposing, then? Can you provide an example of an object-level argument for your model of AI risk that is simultaneously (1) entirely free of analogies and (2) sufficiently evocative, short, and legible that it can be used for effective messaging to people unfamiliar with the field (including the general public)?
When making a precise claim, we should generally try to reason through it using concrete evidence and models instead of relying heavily on analogies.
Because I’m pretty sure that as far as actual technical discussions and comprehensive arguments go, people are already doing that. Like, for every short-and-snappy Eliezer tweet about shoggoth actresses, there’s a text-wall-sized Eliezer tweet outlining his detailed mental model of misalignment.
Fair enough. But in this case, what specifically are you proposing, then?
In this post, I’m not proposing a detailed model. I hope in the near future I can provide such a detailed model. But I hope you’d agree that presenting an entire detailed model of the alignment problem shouldn’t be a prerequisite for making this narrow point about analogies. Such a model would definitely help, and I hope to provide something like it soon (time and other priorities permitting), but I’d still like to make my point about analogies as an isolated thesis regardless.
My counterpoint was meant to express skepticism that it is realistically possible for people to switch to non-analogy-based evocative public messaging. Inventing messages like this is a very tightly constrained optimization problem, potentially an over-constrained one, such that the set of satisfactory messages is empty. I think I’m considerably better at reframing games than most people, and I know I would struggle with this one.
I agree that you don’t necessarily need to accompany every criticism with a ready-made example of doing better. Simply pointing out stuff you think is going wrong is completely valid! But a ready-made example of doing better certainly strengthens your point: it’s an existence proof that you’re not demanding the impossible.
That’s why I jumped to that interpretation regarding your AI-Risk model in the post (I’d assumed that’s what you were doing), and that’s why I’m asking whether you could generate such a message now.
I hope in the near future I can provide such a detailed model
To be clear, I would be quite happy to see that! I’m always in the market for rhetorical innovations, and “succinct and evocative gears-level public-oriented messaging about AI Risk” would be a very powerful tool for the arsenal. But I’m a priori skeptical.