Agree with lots of this– a few misc thoughts [hastily written]:
I think the Overton Window frame ends up getting people to focus too much on the dimension “how radical is my ask”– in practice, things are usually much more complicated than this. In my opinion, a preferable frame is something like “who is my target audience and what might they find helpful.” If you’re talking to someone who makes it clear that they will not support X, it’s silly to keep on talking about X. But I think the “target audience first” approach ends up helping people reason in a more sophisticated way about what kinds of ideas are worth bringing up. As an example, in my experience so far, many policymakers are curious to learn more about intelligence explosion scenarios and misalignment scenarios (the more “radical” and “speculative” threat models).
I don’t think it’s clear that the more effective actors in DC tend to be those who look for small wins. Outside of the AIS community, there sure do seem to be a lot of successful organizations that take hard-line positions and (presumably) get a lot of their power/influence from the ideological purity that they possess & communicate. Whether or not these organizations end up having more or less influence than the more “centrist” groups is, in my view, not a settled question & probably varies a lot by domain. In AI safety in particular, I think my main claim is something like “pretty much no group– whether radical or centrist– has had tangible wins.” When I look at the small set of tangible wins, it seems like the groups involved were across the spectrum of “reasonableness.”
The more I interact with policymakers, the more I’m updating toward something like “poisoning the well doesn’t come from having radical beliefs– poisoning the well comes from lamer things like being dumb or uninformed, wasting people’s time, not understanding how the political process works, not having tangible things you want someone to do, explaining ideas poorly, being rude or disrespectful, etc.” I’ve asked ~20-40 policymakers (outside of the AIS bubble) things like “what sorts of things annoy you about meetings” or “what tends to make meetings feel like a waste of your time”, and no one ever says “people come in with ideas that are too radical.” The closest thing I’ve heard is people saying that they dislike it when groups fail to understand why things aren’t able to happen (like, someone comes in thinking their idea is great, but then they fail to understand that their idea needs approval from committee A and appropriations person B, and then they’re upset about why things are moving slowly). It seems to me like many policy folks (especially staffers and exec branch subject experts) are genuinely interested in learning more about the beliefs and worldviews that have been prematurely labeled as “radical” or “unreasonable” (or perhaps such labels were appropriate before ChatGPT but no longer are).
A reminder that those who are opposed to regulation have strong incentives to make it seem like basically-any-regulation is radical/unreasonable. An extremely common tactic is for industry and its allies to make common-sense regulation seem radical/crazy/authoritarian & argue that actually the people proposing strong policies are just making everyone look bad & argue that actually we should all rally behind [insert thing that isn’t a real policy]. (I admit this argument is a bit general, and indeed I’ve made it before, so I won’t harp on it here. Also, I don’t think this is what Trevor is doing– it is indeed possible to raise serious discussions about “poisoning the well” even if one believes that the cultural and economic incentives disproportionately elevate such points.)
In the context of AI safety, it seems to me like the most high-influence Overton Window moves have been positive– and in fact I would go as far as to say strongly positive. Examples that come to mind include the CAIS statement, FLI pause letter, Hinton leaving Google, Bengio’s writings/speeches about rogue AI & loss of control, Ian Hogarth’s piece about the race to god-like AI, and even Yudkowsky’s TIME article.
I think some of our judgments here depend on underlying threat models and an underlying sense of optimism vs. pessimism. If one thinks that labs making voluntary agreements/promises and NIST contributing to the development of voluntary standards are quite excellent ways to reduce AI risk, then the groups that have helped make this happen deserve a lot of credit. If one thinks that much more is needed to meaningfully reduce x-risk, then the groups that are raising awareness about the nature of the problem, making high-quality arguments about threat models, and advocating for stronger policies deserve a lot of credit.
I agree that more research on this could be useful. But I think it would be most valuable to focus less on “is X in the Overton Window” and more on “is X written/explained well and does it seem to have clear implications for the target stakeholders?”
Quick reactions:
Re: over-emphasis on “how radical is my ask” vs. “what might my target audience find helpful,” and generally the importance of making your case well regardless of how radical it is– that makes sense. Though notably, the more radical your proposal is (or the more unfamiliar your threat models are), the higher the bar for explaining it well, so these do seem related.
Re: more effective actors looking for small wins, I agree that it’s not clear, but yeah, seems like we are likely to get into some reference class tennis here. “A lot of successful organizations that take hard-line positions and (presumably) get a lot of their power/influence from the ideological purity that they possess & communicate”? Maybe, but I think of like, the agriculture lobby, who just sort of quietly make friends with everybody and keep getting 11-figure subsidies every year, in a way that (I think) resulted more from gradual ratcheting than making a huge ask. “Pretty much no group– whether radical or centrist– has had tangible wins” seems wrong in light of the EU AI Act (where I think both a “radical” FLI and a bunch of non-radical orgs were probably important) and the US executive order (I’m not sure which strategy is best credited there, but I think most people would have counted the policies contained within it as “minor asks” relative to licensing, pausing, etc). But yeah I agree that there are groups along the whole spectrum that probably deserve credit.
Re: poisoning the well, again, radical-ness and being dumb/uninformed are of course separable but the bar rises the more radical you get, in part because more radical policy asks strongly correlate with more complicated procedural asks; tweaking ECRA is both non-radical and procedurally simple, creating a new agency to license training runs is both outside the DC Overton Window and very procedurally complicated.
Re: incentives, I agree that this is a good thing to track, but like, “people who oppose X are incentivized to downplay the reasons to do X” is just a fully general counterargument. Unless you’re talking about financial conflicts of interest– but there are also financial incentives for orgs pursuing a “radical” strategy to downplay boring real-world constraints, as well as social incentives (e.g. on LessWrong, IMO) to downplay those same constraints, and cognitive biases against thinking your preferred strategy has big downsides.
I agree that the CAIS statement, Hinton leaving Google, and Bengio and Hogarth’s writing have been great. I think that these are all in a highly distinct category from proposing specific actors take specific radical actions (unless I’m misremembering the Hogarth piece). Yudkowsky’s TIME article, on the other hand, definitely counts as an Overton Window move, and I’m surprised that you think this has had net positive effects. I regularly hear “bombing datacenters” as an example of a clearly extreme policy idea, sometimes in a context that sounds like it maybe made the less-radical idea seem more reasonable, but sometimes as evidence that the “doomers” want to do crazy things and we shouldn’t listen to them, and often as evidence that they are at least socially clumsy, don’t understand how politics works, etc, which is related to the things you list as the stuff that actually poisons the well. (I’m confused about the sign of the FLI letter as we’ve discussed.)
I’m not sure optimism vs pessimism is a crux, except in very short, like, 3-year timelines. It’s true that optimists are more likely to value small wins, so I guess narrowly I agree that a ratchet strategy looks strictly better for optimists, but if you think big radical changes are needed, the question remains of whether you’re more likely to get there via asking for the radical change now or looking for smaller wins to build on over time. If there simply isn’t time to build on these wins, then yes, better to take a 2% shot at the policy that you actually think will work; but even in 5-year timelines I think you’re better positioned to get what you ultimately want by 2029 if you get a little bit of what you want in 2024 and 2026 (ideally while other groups also make clear cases for the threat models and develop the policy asks, etc.). Another piece this overlooks is the information and infrastructure built by the minor policy changes. A big part of the argument for the reporting requirements in the EO was that there is now going to be an office in the US government that is in the business of collecting critical information about frontier AI models and figuring out how to synthesize it to the rest of government, that has the legal authority to do this, and both the office and the legal authority can now be expanded rather than created, and there will now be lots of individuals who are experienced in dealing with this information in the government context, and it will seem natural that the government should know this information. I think if we had only been developing and advocating for ideal policy, this would not have happened (though I imagine that this is not in fact what you’re suggesting the community do!).
Unless you’re talking about financial conflicts of interest, but there are also financial incentives for orgs pursuing a “radical” strategy to downplay boring real-world constraints, as well as social incentives (e.g. on LessWrong IMO) to downplay these boring constraints and cognitive biases against thinking your preferred strategy has big downsides.
It’s not just that problem, though: they will also likely be biased toward thinking that their policy helps AI safety at all, and this is a point that sometimes gets forgotten.
But you’re correct that Akash’s argument is fully general.