AI safety research is receiving very little federal funding at this time, and is almost entirely privately funded, AFAIK. I agree with you that NSF funding leads to a field being perceived as more legitimate, which IMO is in fact one of the biggest benefits if we manage to get this through.
If you ask me, the AI safety community tends to overplay the perverse incentives in academia and underplay the value of having many, many more (on average) very intelligent people thinking about what is arguably one of the bigger problems of our time. Color me skeptical, but I don’t see any universe in which having AI safety research go mainstream is a bad thing.
Plausible universe in which AI safety research going mainstream now would be bad: the mainstream version of the field adopts the current dominant paradigms for understanding the problem, and network effects make new paradigms very slow to be introduced. In the counterfactual world, the non-mainstream version of the field develops a new paradigm a year later which, if introduced to academia immediately, would solve the problem two years faster than the old paradigm would after being introduced to academia.
This is a conceivable universe, but do you really think it’s likely? It seems to me much more likely that additional funding opportunities would help AI safety research move at least a little bit faster.
I don’t know enough about the dynamics in academia, or about the rate of progress in alignment, to be confident in my assessment. But I don’t think the chance of something like this happening is below 6%, so if people are introducing the field to mainstream academia, they should take precautions to minimize the chances of the effect I described causing significant slowdowns.
Two views that I have seen from the AI risk community on perverse incentives within academia are:
1. Incentives within the academic community are such that researchers are unable to pursue or publish work that is responsible, high-quality, and high-value. With few exceptions, anyone who attempts to do so will be outcompeted and miss out on funding, tenure, and social capital, which will eventually lead to their exit from academia.
2. Money, effort, and attention within the academic community are allocated by a process that is only loosely aligned with the goal of producing good research. Some researchers are able to avoid the trap and competently pursue valuable projects, but many will not, and a few will even go after projects that are harmful.
I think view 1 is overplayed. There may be fields or subfields like that, but in most fields I think there is room for the right kind of people to pursue high-value research while still succeeding in academia. I think view 2 is a pretty big deal, though. I’m not too worried about all the people who will get NSF grants for unimportant research, though I am a little concerned that a flood of papers that miss the point will undercut the goal of legitimizing AI risk research for policy impact.
What I’m more worried about is research that is actively harmful. For example, my understanding is that a substantial portion of gain-of-function research has been funded by the federal government. This strikes me as frighteningly analogous to the kind of work that we should be concerned about in AI risk. I think this was mostly NIH, not NSF, so maybe there are good reasons for thinking the NSF is less likely to support dangerous work? Is there a strategy or an already-in-place mechanism for preventing people from using NSF funds for high-risk work? Or maybe there’s an important difference in incentives here that I’m not seeing?
For what it’s worth, I’m mostly agnostic on this, with a slight lean toward NSF attention being bad. Many of the people I most admire for their ability to solve difficult problems are academics, and I’m excited about the prospect of getting more people like that working on these problems. I really don’t want to dismiss it unfairly. I find it pretty easy to imagine worlds in which this goes very badly, but I think the default outcome is probably that a bunch of money goes to pointless stuff, a smaller amount goes to very valuable work, the field grows and diversifies, and (assuming timelines are long enough) the overall result is a reduction in AI risk. But I’m not very confident of this, and the downsides seem much larger than the potential benefits.