Given a choice between public funding and just not having funding, I’d take a lack of funding. The incentives/selection pressures which come with public funding, especially in a field with already-terrible feedback loops, spell doom for the field.
What do you think happens in a world where there is $100 billion in yearly alignment funding? How would they be making less progress? I want to note that even horrifically inefficient systems still produce more output than “uncorrupted” hobbyists—cancer research would produce far fewer results if it were done by 300 perfectly coordinated people, even if the 300 had zero ethical/legal restraints.
Let’s take cancer as an analogy for a moment. Suppose that, as a baseline, cancer research is basically-similar to other areas of medical research. Then, some politician comes along and declares “war on cancer”, and blindly pumps money into cancer research specifically. What happens? Well...
So even just from eyeballing that chart, it’s pretty plausible to me that if cancer funding dropped by a factor of 10, the net effect would be that clinical trial pass rates just return to comparable levels to other areas, and the actual benefits of all that research remain roughly-the-same.
… but that’s ignoring second-order effects.
Technical research fields have a “median researcher” problem: the memetic success of work in the field is not determined by the best researchers, but by the median researchers. Even if e.g. the best psychologists understand enough statistics to recognize crap studies, the median psychologist doesn’t (or at least didn’t 10 years ago), so we ended up with a field full of highly-memetically-successful crap studies which did not replicate (think Carol Dweck).
Back to cancer: if the large majority of the field is doing work which is predictably useless, then the field will develop standards for “success” which are totally decoupled from actual usefulness. (Note that the chart above doesn’t actually imply that most of the work done in the field is useless, let alone predictably useless; there’s not a one-to-one map between cancer research projects and clinical trials.) To a large extent, the new standards would be directly opposed to actual usefulness, in order to defend the entrenched researchers doing crap work—think Carol Dweck arguing that replication is a bad standard for psychology.
That’s the sort of thing I expect would happen if a government dumped $100B into alignment funding. There’d be a flood of people with nominally-alignment-related projects which are in fact basically useless for solving alignment; they would quickly balloon to 90+% of the field. With such people completely dominating the field, first memetic success and then grant money would mostly be apportioned by people whose standards for success are completely decoupled from actual usefulness for alignment. Insofar as anything useful got done, it would mostly be by people who figured out the real challenges of alignment for themselves, and had to basically hack the funding system in order to get money for their actually-useful work.
In the case of cancer, steady progress has been made over the years despite the mess; at the end of the day, clinical trials provide a good ground-truth signal for progress on cancer. Even if lots of shit is thrown at the wall, some of it sticks, and that’s useful. In alignment, one of the main frames for why the problem is hard is that we do not have a good ground-truth signal for whether we’re making progress. So all these problems would be much worse than usual, and it’s less likely to be the actually-useful shit which sticks to the metaphorical wall.
Many things here.

The issues you mention don’t seem tied to public versus private funding so much as to the size of funding plus an intrinsically difficult scientific question. I agree that at some point more funding doesn’t help. At the moment, that doesn’t seem to be the case in alignment. Indeed, alignment doesn’t even have as many researchers as a relatively small field like linguistics.
How well the funders understand the field, and can differentially target more-useful projects, is a key variable here. For public funding, the top-level decision maker is a politician; they will in the vast majority of cases have approximately-zero understanding themselves. They will either apportion funding on purely political grounds (e.g. pork-barrel spending), or defer to whoever the consensus “experts” are in the field (which is where the median researcher problem kicks in).
In alignment to date, the funders have generally been people who understand the problem themselves to at least enough extent to notice that it’s worth paying attention to (in a world where alignment concern wasn’t already mainstream), and can therefore differentially target useful work, rather than blindly spray money around.
Seems overstated. Universities support all kinds of very specialized long-term research that politicians don’t understand.
From my own observations and from talking with funders themselves, most funding decisions in AI safety are made on mostly superficial markers—grantmakers on the whole don’t dive deep on technical details. [In fact, I would argue that blindly spraying money around in a more egalitarian way (i.e. what SERI MATS has accomplished) is probably not much worse than the status quo.]
Academia isn’t perfect, but on the whole it gives a lot of bright people the time, space, and financial flexibility to pursue their own judgement. In fact, many alignment researchers have done a significant part of their work in an academic setting or while supported in some way by public funding.
At first, I predicted you were going to say that public funding would accelerate capabilities research over alignment but it seems like the gist of your argument is that lots of public funding would muddy the water and sharply reduce the average quality of alignment research.
That might be true for theoretical AI alignment research but I’d imagine it’s less of a problem for types of AI alignment research that have decent feedback loops like interpretability research and other kinds of empirical research like experiments on RL agents.
One reason that I’m skeptical is that there doesn’t seem to be a similar problem in the field of ML, which is huge, largely publicly funded (to the best of my knowledge), and still makes good progress. Possible reasons why the ML field is still effective despite its size include sufficient empirical feedback loops and the fact that top conferences reject most papers (~25% is a typical acceptance rate at NeurIPS).
Yeah, to be clear, acceleration of capabilities is a major reason why I expect public funding would be net negative, rather than just much closer to zero impact than naive multiplication would suggest.
Ignoring the capabilities issue, I think there’s lots of room for uncertainty about whether a big injection of “blind funding” would be net positive, for the reasons explained above. I think we should be pretty confident that the results would be an OOM or more less positive than the naive multiplication suggests, but that’s still not the same as “net negative”; the net positivity/negativity I see as much more uncertain (ignoring capabilities impact).
Accounting for capabilities impact, I think the net impact would be pretty robustly negative.
That might be true for theoretical AI alignment research but I’d imagine it’s less of a problem for types of AI alignment research that have decent feedback loops like interpretability research and other kinds of empirical research like experiments on RL agents.

The parts where the bad feedback loops are, are exactly the places where the things-which-might-actually-kill-us are. Things we can see coming are exactly the things which don’t particularly need research to stop, and the fact that we can see them is exactly what makes the feedback loops good. It is not an accident that the feedback loop problem is unusually severe for the field of alignment in particular.

(Which is not to say that e.g. interpretability research isn’t useful—we can often get great feedback loops on things which provide a useful foundation for the hard parts later on. The point is that, if the field as a whole streetlights on things with good feedback loops, it will end up ignoring the most dangerous things.)
This seems implausible. Almost all contributions to AI alignment (from any perspective) have been made by people with, implicitly or explicitly, outside funding—not by hobbyists doing alignment next to their day jobs.
I am not claiming that a hobbyist-only research community would outperform today’s community. I’m claiming that a hobbyist-only research community would outperform public (i.e. government) funding. Today’s situation is better than either of those two: we have funding which provides incentives which, for all their flaws, are far far better than the incentives which would come with government funding.
Roll-to-disbelieve. Can you name one kind of research that wouldn’t have counterfactually happened if alignment was publicly funded? Your own research seems like a good fit for academia for instance.
Can you name one kind of research that wouldn’t have counterfactually happened if alignment was publicly funded?
Wrong question.
The parable of the leprechaun is relevant here:
One day, a farmer managed to catch a leprechaun. As is usual for these tales, the leprechaun offered to show the farmer where the leprechaun’s gold was buried, in exchange for the leprechaun’s freedom. The farmer agreed. So the leprechaun led the farmer deep into the woods, eventually stopped at a tree, and said “my gold is buried under this tree”.
Unfortunately, the farmer had not thought to bring a shovel. So, the farmer tied a ribbon around the tree to mark the spot, and the leprechaun agreed not to remove it. Then the farmer returned home to fetch a shovel.
When the farmer returned, he found a ribbon tied around every tree in the forest. He never did find the gold.
This is the problem which plagues many academic fields (e.g. pre-crisis psychology is a now-clear example). It’s not mainly that good research goes unfunded, it’s that there’s so much crap that the good work is (a) hard to find, and (b) not differentially memetically successful.
A little crap research mostly doesn’t matter, so long as the competent researchers can still do their thing. But if the volume of crap reaches the point where competent researchers have trouble finding each other, or new people are mostly onboarded into crap research, or external decision-makers can’t defer to a random “expert” in the field without usually getting a bunch of crap, then all that crap research has important negative effects.
It’s a cute story, John, but do you have more than an anecdotal leprechaun?
I think the simplest model (so the one we should default to by Occam’s mighty Razor) is that whether good research will be done in a field is mostly tied to:

1. intrinsic features of research in the area (e.g. how much feedback from reality, how noisy that feedback is, political implications, and lots more I don’t care to name)
2. initial field-building driving who self-selects into the research field
3. the number of secure, funded research positions

The first is independent of funding source, and I don’t think we have much evidence that the second would be much worse under public funding as opposed to private funding.

In the absence of strong evidence, I humbly suggest we default to the simplest model, in which:

more money & more secure positions → more people working on the problem
The fact that France has a significantly larger number of effectively-tenured positions per capita than most other nations, entirely publicly funded, is almost surely one of the most important factors in its (continued) dominance in pure mathematics, as evidenced by its large share of Fields medals (13/66, versus 15/66 for the US). I observe in passing that your own research program is far more akin to academic math than to cancer research.
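For concreteness, here is a rough back-of-the-envelope version of the per-capita comparison; the medal counts are from the comment above, but the population figures are approximate assumptions of mine, so treat the ratio as rough.

```python
# Back-of-the-envelope per-capita version of the Fields-medal comparison.
# Medal counts come from the comment above; the populations are approximate
# present-day figures (my assumption), so the ratio is only indicative.
france_medals, us_medals = 13, 15
france_pop, us_pop = 68e6, 335e6

france_rate = france_medals / france_pop   # ~1.9e-07 medals per capita
us_rate = us_medals / us_pop               # ~4.5e-08 medals per capita
print(f"France/US per-capita ratio: {france_rate / us_rate:.1f}x")  # ~4.3x
```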
As for the position that you’d rather have no funding at all than public funding … well, let us be polite and call it … American.
(Probably not going to respond further here, but I wanted to note that this comment really hit the perfect amount of sarcasm and combativity for me personally; I enjoyed it.)
Alignment is almost exactly the opposite of abstract math?

Math has a good quality of being checkable—you can take a paper, follow all of its content, and become sure that the content is valid. An alignment research paper can have valid math but be inadequate on questions such as “is this math even related to reality?”, which are much harder to check.
That may be so.

Wentworth’s own work is closest to academic math/theoretical physics, perhaps to philosophy.
Are you claiming we have no way of telling good (alignment) research from bad?
And if we do, why would private funding be better at figuring this out than public funding?
To be somewhat more fair, there are probably thousands of problems with the property that they are much easier to check than they are to solve, and while alignment research is maybe not one of them, I do think there’s a general gap between verifying a solution and actually solving the problem.
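To make the verify-versus-solve gap concrete, here is a minimal sketch, assuming subset-sum as the stand-in problem (an illustrative choice of mine, not one named in the thread): checking a proposed solution is a one-line sum, while the obvious solver has to search exponentially many subsets.

```python
from itertools import combinations
from typing import List, Optional, Sequence

def verify(numbers: Sequence[int], target: int, candidate: Sequence[int]) -> bool:
    """Cheap direction: check that the candidate draws from the given multiset
    and sums to the target."""
    pool = list(numbers)
    for x in candidate:
        if x not in pool:
            return False
        pool.remove(x)
    return sum(candidate) == target

def solve(numbers: Sequence[int], target: int) -> Optional[List[int]]:
    """Expensive direction: brute-force search over all 2^n subsets."""
    for r in range(len(numbers) + 1):
        for subset in combinations(numbers, r):
            if sum(subset) == target:
                return list(subset)
    return None

nums = [3, 34, 4, 12, 5, 2]
answer = solve(nums, 9)                  # exponential-time search in the worst case
print(answer, verify(nums, 9, answer))   # [4, 5] True -- the check itself is trivial
```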
The canonical examples are NP problems.

Another interesting class is problems that are easy to generate but hard to verify.
John Wentworth told me the following delightfully simple example: generating a Turing machine program that halts is easy, while verifying that an arbitrary TM program halts is undecidable.

Yep, I was thinking about NP problems, though #P problems for the counting version would count as well.
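A toy sketch of the generate-versus-verify asymmetry in the Turing-machine example above (the miniature instruction set below is invented purely for illustration): producing a program that certainly halts is trivial, while a general checker can only ever report “halted within budget” or “unknown”.

```python
from typing import List, Tuple

# Toy machine: a program is a list of (opcode, argument) pairs acting on one register.
#   ("add", k) -> register += k, then move to the next instruction
#   ("jnz", j) -> jump to instruction j if the register is nonzero, else move on
# Execution halts when the instruction pointer runs off the end of the program.
Program = List[Tuple[str, int]]

def generate_halting_program() -> Program:
    """Easy direction: emit a program that obviously halts (straight-line, no jumps)."""
    return [("add", 1), ("add", 2)]

def check_halts(program: Program, step_budget: int = 10_000) -> str:
    """Best a general checker can do: run for a bounded number of steps.
    Returns "halts" if execution finishes within the budget, otherwise "unknown";
    no finite budget can turn "unknown" into a definite "runs forever"."""
    register, ip, steps = 0, 0, 0
    while ip < len(program):
        if steps >= step_budget:
            return "unknown"
        op, arg = program[ip]
        if op == "add":
            register += arg
            ip += 1
        elif op == "jnz":
            ip = arg if register != 0 else ip + 1
        steps += 1
    return "halts"

print(check_halts(generate_halting_program()))   # "halts"
print(check_halts([("add", 1), ("jnz", 0)]))     # "unknown" -- this one loops forever
```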
But if the volume of crap reaches the point where competent researchers have trouble finding each other, or new people are mostly onboarded into crap research, or external decision-makers can’t defer to a random “expert” in the field without usually getting a bunch of crap, then all that crap research has important negative effects
I agree with your observation. The problem is that many people are easily influenced by others, rather than critically evaluating whether the project they’re participating in is legitimate or if their work is safe to publish. It seems that most have lost the ability to listen to their own judgment and assess what is rational to do.