My own take (which might be heavily wrong, and I would appreciate any good counterargument):
There is a massive mismatch between the incentives of the field and the incentives of people who could do the field building. Everyone constantly says that they want textbooks and courses and distillation, but I think any researcher without a stable job is really incentivized against that.
Why? Basically because if you want funding for research and/or a research job in alignment, you need good research. I have never seen a single example of good field building deciding such a case over research, and so even if you get kudos, you don’t get anything in return for field building (in terms of being able to work on alignment). Worse, you probably spent a shit ton of time on it, which leaves you with less research to show for your grant or job application, and so significantly reduces your chances of getting funding/jobs to keep working on alignment.
This line of thinking has heavily limited my attempts to do anything to solve the problems of the field. Any time I have a promising idea, I assume it’s going to be a net negative on everything related to getting funding/jobs for me, and just see if I can get away with that cost. That has been my reasoning behind the alignment coffee time, mentoring people, and accepting some responsibilities. But I never maintain any of the bigger and more important projects and ideas, because doing so would torpedo my chances of ever doing alignment research again, even when I think that they are probably the most important things for the future of alignment in general.
Thinking about it some more, I believe the actual problem I’m pointing out is not that people can’t find initial funding (I said elsewhere and still believe it’s quite easy to get) but that we don’t really have a plan for long-term funding outside of funding labs. What I mean is that if you’re like me and have some funding for a year or two, either you’re looking for a job in one of the big labs or… you don’t really know. Maybe it’s possible to get funding every year for ten years? But that’s also very stressful, and it’s not clear whether it is possible.
The related problem is that funding projects is fundamentally a bad idea if there are only a few projects that require these skills but the skills take time and investment to get. Only thinking in terms of funding projects means that you expect someone to stop what they’re doing for a year, do that project, and then go back to what they were doing. Except they probably need at least a year of study before being good enough to do it, and these two years can cost them basically all their chances of doing anything in their previous career. (And maybe they just want to keep helping.)
A last framing I find both sad and hilarious is that any independent researcher who wants to do field building is basically on the driver’s side of Parfit’s hitchhiker: they can give a ride to the hitchhiker in the desert (the alignment community in need of community) if the hitchhiker pays them when they arrive in town, but it looks pretty clear from the driver’s seat that the hitchhiker’s incentives are against handing over the money at that point, because it would be wasteful. Hence why the alignment community is unable to get out of the desert.