I think the examples you give are valid, but there are several reasons why I see the situation as somewhat contingent, or otherwise less bleak, than you do:
1. Counterexamples: I think there are research agendas that are less pre-paradigmatic than the ones you’re focused on and that are significantly more (albeit not entirely) parallelizable. For example, mechanistic interpretability and scalable oversight both have multiple groups focused on them and have grown substantially over the last couple of years. I’m aware that we disagree about how valuable these directions are.
2. Survival of the fittest: Unfortunately, I think that in cases where an individual has been pursuing a research direction for many years and has tried but failed to get anyone else on board with it, there is some explanatory power to the hypothesis that the direction is not that productive. Note that I’m not claiming to have a strong view on any particular agenda, and there are of course other possibilities in any given case. On the flip side, I expect promising directions to gain momentum over time, even if only gradually, and I consider the counterexamples from point 1 to be instances of this effect.
3. Fixable coordination/deference failures: I think it would be a mistake for absolutely everyone to go off and try to develop their own alignment strategy from scratch, and it’s plausible that the group you’re focused on is erring too far in this direction. My own strategy has been to do my best to develop my own inside view (which I think is important for research prioritization and motivation, as well as from a group epistemics perspective), use this to narrow down my set of options to agendas I consider plausible, but be considerably more willing to defer when it comes to making a final call about which agenda to pursue.
4. Clarity from AI advances: If the risk from AI is real, then I expect the picture of it to become clearer over time as AI improves. As a consequence, it should become clearer to people which directions are worth pursuing, and theoretical approaches should evolve into practical ones that can be iterated on empirically. This should both cause the field to grow and lead to more parallelizable work. I think this is already happening, and even the public at large is picking up on the spookiness of current alignment failures (even though the discourse is unsurprisingly very muddled).