I like what you are saying above, but I also think there is a deeper story about paradigms and EA that you are not yet touching on.
I am an alignment researcher, but not an EA. I read quite broadly about alignment research, and in particular I read beyond the filter bubble of EA and this forum. What I notice is that many authors, both inside and outside of EA, observe that the field needs more research and more fresh ideas.
However, the claim that the field as a whole is ‘pre-paradigmatic’ is a framing that I see only on the EA and Rationalist side.
To make this more specific: I encounter this we-are-all-pre-paradigmatic narrative almost exclusively on the LW/AF forums and on the EA forum (I only dip into the EA forum occasionally, as I am not an EA). I also see this narrative in EA-created research agendas and introductory courses, for example in the AGI safety fundamentals curriculum.
My working thesis is that talk about being pre-paradigmatic tells us more about the fundamental nature of EA than it tells us about the fundamental nature of the AI alignment problem.
There are in fact many post-paradigmatic posts about AI alignment on this forum; I wrote some of them myself. By post-paradigmatic I mean posts where the authors select some paradigm and then use it to design an actual AGI alignment mechanism. These results-based-on-a-paradigm posts are seldom massively upvoted. Massive upvoting does, however, happen for posts which are all about being pre-paradigmatic, or about taking the first tentative steps with a new paradigm. I feel that this tells us more about the nature of EA and Rationalism as movements than it tells us about the nature of the alignment problem.
Several EA funding managers are on record as wanting to fund pre-paradigmatic research. The danger of this, of course, is that it creates a strong incentive for EA-funded alignment researchers never to become post-paradigmatic.
I believe this pre-paradigmatic stance also couples to the reluctance among many EAs to ever think about politics, to make actual policy proposals, or to investigate what it would take to get a policy proposal accepted.
There is an extreme type of pre-paradigmatic stance, which I also encounter on this forum. In this extreme stance, you not only want more paradigms, but you also reject all already-existing paradigms as being fundamentally flawed, as not even close to being able to capture any truth. This rejection implies that you do not need to examine any of the policy proposals that might flow out of any existing paradigmatic research. Which is convenient if you want to avoid thinking about policy. It also means you do not need to read other people’s research. Which can be convenient too.
If EA were to become post-paradigmatic, and then start to consider making actual policy proposals, this might split the community along various political fault lines, and it might upset many potential wealthy donors to boot. If you care about the size and funding level of the community, it is very convenient to remain in a pre-paradigmatic state, and to have people tell you that it is rational to be in that state.
I am not saying that EA is doomed to be ineffective. But I do feel that any alignment researcher who wants to be effective needs to be aware of the above forces that push them away from becoming paradigmatic, so that they can overcome these forces.
A few years back, I saw less talk on this forum about everybody being in a pre-paradigmatic state, and I was feeling a vibe that was more encouraging to anybody who had a new idea. It may have been just me feeling that different vibe, though.
Based on my working thesis above, there is a deeper story about EA and paradigms to be researched and written, but it probably needs an EA to write it.
Honest confession: often when I get stuck doing actual paradigmatic AI alignment research, I feel an impulse to research and write well-researched meta-stories about the state of the alignment field. At the same time, I feel that there is already an over-investment in people writing meta-stories, especially now that we have books like The Alignment Problem. So I usually manage to suppress that impulse, sometimes by posting less fully researched meta-comments like this one.
I am interested in this criticism, particularly in connection to misconception 1 from Holden’s ‘Important, actionable research questions for the most important century’, which to me suggests doing less paradigmatic research (by which I mean ‘what “normal science” looks like in ML research/industry’ in the Structure of Scientific Revolutions sense; do say if I misinterpret ‘paradigm’).
I think this division would benefit from some examples, however. To what extent do you agree with a quick classification of mine?
Paradigmatic alignment research
1) Interpretability of neural nets (e.g. colah’s vision and transformer circuits)
2) Dealing with dataset bias and generalisation in ML

Pre-paradigmatic alignment research
1) Agentic foundations and things MIRI work on
2) Proposals for alignment put forward by Paul Christiano, e.g. Iterated Amplification
My concern is that while the list two problems are more fuzzy and less well-defined, they are far less directly if at all (in 2) actually working on the problem we actually care about.
First, a remark on Holden’s writeup. I wrote above that several EA funding managers are on record as wanting to fund pre-paradigmatic research. From his writeup, I am not entirely sure whether Holden is one of them: the word ‘paradigmatic’ does not appear in it. But it is definitely clear that Holden is not very happy with the current paradigm of AI research, in the Kuhnian sense where a paradigm is more than just a scientific method, but a whole value system supported by a dominant tribe.
To quote a bit of Wikipedia:
Kuhn acknowledges having used the term “paradigm” in two different meanings. In the first one, “paradigm” designates what the members of a certain scientific community have in common, that is to say, the whole of techniques, patents and values shared by the members of the community. In the second sense, the paradigm is a single element of a whole, say for instance Newton’s Principia, which, acting as a common model or an example… stands for the explicit rules and thus defines a coherent tradition of investigation.
Now, Holden writes under misconception 1:
I think there are very few people making focused attempts at progress on the below questions. Many institutions that are widely believed to be interested in these questions [of AI alignment in the EA/longtermist sense] have constraints, cultures and/or styles that I think make it impractical to tackle the most important versions of them [...]
Holden here expresses worry about a lack of incentives to tackle the right questions, not about these institutions lacking even the right scientific tools to make any progress if they wanted to. So Holden’s concern here is somewhat orthogonal to the ‘pre-paradigmatic’ narratives associated with MIRI and John Wentworth, which hold that these institutions are not even using the right tools.
That being said, Holden has written a lot. I am only commenting here about a tiny part of one single post.
On your examples of Paradigmatic alignment research vs. Pre-paradigmatic alignment research: I agree that your paradigmatic examples are paradigmatic, because they have the strength of the tribe of ML researchers behind them. (A few years back, dataset bias was still considered a somewhat strange and career-killing topic to work on if you wanted to be an ML researcher, but this has changed, judging by the most recent NeurIPS conference.)
The pre-paradigmatic examples you mention do not have the ML research tribe behind them, but in a Kuhnian sense they are in fact paradigmatic inside the EA/rationalist tribe. So I might still call them paradigmatic, just in a different tribe.
My concern is that while the list two problems are more fuzzy and less well-defined, they are far less directly if at all (in 2) actually working on the problem we actually care about.
I am confused here: did you mean to write ‘first two problems’ above? I can’t really decode your concern.