First some background on me, then some thoughts.

I am an alignment researcher and I read LW and AF occasionally. I tend to focus more on reading academic papers than on the alignment blogosphere. I read LW and AF mostly to find links to academic papers I might otherwise overlook, and for the occasional long-form analysis blogpost that the writer(s) put several months into writing. I am not a rationalist.
What I am seeing on LW is that, numerically, many of the AI posts are from newcomers to the alignment field, or from people who are just thinking about getting in. This is perfectly fine, because they need some place to post and potentially get their questions answered. I do not think that the cause of alignment would be improved by moving all of these AI newcomer posts out of LW and onto AF.
So if there is a concern that high-quality long-form rationalist content is being drowned out by all the AI talk, I suggest you create an AF-like sub-forum dedicated to rationalist thought.
> The AF versions of posts are primarily meant to be a thing you can link to professionally without having to explain the context of a lot of weird, not-obviously-related topics that show up on LessWrong.
From where I am standing, professionally speaking, AF has plenty of way-too-weird AI alignment content on it. Any policy maker or card-carrying AI/ML researcher browsing AF will quickly conclude that it is a place where posters can venture far outside of their political or mainstream-science Overton windows, without ever being shouted down or even frowned upon by the rest of the posters. It is often the most up-voted and commented-on posts that are the least inside any Overton window. This is just a thing that has grown historically; there is definitely beauty and value in it, and it is definitely too late to change now. Too late also given that EY has now gone full prophet-of-doom.
What I am hearing is that some alignment newcomers who have spent a few months doing original research, and writing a paper on it, have trouble getting their post on their results promoted from LW to AF. This is a de-motivator which I feel limits the growth of the field. So I would not mind if the moderators of this site started using (and advertising that they are using) an automatic rule: if it is clear that a post publishes alignment research results that took months of honest effort to produce, any author request to promote it to AF will be almost automatically granted, no matter what the moderators think about the quality of the work inside.
> Too late also given that EY has now gone full prophet-of-doom.
I absolutely agree, at least here, and I’m not a fan of this. I think a large part of the problem is dubious assumptions combined with dubious solutions.
One good example is the FOOM assumption, to which MIRI assigns much higher probability mass than they should. The probability of FOOM in the first AI is more like 3%, not 60-90%.
Second, their solutions are not really what is necessary here. In my view, interpretability and making sure that deceptively aligned models never arise are of paramount importance. Crucially, this work will look far more empirical than past work.
That doesn’t mean we will make it, but it does mean we can probably deal with the problem.