Thanks for the link. I’m reading through the Facebook thread, and I’ll come back here to discuss it after I finish.
There are already outlets that allow robust peer review, and the field is not well served by moving away from the current CS / ML dynamic of arXiv papers and presentations at conferences, which allow for more rapid iteration and collaboration / building on work than traditional journals—which are often a year or more out of date as of when they appear.
The only actual peer review I see for the type of research I’m talking about, by researchers knowledgeable in the subject, is from private gdocs, as mentioned for example by Rohin here. Although it’s better than nothing, it has the issue of being completely invisible to any reader without access to these gdocs. Maybe you could infer the “peer-reviewness” of a post/paper by who is thanked in it, but that seems ridiculously roundabout.
When something is published in the AF, it rarely gets any feedback as deep as a peer review or the comments in private gdocs. When something is published in an ML conference, I assume that most if not all reviews don’t really consider the broader safety and alignment questions, and focus on the short-term ML relevance. And there is some research that is not even possible to publish in big ML venues.
As for conference vs journal… I see what you mean, but I don’t think it’s really a big problem. In small subfields that actively use arXiv, papers are old news when the conference happens, so it’s not a problem if they also are when the journal publishes them. I also wonder how much faster we could get a journal to run if we actively tried to ease the process. I’m thinking for example of not giving two months to reviewers when they all do their review in the last week anyway. Lastly, you’re not proposing to make a conference, but if you were, I still think a conference would require much more work to organize.
However, if this were done, I would strongly suggest doing it as an arXiv overlay journal, rather than a traditional structure.
I hadn’t thought of overlay journals, that’s a great idea! It might actually make it feasible without a full-time administrator.
One key drawback you didn’t note is that allowing AI safety further insulation from mainstream AI work could further isolate it. It also likely makes it harder for AI-safety researchers to have mainstream academic careers, since narrow journals don’t help on most of the academic prestige metrics.
I agree that this is a risk, which is yet another reason to favor a journal. At least in Computer Science, the publication process is generally preprint → conference → journal. In that way, we can allow the submission of papers previously accepted at NeurIPS for example (maybe extended versions), which should mitigate the cost to academic careers. And if the journal curates enough great papers, it might end up decent enough on academic prestige metrics.
Two more minor disagreements. The first is about the claim that “If JAA existed, it would be a great place to send someone who wanted a general overview of the field.” I would disagree: in-field journals are rarely as good a source as textbooks or non-technical overviews.
Agreed. Yet as I say in my answer to Daniel below, I don’t think AI Alignment is mature enough and clear enough on what matters to write a satisfying textbook. Also, the state of the art is basically never in textbooks, and that’s the sort of overview I was talking about.
Second, the idea that a journal would provide deeper, more specific, and better review than Alignment forum discussions and current informal discussions seems farfetched given my experience publishing in journals that are specific to a narrow area, like Health security, compared to my experience getting feedback on AI safety ideas.
Hmm, if you compare to private discussions and gdocs, I mostly agree that the review would be as good or a little worse (although you might get reviews from researchers to whom you wouldn’t have sent your research). If it’s for the Alignment Forum, I definitely disagree that all the comments that you get here would be as useful as an actual peer review. The most useful feedback I saw here recently was this review of Alex Turner’s paper by John, and that was actually from a peer-review process on LW.
So my point is that a journal with open peer review might be a way to make private gdocs discussions accessible while ensuring most people (not only those in contact with other researchers) can get any feedback whatsoever.
Onto Daniel’s answer:
+1 to each of these. May I suggest, instead of creating a JAA, we create a textbook? Or maybe a “special compilation” book that simply aggregates stuff? Or maybe even an encyclopedia? It’s like a journal, except that it doesn’t prevent these things from being published in normal academic journals as well.
As I wrote above, I don’t think we’re at the point where a textbook is a viable (or even useful) endeavor. For the second point, journals are not really important for careers in computer science (with maybe some exceptions, but all the recruiting processes I know basically only care about the conferences and maybe about the existence of at least one journal paper). And as long as we actually accept extended versions of papers published at conferences, there should be no problem with doing both.
Thanks. FWIW I find my worries mostly addressed by your reply about computer science conferences being the source of academic prestige and thus not in conflict with JAA. I still think a textbook or encyclopedia would be great; I think the field is plenty advanced enough, and in general there isn’t enough distillation and compilation work being done.
My issue with a textbook comes more from the lack of consensus. Like, the fundamentals (what you would put in the first few chapters) for embedded agency are different from those for preference learning, different from those for inner alignment, different from those for agent incentives (to name only a handful of research directions). IMO, a textbook would either overlook big chunks of the field or look more like an enumeration of approaches than a unified resource.
IMO, a textbook would either overlook big chunks of the field or look more like an enumeration of approaches than a unified resource.
Textbooks that cover a number of different approaches without taking a position on which one is the best are pretty much the standard in many fields. (I recall struggling with it in some undergraduate psychology courses, as previous schooling didn’t prepare me for a textbook that would cover three mutually exclusive theories and present compelling evidence in favor of each. Before moving on and presenting three mutually exclusive theories about some other phenomenon on the very next page.)
Fair enough. I think my real issue with an AI Alignment textbook is that for me a textbook presents relatively foundational and well established ideas and theories (maybe multiple ones), whereas I feel that AI Alignment is basically only state-of-the-art exploration, and that we have very few things that should actually be put into a textbook right now.
But I could change my mind if you have an example of what should be included in such an AI Alignment textbook.
That doesn’t seem like a big problem to me. Just make a different textbook for each major approach, or a single textbook that talks about each of them in turn. I would love such a book, and would happily recommend it to people looking to learn more about the field.
Or, just go ahead and overlook big chunks of the field. As long as you are clear that this is what you are doing, the textbook will still be useful for those interested in the chunk it covers.
Thanks for your pushback! I’ll respond to both of you in this comment.
As I said in my answer to Kaj, the real problem I see is that I don’t think we have the necessary perspective to write a useful textbook. Textbooks basically never touch research from the last ten years, unless that research is really easy to interpret and present, which is not the case here.
I’m open to being proven wrong, though.
I think we do. I also think attempting to write a textbook would speed up the process of acquiring more perspective. Our goals, motivations, and constraints are very different from the goals and motivations of most textbook-writers, I think, so I don’t feel much pressure to defer to the collective judgment of other textbook-writers.