How harmful are improvements in AI? + Poll
This post was written by Marius Hobbhahn and Tilman Räuker.
Disclaimer: We previously posted this piece on the EA Forum. We are posting it here because LW allows for polls, and we have incorporated additional feedback.
Over the last few years, we have encountered stances on AI development ranging from “all development in AI is bad” through “sometimes development can be justified” to “if it means control over AI technologies, development could be a necessary evil.” We think it is essential to be aware of these different perspectives on AI development and to be able to deduce where other researchers and organizations stand. This post looks at different considerations and trade-offs regarding AI development, as well as possible pitfalls.
You are welcome to add further reasons, trade-offs, etc.
TL;DR: We investigate different views on AI development and the considerations behind them. We could not identify one view as clearly superior. We think that interaction with non-aligned actors, e.g. AI researchers or companies, can make it necessary to apply strategies that increase acceleration.
Ozzie has recently posted 13 different stances on AGI. Our post tries to pick up some of his suggested “next steps.”
A bit of clarification about EleutherAI’s stance: the compressed version of our argument is that a) for a variety of reasons, including the fact that our models are far behind the frontier, we believe that the AGI capabilities contribution of our release is very small, and b) we believe that there’s a significant chance that alignment research that has any meaningful chance of generalizing to AGI requires access to large language models. I would say that the best case outcome of our work would be if research using our models results in some novel alignment techniques that scale to superhuman LLM-based AGI.
Our full argument is pretty nuanced and it’s hard to do justice to it in a few sentences, so I recommend reading the alignment section of the recent NeoX 20B paper, which outlines some of these arguments (and especially which concrete directions in particular we’re interested in) in far more detail.
thanks for the clarification!
Thank you for giving more context to EleutherAI’s stance on acceleration and linking to your newest paper.
I support the claim that your open model contributes to AI safety research, and I generally agree that it improves the alignment landscape. I can also understand why you are not detailing possible failure modes of releasing LLMs, as this would basically amount to stating a bunch of infohazards.
But at least for me, this opens up the question of up to what point previously closed models should be opened up for the sake of alignment research. If an aligned researcher can benefit from access, so could a non-aligned researcher, hence the “accidental acceleration.”
Power corrupts, so I don’t think view number 3, “Gaining control,” is likely to help.