A bit of clarification about EleutherAI’s stance: the compressed version of our argument is that a) for a variety of reasons, including the fact that our models are far behind the frontier, we believe that the AGI capabilities contribution of our release is very small, and b) we believe there’s a significant chance that any alignment research with a meaningful chance of generalizing to AGI requires access to large language models. I would say the best-case outcome of our work would be if research using our models resulted in novel alignment techniques that scale to superhuman LLM-based AGI.
Our full argument is pretty nuanced and it’s hard to do justice to it in a few sentences, so I recommend reading the alignment section of the recent NeoX 20B paper, which outlines some of these arguments (and especially which concrete directions in particular we’re interested in) in far more detail.
thanks for the clarification!
Thank you for giving more context about EleutherAI’s stance on acceleration and for linking to your newest paper.
I support the claim that your open model contributes to AI safety research, and I generally agree that it is an improvement for the alignment landscape. I can also understand why you are not detailing possible failure modes of releasing LLMs, as this would basically mean stating a bunch of infohazards.
But, at least for me, this opens up the question of how far previously closed models should be opened up for the sake of alignment research. If an aligned researcher can benefit from access, so could a non-aligned researcher, hence the “accidental acceleration.”