Thanks a lot for the context!
Out of curiosity, why does the model training restriction make it much less useful for safety research?
Example projects you’re not allowed to do, if they involve other model families:
using Llama 2 as part of an RLAIF setup, which you might want to do when investigating Constitutional AI, decomposition, faithfulness of chain-of-thought, or many other projects;
using Llama 2 in auto-interpretability schemes to e.g. label detected features in smaller models, if this will lead to improvements in non-Llama-2 models;
fine-tuning other or smaller models on synthetic data produced by Llama 2, which has some downsides but is a great way to check for signs of life of a proposed technique (a minimal sketch of this workflow follows the list).
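To make the third bullet concrete, here is a minimal sketch of that workflow, assuming the Hugging Face transformers library. Model names, prompts, and hyperparameters are all illustrative, and the Llama 2 checkpoint is gated, so running this would itself require accepting Meta's license:

```python
# Sketch: generate synthetic data with Llama 2, then fine-tune a smaller
# non-Llama model on it -- the step the license arguably forbids.
# Model names, prompts, and hyperparameters are illustrative only.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

device = "cuda" if torch.cuda.is_available() else "cpu"

# 1. The "teacher": Llama 2 produces the synthetic data.
teacher_tok = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")
teacher = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf", torch_dtype=torch.float16
).to(device)

prompts = ["Explain why the sky is blue.", "Summarise the plot of Hamlet."]
synthetic_texts = []
for prompt in prompts:
    inputs = teacher_tok(prompt, return_tensors="pt").to(device)
    output = teacher.generate(**inputs, max_new_tokens=128, do_sample=True, top_p=0.9)
    synthetic_texts.append(teacher_tok.decode(output[0], skip_special_tokens=True))

# 2. The "student": a smaller, non-Llama model fine-tuned on that data.
student_tok = AutoTokenizer.from_pretrained("gpt2")
student_tok.pad_token = student_tok.eos_token
student = AutoModelForCausalLM.from_pretrained("gpt2").to(device)
optimizer = torch.optim.AdamW(student.parameters(), lr=5e-5)

student.train()
for text in synthetic_texts:
    batch = student_tok(text, return_tensors="pt", truncation=True, max_length=512).to(device)
    loss = student(**batch, labels=batch["input_ids"]).loss  # standard causal LM loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```

The license question bites at step 2: the student here is not a Llama 2 derivative, so training it on Llama 2 outputs looks exactly like using Llama 2 to "improve any other large language model".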
In many cases I expect that individuals will go ahead and do this anyway, much as Llama 1's license was flagrantly violated all over the place, but remember that it's differentially risky for any organisation that Meta might like to legally harass.
Thanks, that makes sense! I did not fully realize that the phrase in the terms is really just “improve any other large language model”, which is indeed so vague and general that it could be interpreted to include almost any activity that would entail using Llama 2 in conjunction with other models.