Llama 2 is not open source.
(a few days after this comment, here’s a concurring opinion from the Open Source Initiative—as close to authoritative as you can get)
(later again, here’s Yann LeCun testifying under oath: “so first of all Llama system was not made open source … we released it in a way that did not authorize commercial use, we kind of vetted the people who could download the model it was reserved to researchers and academics”)
While their custom license permits some commercial uses, it is not an OSI-approved license, and because it violates the Open Source Definition it never will be. Specifically, the Llama 2 license violates:
Source code. It’s a little ambiguous what this means for a trained model; I’d claim that an open model release should include the training code (yes) and dataset (no), along with sufficient instructions for others to reproduce the results. However, you could also argue that weights are in fact “the preferred form in which a programmer would modify the program”, so this is not an important objection.
No Discrimination Against Persons or Groups. See the ban on use by anyone who has, or is affiliated with anyone who has, more than 700M monthly active users. As a side note, Snapchat recently announced that they had 750M active users, so this looks pretty targeted at competing social media companies (including TikTok, Google, etc.). As a consequence, the Llama 2 license also violates OSD 7, Distribution of License: “the rights attached to the program must apply to all to whom the program is redistributed without the need for execution of an additional license by those parties.”
No Discrimination Against Fields of Endeavor. If you can’t use Llama 2 to—for example—train another model, it’s by definition not open source. Their entire acceptable use policy is included by reference and contains a wide variety of sometimes ambiguous restrictions.
So, why does this matter?
As an open-source maintainer and PSF Fellow, I have no objection to the existence of commercially licensed software. I use much of it, and have sold commercial licenses for software that I’ve written too. However, people—and especially megacorps—misrepresenting their closed-off projects as open source is an infuriating form of parasitism on a reputation painstakingly built over decades.
The restriction on model training makes Llama 2 much less useful for AI safety research, but it incurs just as much direct risk (roughly all via misuse, in my opinion) and acceleration risk as an open-source release would.
Using a custom license adds substantial legal risk for prospective commercial users, especially given the very broad restrictions imposed by the acceptable use policy. This reduces the economic upside enormously relative to standard open terms, and leaves Meta’s competitors particularly at risk of lawsuits if they attempt to use Llama 2.
To summarize: Meta gets a better cost/benefit tradeoff by using a custom, non-open-source license, especially if people incorrectly perceive it as open source; everyone else is worse off; and it seems to me like they’re deliberately misrepresenting what they’ve done for their own gain. This really, really annoys me.
When someone describes Llama 2 as “open source”, please correct them: Meta is offering a limited commercial license which discriminates against specific users and bans many valuable use-cases, including in alignment research.
Huh, that’s very useful context, thanks! Seems like pretty sad behaviour.
Thanks a lot for the context!
Out of curiosity, why does the model training restriction make it much less useful for safety research?
Example projects you’re not allowed to do, if they involve other model families:
using Llama 2 as part of an RLAIF setup, which you might want to do when investigating Constitutional AI or decomposition or faithfulness of chain-of-thought or many many other projects;
using Llama 2 in auto-interpretability schemes to e.g. label detected features in smaller models, if this will lead to improvements in non-Llama-2 models;
fine-tuning other or smaller models on synthetic data produced by Llama 2, which has some downsides but is a great way to check for signs of life of a proposed technique.
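To make the last bullet concrete, here’s a minimal sketch of the synthetic-data distillation workflow: a “teacher” model generates completions, and a smaller “student” is fine-tuned on them. The function names are hypothetical, and stub implementations stand in for real model calls — if the teacher were Llama 2 and the student any other model, this whole pipeline would fall under the license’s training restriction.

```python
# Hypothetical sketch of distillation via synthetic data. Stubs stand in
# for real model calls; in practice teacher_generate would query a large
# model (e.g. Llama 2) and finetune_student would run a training loop.

def teacher_generate(prompt: str) -> str:
    # Stand-in for sampling a completion from the teacher model.
    return f"{prompt} -> synthetic completion"

def build_dataset(prompts: list[str]) -> list[tuple[str, str]]:
    # Pair each prompt with the teacher's completion to form
    # (input, target) training examples for the student.
    return [(p, teacher_generate(p)) for p in prompts]

def finetune_student(dataset: list[tuple[str, str]]) -> dict:
    # Stand-in for fine-tuning a smaller model on the synthetic pairs;
    # here it just reports how many examples it would have trained on.
    return {"examples_seen": len(dataset)}

prompts = ["What is RLAIF?", "Explain chain-of-thought faithfulness."]
dataset = build_dataset(prompts)
stats = finetune_student(dataset)
print(stats["examples_seen"])  # 2
```

Even a cheap signs-of-life check like this, run with Llama 2 as the teacher, would arguably “improve another large language model” and so be prohibited.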
In many cases I expect that individuals will go ahead and do this anyway, much like the license of Llama 1 was flagrantly violated all over the place, but remember that it’s differentially risky for any organisation which Meta might like to legally harass.
Thanks, that makes sense! I did not fully realize that the phrase in the terms is really just “improve any other large language model”, which is indeed so vague/general that it could be interpreted to include almost any activity that would entail using Llama-2 in conjunction with other models.