Anish Tondwalkar comments on “Open Source AI” is a lie, but it doesn’t have to be

Anish Tondwalkar 16 May 2024 23:24 UTC
2 points
2
Thanks for writing this and for the diagram, which I think is clearer than anything I’ve seen that’s attempted to clarify this mess.

That said, the terminology remains fantastically confusing.

> Mistral’s models are also not open source, but in a slightly more nuanced manner. Instead of releasing all artifacts describing their models, Mistral licensed the model weights using the Apache 2.0 license, which meets the requirements for a license to be Open Source. Unfortunately, however, no other artifacts were released. As a result, Mistral’s models can be used as-is by anyone, but the transparency that should go hand-in-hand with Open Source is no longer present.

IIUC, this means that Mistral’s models are both Open Source (in the sense that the license is an Open Source license) and Open Source AI (in the sense that they adhere to the OSAID; the first question in the OSAID FAQ is “Why is the original training dataset not required?”), yet they are not classified as Open Source under the classification scheme used in the diagram. Perhaps we need a fifth term to make these distinctions clearer?
- jacobhaimes 23 May 2024 19:23 UTC
  1 point
  0
  Thanks for responding! I really appreciate engagement on this, and your input.
  I would disagree that Mistral’s models are considered Open Source under the current OSAID. Although the training data itself is not required for a model to be considered Open Source, a certain level of documentation is [source]. Mistral’s models do not meet these standards; if they did, however, I would happily place them in the Open Source column of the diagram (at least, if I were to update this article and/or make a new post).