It doesn’t seem like a huge deal to depend on the existence of smaller LLMs: they’ll be cheap compared to the bigger one, and many LM series already include smaller models. Not transferring between sites seems like a problem for any kind of reconstruction-based metric, because different parts of the model really do hold differently important information.