Apparently a “Sydney” model existed at least as early as 17 Dec 2021.
Carlos Ramón Guevara
Karma: 3
Why do you think that GPT-3 has untied embeddings?
Still elementwise, so yeah