Also notable: NVIDIA trained a model half an order of magnitude larger (https://nv-adlr.github.io/MegatronLM?utm_campaign=NLP%20News&utm_medium=email&utm_source=Revue%20newsletter).