Do you have links handy?
Various discussion in this reddit thread: https://www.reddit.com/r/mlscaling/comments/trwkck/training_computeoptimal_large_language_models/
In particular this comment: https://www.reddit.com/r/mlscaling/comments/trwkck/comment/i2pc6bk/?utm_source=reddit&utm_medium=web2x&context=3
Dang, I’ve been missing out on juicy Gwern comments! I better follow them on reddit...