One piece of evidence for ML people not understanding things is the low popularity of uParameterization. Since it was released there’s been every theoretical and empirical reason to use it, but most big projects (like Llama 2?) just don’t.
Never heard of it, and I can’t even find it with a Google search. What does the “U” stand for? Unitary?
He means ‘µ-Parametrization’ or ‘µP’ and is just being lazy by not spelling it out as ‘mu-parameterization’, the standard ASCII form.