To be honest, I have only glanced at the abstracts of those papers, and my linking them shouldn't be taken as an endorsement. On priors I would be somewhat surprised if something like 'g' didn't exist for LLMs (it falls naturally out of scaling laws, after all), but you make a good point about correlations among finetuned submodels. In particular, the degree of correlation, or 'variance explained by g', doesn't seem like a sturdy metric to boast about, since it will depend heavily on the particular set of models and evaluations used.
I feel like they should have excluded different finetunes of the same base model, since including them surely pushes up the correlations.
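For what it's worth, here's a quick synthetic sketch of that last point (all numbers are invented; this is not the papers' data or method). It fits a first principal component to a model-by-benchmark score matrix, the usual proxy for 'g', with and without a batch of near-duplicate finetunes of one base model:

```python
# Synthetic sketch (all numbers invented): how the share of variance
# captured by a first 'g'-like factor can shift when near-duplicate
# finetunes of a single base model are added to the pool.
import numpy as np

rng = np.random.default_rng(0)

n_benchmarks = 12

# Distinct base models: scores driven by one shared latent ability
# factor plus independent noise.
ability = rng.normal(size=(8, 1))
loadings = rng.uniform(0.3, 0.8, size=(1, n_benchmarks))
bases = ability @ loadings + 0.5 * rng.normal(size=(8, n_benchmarks))

# Six finetunes of base model 0: small perturbations of one row,
# i.e. near-duplicate models.
finetunes = bases[0] + 0.05 * rng.normal(size=(6, n_benchmarks))

def first_pc_share(scores):
    """Fraction of total variance captured by the first principal
    component of the standardized model-by-benchmark score matrix."""
    z = (scores - scores.mean(axis=0)) / scores.std(axis=0)
    singular_values = np.linalg.svd(z, compute_uv=False)
    return singular_values[0] ** 2 / np.sum(singular_values ** 2)

print("distinct bases only:", first_pc_share(bases))
print("plus finetunes:     ", first_pc_share(np.vstack([bases, finetunes])))
```

How much the share moves depends on the noise scales and the model mix you pick, which is exactly the fragility I mean: the headline 'variance explained by g' is a property of the sample of models as much as of the models themselves.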