I don’t agree with the following claims (which might misrepresent you):
“Skill levels” are domain agnostic.
Frontier oversight, control, evals, and non-“science of DL” interp research is strictly easier in practice than frontier agent foundations and “science of DL” interp research.
The main reason there is more funding/interest in the former category than the latter is skill issues, rather than worldview differences and clarity of scope.
MATS has mid researchers relative to other programs.
Y’know, you probably have the data to do a quick-and-dirty check here. Take a look at the GRE/SAT scores on the applications (both for applicant pool and for accepted scholars). If most scholars have much-less-than-perfect scores, then you’re probably not hiring the top tier (standardized tests have a notoriously low ceiling). And assuming most scholars aren’t hitting the test ceiling, you can also test the hypothesis about different domains by looking at the test score distributions for scholars in the different areas.
We don’t collect GRE/SAT scores, but we do have CodeSignal scores and (for the first time) a general aptitude test developed in collaboration with SparkWave. Many MATS applicants have maxed out scores for the CodeSignal and general aptitude tests. We might share these stats later.