I would also suspect that #2 (finding/generating good researchers) is more valuable than #1 (generating or accelerating good research during the MATS program itself).
One problem with #2 is that it’s usually harder to evaluate and takes longer to evaluate: it requires projections, often over the course of years. #1 is still difficult to evaluate (what is “good alignment research” anyways?) but seems easier in comparison.
Also, I would expect correlations between #1 and #2. Like, one way to evaluate “how well are we doing at training researchers/who are the best researchers” is to ask “how good is the research they are producing/who produced the best research in this 3-month period?”
This process is (of course) imperfect. For example, someone might have great output because their mentor handed them a bunch of ready-to-go projects, so the scholar never actually had to learn the important skills of “forming novel ideas” or “figuring out how to prioritize between many different directions.”
But in general, I think it’s a pretty decent way to evaluate things. If someone has produced high-quality and original research during the MATS program, that sure does seem like a strong signal for their future potential. Likewise, in the opposite extreme, if during the entire summer cohort there were 0 instances of useful original work, that doesn’t necessarily mean something is wrong, but it would make me go “hmmm, maybe we should brainstorm possible changes to the program that could make it more likely that we see high-quality original output next time, and then see how much those proposed changes trade off against other desiderata.”
(It seems quite likely to me that the MATS team has already considered all of this; just responding on the off-chance that something here is useful!)