(EDIT: I just saw that Ryan posted a comment a few minutes before mine; I agree substantially with it.)
As a Google DeepMind employee I’m obviously pretty biased, but this seems pretty reasonable to me, assuming it’s about alignment/similar teams at those labs? (If it’s about capabilities teams, I agree that’s bad!)
I think the alignment teams generally do good and useful work, especially those in a position to publish on it. And it seems extremely important that whoever makes AGI has a world-class alignment team! And some kinds of alignment research can only really be done with direct access to frontier models. MATS scholars tend to be pretty early in their alignment research careers, and I also expect frontier lab alignment teams to be a better place to learn technical skills, especially engineering, and to generally have higher talent density.
UK AISI/US AISI/METR seem like solid options for evals, but they basically just work on evals, and Ryan says downthread that only 18% of scholars work on evals/demos. I think it's valuable both for frontier labs to have good evals teams and for there to be good external evaluators (especially in government), so I can see good arguments favouring either option.
44% of scholars did interpretability, where in my opinion the Anthropic team is clearly a fantastic option, and I like to think DeepMind is also a decent option, as is OpenAI. Apollo and various academic labs are the main other places you can do mech interp. So those career preferences seem pretty reasonable to me for interp scholars.
17% worked on oversight/control, and for oversight I think you generally want a lot of compute and access to frontier models? I'm less sure about control; I think Redwood is doing good work there, but as far as I'm aware they're not hiring.
This is all assuming that scholars want to keep working in the same field they did MATS for, which in my experience is often but not always true.
I'm personally quite skeptical of inexperienced researchers trying to start new orgs: starting a new org and having it succeed is really, really hard, and much easier with more experience! So people preferring to get jobs seems great by my lights.
Thanks, Neel! I responded in greater detail to Ryan’s comment but just wanted to note here that I appreciate yours as well & agree with a lot of it.
My main response to this is something like “Given that MATS selects the mentors and selects the fellows, MATS has a lot of influence over what the fellows are interested in. My guess is that MATS’ current mentor pool & selection process overweights interpretability and underweights governance + technical governance, relative to what I think would be ideal.”
I see this has been strongly disagree-voted—I don't mind, but I'd be curious for people to reply with which parts they disagree with! (Or at least add disagree reacts to specific lines.) I make a lot of claims in that comment, though I personally think they're all pretty reasonable. The one about not wanting inexperienced researchers to start orgs, or "alignment teams at scaling labs are good actually", might be the spiciest?