I agree with Tsvi here (as I’m sure will shock you :)).
I’d make a few points:
1. “our revealed preferences largely disagree with point 1”: this isn’t clear at all. We know MATS’ [preferences, given the incentives and constraints under which MATS operates]. We don’t know what you’d do absent such incentives and constraints.
   - I note also that “but we aren’t Refine” has the form [but we’re not doing x], rather than [but we have good reasons not to do x]. (I don’t think MATS should be Refine, but “we’re not currently 20% Refine-on-ramp” is no argument that it wouldn’t be a good idea.)
2. MATS is in a stronger position than most to exert influence on the funding landscape. Sure, others should make this case too, but MATS should be actively making a case for what seems most important (to you, that is), not only catering to the current market.
   - Granted, this is complicated by MATS’ own funding constraints: you have more to lose too (and I do think this is a serious factor, undesirable as it might be).
3. If you believe that the current direction of the field isn’t great, then “ensure that our program continues to meet the talent needs of safety teams” is simply the wrong goal.
   - Of course the right goal isn’t diametrically opposed to that, but still, not that.
4. There’s little reason to expect the current direction of the field to be close to ideal:
   - At best, the accuracy of the field’s collective direction will tend to correspond to its collective understanding, which is low.
   - There are huge commercial incentives exerting influence.
   - There’s no clarity on what constitutes (progress towards) genuine impact.
   - There are many incentives to work on what’s already not neglected (e.g. things with easily located “tight empirical feedback loops”). The desirable properties of the non-neglected directions are a large part of the reason they’re not neglected.
   - Similar arguments apply to [field-level self-correction mechanisms].
5. Given (4), there’s an inherent sampling bias in taking [needs of current field] as [what MATS should provide]. Of course there’s still an efficiency upside in catering to [needs of current field] to a large extent, but efficiently heading in a poor direction still sucks.
6. I think it’s instructive to consider extreme-field-composition thought experiments: suppose the field were composed of [10,000 researchers doing mech interp] and [10 researchers doing agent foundations].
   - Where would there be most jobs? Most funding? Most concrete ideas for further work? Does it follow that MATS would focus almost entirely on meeting the needs of all the mech interp orgs? (I expect that almost all the researchers in that scenario would claim mech interp is the most promising direction.)
7. If you think that feedback loops along the lines of [[fast legible work on x] --> [x seems productive] --> [more people fund and work on x]] lead to desirable field dynamics in an AIS context, then it may make sense to cater to the current market. (Personally, I expect this to give a systematically poor signal, but it’s not as though it’s easy to find good signals.)
   - If you don’t expect such dynamics to end well, it’s worth considering to what extent MATS can be a field-level self-correction mechanism, rather than a contributor to predictably undesirable dynamics.
   - I’m not claiming this is easy!!
   - I’m claiming that it should be tried.
“Detailing what job and funding opportunities should exist in the technical AI safety field is beyond the scope of this report.”
Understandable, but do you know anyone who’s considering this? As the core of their job, I mean—not on a [something they occasionally think/talk about for a couple of hours] level. It’s non-obvious to me that anyone at OpenPhil has time for this.
It seems to me that the collective ‘decision’ we’ve made here is something like:
1. Any person/team doing this job would need:
   - Extremely good AIS understanding.
   - To be broadly respected.
   - To have a lot of time.
2. Nobody like this exists.
3. We’ll just hope things work out okay using a passive distributed approach.
To my eye this leads to a load of narrow optimization according to often-not-particularly-enlightened metrics—lots of common incentives, common metrics, and correlated failure.
Oh and I still think MATS is great :) - and that most of these issues are only solvable with appropriate downstream funding landscape alterations. That said, I remain hopeful that MATS can nudge things in a helpful direction.
“I plan to respond regarding MATS’ future priorities when I’m able (I can’t speak on behalf of MATS alone here and we are currently examining priorities in the lead up to our Winter 2024-25 Program), but in the meantime I’ve added some requests for proposals to my Manifund Regrantor profile.”
RFPs seem a good tool here for sure. Other coordination mechanisms too. (And perhaps RFPs for RFPs, where sketching out high-level desiderata is easier than specifying parameters for [type of concrete project you’d like to see])
Oh and I think the MATS Winter Retrospective seems great from the [measure a whole load of stuff] perspective. I think it’s non-obvious what conclusions to draw, but more data is a good starting point. It’s on my to-do list to read it carefully and share some thoughts.