Yes, absolutely. Five years ago, people were more honest about it, saying ~explicitly and out loud “ah, the real problems are too difficult; and I must eat and have friends; so I will work on something else, and see if I can get funding on the basis that it’s vaguely related to AI and safety”.
To the extent that anecdata is meaningful:
I have met somewhere between 100 and 200 AI Safety people in the past ~2 years; people for whom AI Safety is their ‘main thing’.
The vast majority of them are doing tractable/legible/comfortable things. Most are surprisingly naive and have less awareness of the space than I do (and I’m just a generalist lurker who finds this stuff interesting, not someone actively working on the problem).
Few are actually staring into the void of the hard problems, where ‘hard’ is loosely defined as ‘unknown unknowns, here be dragons, where do I even start’.
Fewer still progress from staring into the void to actually trying things.
I think some amount of this is natural and to be expected; even in an ideal world we would probably still have a similar breakdown, a majority who aren’t contributing (yet)[1] and a minority who are, and the difference would be mainly in the relative sizes of those groups.
I think it’s reasonable to aim for a larger, higher-quality minority; I think it’s tractable to achieve progress through mindfully shaping the funding landscape.
I think it’s worth mentioning that all newbies are useless, but not all newbies remain newbies. Some portion of the majority are actually people who will progress to being useful after they’ve gained experience and wisdom.
it’s tractable to achieve progress through mindfully shaping the funding landscape
This isn’t clear to me; the crux (though maybe it shouldn’t be) is “is it feasible for any substantial funders to distinguish actually-trying research from other research?”
Yeah, I agree sometimes people decide to work on problems largely because they’re tractable [edit: or because they’re good for safely getting alignment research or other good work out of early AGIs]. I’m unconvinced by the ‘flinching away’ or ‘dishonest’ characterization.
Do you think that funders are aware that >90% [citation needed!] of the money they give to people, to do work described as helping with “how to make world-as-we-know-it-ending AGI without it killing everyone”, is going to people who don’t even themselves seriously claim to be doing research that would plausibly help with that goal? If they are aware of that, why would they do that? If they aren’t aware of it, don’t you think it should at least be among your very top hypotheses that those researchers are behaving materially deceptively, one way or another, call it what you will?
I do not.
On the contrary, I think ~all of the “alignment researchers” I know claim to be working on the big problem, and I think ~90% of them are indeed doing work that looks good in terms of the big problem. (Researchers I don’t know are likely substantially worse, but not by a ton.)
In particular, I think all of the alignment-orgs-I’m-socially-close-to do work that looks good in terms of the big problem: Redwood, METR, ARC. And I think the other well-known orgs are also good.
This doesn’t feel odd: these people are smart and actually care about the big problem; if their work was in the “even if this succeeds it obviously wouldn’t be helpful” category, they’d want to know (and, given the “obviously”, they would figure that out).
Possibly the situation is very different in academia or MATS-land; for now I’m just talking about the people around me.