I can’t remember the key bit of Ben’s post (and wasn’t able to find it quickly on skimming). But my hot take is:
It seems obviously net-positive for contributions to reducing x-risk to be rewarded with status within the AI Safety community, but it’s not obviously net-positive for those contributions to be rewarded with serious money or status in the broader world.
If the latter gets too large, then you start getting swarmed with people who want money and prestige but don’t necessarily understand how to contribute, and who are incentivized to degrade the signal of what’s actually important.
To quote a conversation with habryka: there are two ways to make AI Safety prestigious. The first way is to make serious AI Safety work (i.e. solving the alignment problem) prestigious. The second is to change the definition of AI Safety to be something more obviously prestigious in the first place (which may get you things like ‘solving problems with self-driving cars’). And the latter is often easier to do.
So, if you make it easier for people who are motivated by prestige and not fully aligned with the mission to join the movement, they’ll want to start changing the movement to make it easier for themselves to get ahead.
This isn’t to say this obviously cashes out to “net-negative” either, just that it’s something to be aware of.
Principal-agent problems certainly matter! But despite that, collaboration based on extrinsic rewards (instead of selfless agreement on every detail) has been a huge success story for mankind. Is our task unusually prone to principal-agent problems, compared to other tasks? In my experience, the opposite is true: AI alignment research is unusually easy to evaluate in detail, compared to checking the work of a contractor building your house or a programmer writing code for your company.
During this decade, the field of AI in general became one of the most prestigious and high-status academic fields to work in. But as far as I can tell, that hasn’t slowed down the rate of progress in advancing AI capability. If anything, it has sped it up, by quite a bit. It’s plausible that many newcomers to the field are largely driven by the prospect of status and money, and quite a few hype-driven “AI” startups have popped up that seem doomed to fail. Despite this, the pace of the most productive research groups doesn’t seem to have slowed. Maybe the key here is that if you suddenly increase the prestige of a scientific field by a dramatic amount, you are bound to get a lot of nonsense or fraudulent activity, but most of it stays outside serious research circles. And the most serious people working in the field are likely to be helped by the rising tide as well, through increased visibility and funding for their labs and so on.
It’s also my understanding that the last few years (during the current AI boom) have been some of the most successful (financially and productively) for MIRI in their entire history.
This is an interesting point I hadn’t considered. Still mulling it over a bit.