I disagree with the view that it’s bad to spend the first few months prizing top researchers who would have done the work anyway. This _in and of itself_ is clearly burning cash, yet the point is to change incentives over a longer time-frame.
If you think research output is heavy-tailed, what you should expect to observe is something like this happening for a while, until promising tail-end researchers realise there’s a stable stream of value to be had here, and put in the effort required to level up and contribute themselves. It’s not implausible to me that this would take >1 year of prizes.
Expecting lots of important counterfactual work, that beats the current best work, to come out of the woodwork within ~6 months seems to assume that A) making progress on alignment is quite tractable, and B) the ability to do so is fairly widely distributed across people; both to a seemingly unjustified extent.
I personally think prizes should be announced together with precommitments to keep delivering them for a non-trivial amount of time. I believe this because I think changing incentives involves changing expectations, in a way that changes medium-term planning. I expect people to have qualitatively different thoughts if their S1 reliably believes that fleshing out the-kinds-of-thoughts-that-take-6-months-to-flesh-out will be rewarded after those 6 months.
That’s expensive, in terms of both money and trust.