Personally, my long-term goal is a world where high-quality work on alignment is consistently funded, and where people doing high-quality work on alignment have plenty of money. I think that an effort to restrict to counterfactually-additional alignment work would “save” some money (in the sense that I’d have the money rather than some researcher who is doing alignment work) but wouldn’t be great for that long-term goal.
Also, if you actually think about the dynamics, they are pretty crappy, even if you only avoid “obvious” cases. For example, it would become really hard for anyone to actually assess counterfactual impact, since every winner would need to make it look like there was at least a plausible counterfactual impact. (I already wish there was less implicit social pressure in that direction.)
On reflection I strongly agree that social pressure around counterfactualness is a net harm for motivation.