In that case, I think you should try and find out what the incentive gradient is like for other people before prescribing the actions that they should take. I’d predict that for a lot of alignment researchers your list of incentives mostly doesn’t resonate, relative to things like:
1. Active discomfort at potentially contributing to a problem that could end humanity
2. Social pressure + status incentives from EAs / rationalists to work on safety and not capabilities
3. Desire to work on philosophical or mathematical puzzles, rather than mucking around in the weeds of ML engineering
4. Wanting to do something big-picture / impactful / meaningful (tbc this could apply to both alignment and capabilities)
For reference, I’d list (2) and (4) as the main things that affect me, with maybe a little bit of (3), and I used to also be pretty affected by (1). None of the things you listed feel like they affect me much (now or in the past), except perhaps wishful thinking (though I don’t really see that as an “incentive”).