I think different types of safety research have pretty different effects on concentration of power risk.
As others have mentioned, if the alternative to human concentration of power is AI takeover, that’s hardly an improvement. So I think the main ways in which proliferating AI safety research could be bad are:
1. “Safety” research might be more helpful for letting humans use AIs to concentrate power than it is for preventing AI takeover.
2. Actors who want to build AIs to grab power might also be worried about AI takeover, and if good(-seeming) safety techniques are available, they might be less worried about that and thus more likely to go ahead with building those AIs.
There are interesting discussions to be had on the extent to which these issues apply. But it seems clearer that they apply to pretty different extents depending on the type of safety research. For example:
- Work trying to demonstrate risks from AI doesn’t seem very worrisome on either 1. or 2. (and in fact, if anything, should have the opposite of effect 2.).
- AI control (as opposed to alignment) seems comparatively unproblematic IMO: it’s less of an issue for 1., and while 2. could apply in principle, I expect the default to be that many actors won’t be worried enough about scheming to slow down much even if there were no control techniques. (The main exceptions are worlds in which we get extremely obvious evidence of scheming.)
To be clear, I do agree this is a very important problem, and I thought this post had interesting perspectives on it!