I think safety work gets less and less valuable at crunch time actually.
[...]
Whatever prosaic alignment and control measures you’ve figured out, you’ll now be using that in an attempt to play this breakneck game of getting useful work out of a potentially misaligned AI ecosystem
Sure, but you have to actually implement these alignment/control methods at some point? And likely these can’t be (fully) implemented far in advance. I usually use the term “crunch time” in a way which includes the period where you scramble to implement in anticipation of the powerful AI.
One (oversimplified) model is that there are two trends:
Implementation and research on alignment/control methods becomes easier because of AIs (as test subjects).
AIs automate away work on alignment/control.
Eventually, the second trend implies that safety work is less valuable, but probably safety work has already massively gone up in value by this point.
(Also, note that the default way of automating safety work will involve large amounts of human labor for supervision. Either due to issues with AIs or because of lack of trust in these AIs systems (e.g. human labor is needed for a control scheme.)
Sure, but you have to actually implement these alignment/control methods at some point? And likely these can’t be (fully) implemented far in advance. I usually use the term “crunch time” in a way which includes the period where you scramble to implement in anticipation of the powerful AI.
One (oversimplified) model is that there are two trends:
Implementation and research on alignment/control methods becomes easier because of AIs (as test subjects).
AIs automate away work on alignment/control.
Eventually, the second trend implies that safety work is less valuable, but probably safety work has already massively gone up in value by this point.
(Also, note that the default way of automating safety work will involve large amounts of human labor for supervision. Either due to issues with AIs or because of lack of trust in these AIs systems (e.g. human labor is needed for a control scheme.)