That’s not how I see it. I see it as widening the safety margin. If there’s a model that would be just barely strong enough to do dangerous scheming and escape attempts, but we have Control measures in place, then we have a chance to catch it before catastrophe occurs. Control also extends the range over which we can safely get useful work out of increasingly capable models. This matters because linear increases in model capability are expected to have superlinear positive effects on our capacity to accelerate Alignment research.