The “alignment techniques generalise across human contributions to architectures” point isn’t about the SLT threat model. It’s about the “AIs do AI capabilities research” threat model.