"We should not study alignment and interpretability because that improves AI capabilities" = "We should not build steering wheels and airbags because that improves car capabilities." Not a perfect metaphor, of course, but it sums up how I feel.