Nora sometimes says that the alignment field uses the term “black box” incorrectly. This seems unsupported. In my experience, most people in alignment use “black box” to describe how their methods treat the AI model, which seems reasonable, not to describe a fundamental property of the model itself.