Escaque 66 comments on fMRI LIKE APPROACH TO AI ALIGNMENT / DECEPTIVE BEHAVIOUR

Escaque 66 8 Nov 2023 9:48 UTC
1 point
For work implementing this idea, see: https://www.anthropic.com/index/decomposing-language-models-into-understandable-components