Mateusz Dziemian

Karma: 21

Applied ML Engineer moving into AI safety. BEng EEE @UCL. Mainly interested in alignment, red teaming and agents.

Deceptive agents can collude to hide dangerous features in SAEs

15 Jul 2024 17:07 UTC
27 points
0 comments, 7 min read, LW link