https://arxiv.org/pdf/2107.03374.pdf#page=27
Interesting, thank you! I guess I was thinking of deception as characterized by Evan Hubinger, with mesa-optimizers, bells, whistles, and all. But I can see how a sufficiently large competence-vs-performance gap could also count as deception.
https://arxiv.org/pdf/2107.03374.pdf#page=27
Interesting, thank you! I guess I was thinking of deception as characterized by Evan Hubinger, with mesa-optimizers, bells, whistles, and all. But I can see how a sufficiently large competence-vs-performance gap could also count as deception.