Sammy Martin comments on Deceptive AI ≠ Deceptively-aligned AI

Sammy Martin 8 Jan 2024 11:44 UTC
LW: 4 AF: 3
If you want a specific practical example of the difference between the two: we now have AIs capable of being deceptive when not specifically instructed to do so (‘strategic deception’), but not AIs developing deceptive power-seeking goals completely opposite to what the overseer wants of them (‘deceptive misalignment’). This demo from Apollo Research on strategic deception is an example of the former, not the latter:
https://www.apolloresearch.ai/research/summit-demo