Rohin’s opinion: I really liked the experiment demonstrating misalignment, as it seems to accurately capture the aspects we expect to see in existentially risky misaligned AI systems: they will “know” how to do the thing we want; they simply won’t be “motivated” to actually do it.
Nic jokes:
My counter joke (in EAI) was:
(GPT-3 is an agent-predicting agent.)