There’ll be discussions about how these systems will eventually become dangerous, and safety-concerned groups might even set up testing protocols (“safety evals”).
My impression is that safety evals were deemed irrelevant because a powerful enough AGI, being deceptively aligned, would pass all of them anyway. We didn't expect the first general-ish AIs to be so dumb about it; GPT-4, for example, was blatant and explicit about lying to the TaskRabbit worker.