On the idea that “we can’t just choose not to build AGI”: much of the concern here seems predicated on the idea that so many actors are not taking safety seriously that someone will inevitably build AGI once the technology has advanced sufficiently.
I wonder whether struggles with AIs that are strong enough to cause a disaster but not strong enough to win instantly might change this perception. In a hard takeoff there might be very little gap, if any, between those two kinds of AI, but it seems quite possible to me that we would spend some time at that stage. A small or moderate disaster caused by a less powerful AI might get all the relevant players to recognize the danger. After all, humans have done reasonably well at not doing things that seem very likely to destroy the world immediately (e.g. nuclear war).
Though we’ve been less good at putting safeguards in place to prevent it from happening. And even if every group that could create AI agreed to stop, eventually someone will think they know how to do it. And we still only get the one chance.
All that is to say, I don’t think it’s implausible that we’ll be able to coordinate well enough to buy more time, though it’s unclear whether that would do much to avoid eventual doom.
Would the ability to deceive humans when specifically prompted to do so be considered an example? I would expect large LMs to get better at devising false stories about the real world that people cannot distinguish from true ones.