>Second, I imagine that such a near-miss would make Demis Hassabis etc. less likely to build and use AGIs in an aggressive pivotal-act-type way. Instead, I think there would be very strong internal and external pressures (employees, government scrutiny, public scrutiny) preventing him and others from doing much of anything with AGIs at all.
I feel I should note that, while this does indeed form part of a debunk of the “good guy with an AGI” idea, it is in and of itself a possible reason for hope. After all, if nobody anywhere dares to make AGI, well, then, AGI X-risk isn’t going to happen. The trouble is moving the Overton Window to the point where the bloodthirst required to actually produce that outcome is seen as something other than insanity (i.e. nuclear-armed countries saying “if anyone attempts to build AGI, everyone who cooperated in doing it hangs or gets life without parole; if any country does not enforce this vigorously, we will invade; and if they have nukes or a bigger army than us, we pre-emptively nuke them, because their retaliation is still higher-EV than letting them finish”). A warning shot could well pull that off.
This is not a permanent solution—questions of eventual societal relaxation aside, humanity cannot expand past K2 without the Jihad breaking down unless FTL is a thing—but it buys a lot of breathing time, which is the key missing ingredient you note in a lot of these plans.
A couple of things are making me really nervous about the idea of donating:
“AI safety” is TTBOMK a broad term, encompassing prosaic alignment as well as governance. I am of the strong opinion that prosaic alignment is a blind alley: mostly either wasted effort or actively harmful, because it produces fake alignment that stops people from abandoning neural nets. ~97% of my P(not doom) routes through a Butlerian Jihad against neural nets (with or without a nuclear war buying us more time) that lasts long enough to build GOFAI. And frankly, I don’t spend that much time on LW, so I’ve little idea which of these efforts (or others!) gets most of the benefit you claim from the site.
As noted above, I think a substantial chunk of useful futures (though not a vast majority) routes through nuclear war knocking out the neural-net sector for a substantial amount of time (via blast damage wiping out factories, EMP destroying much of the existing chip stock, destruction of power and communication infrastructure reducing the profitability of AI, economic collapse more broadly, and possibly soft errors). As such, I’ve been rather concerned for years that the Ratsphere’s main IRL presence is in the Bay Area and thus nuke-bait; we want to disproportionately survive that war, not die in it. Insofar as Lighthaven is in the Bay Area, I am thus questioning whether its retention is +EV.