I think hardware regulation has little chance of success because we’re not doing it yet; we’re probably only about one hardware generation away from systems big enough to train AGI-agent-capable LLMs; and algorithmic improvement has no obvious limits, so even current-gen systems could train AGI after some years of algorithmic improvements.
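To make the arithmetic behind that last claim concrete, here is a rough back-of-envelope sketch. Every number in it (today’s frontier training compute, the “AGI-capable” threshold, the efficiency doubling time) is an assumption for illustration, not an estimate I’d defend:

```python
# Illustrative sketch: years until a *fixed* hardware fleet can train a
# hypothetical "AGI-capable" model, if algorithmic efficiency keeps doubling.
# All numbers are assumptions chosen only to show the shape of the argument.

current_training_run_flop = 1e26   # assumed effective compute of a frontier run today
agi_threshold_flop = 1e28          # assumed raw compute needed for an AGI-agent-capable LLM
efficiency_doubling_years = 2.0    # assumed doubling time of algorithmic efficiency

years = 0.0
effective_flop = current_training_run_flop
while effective_flop < agi_threshold_flop:
    years += efficiency_doubling_years
    effective_flop *= 2  # each efficiency doubling lets the same hardware do twice as much

print(f"Under these assumptions, current-gen hardware crosses the threshold "
      f"after ~{years:.0f} years of algorithmic progress.")
```

The point is not the specific output (~14 years under these made-up numbers) but that any fixed compute cap erodes at whatever rate algorithms improve, so hardware regulation alone has a shelf life.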
Beyond that, I see absolutely no moves toward regulating hardware (in the West) - more like throwing money toward accelerating it.
There have been at least two nuclear close calls and perhaps a few more we don’t know about.
I’m not saying anyone is going to press the big world-ending button because somebody hacked and fried their AGI datacenter; I’m saying they might issue threats once it becomes clear that the US is taking control of the entire future by creating AGI and making sure no one can counter it by building their own. And I’m worried that those threats would be answered, and that someone foolish might initiate a chain of hostilities that doesn’t stop in time.
I hope this doesn’t happen, and I mostly share your optimism that sanity would prevail. But we have had at least two human beings whose protocol said to fire nukes, and each refused. I don’t want to count on more people than that following their conscience instead of their orders; soldiers do terrible things, including sacrificing their own lives, pretty frequently.
I don’t strongly disagree re architectures, but I do think we are uncertain about this. Depending on AGI architecture, different forms of regulation may or may not work. Work should be done to determine which forms of regulation work, given how many FLOP are needed for takeover-level AI.
That it’s not happening yet is 1) no reason it won’t (xrisk awareness is just too low, but slowly rising) and 2) equally applicable to the alternative you propose, universal surveillance.
If we treat universal surveillance seriously, we should consider its downsides as well. First, there’s no proof it would work: I’m not sure an AI, even a future one, would necessarily catch all actions towards building AGI. I have no idea what these actions are, and no idea which of them a surveillance AI with some real-world sensors could catch (or be blocked from catching, etc.). I think we should not be more than 70% confident this would technically work. Second, there are currently power vacuums in the world, such as failed states, revolutions, criminal groups, or simply cases where those in power are unable to project their power effectively. How would we apply universal surveillance to those power vacuums? Or do we assume they won’t exist anymore, and if so, why is that assumption justified? Third, universal surveillance is arguably the world’s least popular policy. It seems outright impossible to implement in any democratic way. Perhaps the plan is to implement it by force through an AGI; in that case I would file it as a form of pivotal act. And if we’re in pivotal-act territory anyway, I’d strongly prefer Yudkowsky’s “subtly modifying all GPUs such that they can no longer train an AGI” (really a kind of hardware regulation) over universal surveillance.
I think research is urgently needed into how to implement a pause effectively. We have one report on the topic, almost finished, that mostly focuses on hardware regulation. PauseAI is working on a “Building a pause button” project that is somewhat similar. Other orgs should work on this as well, comparing options such as hardware regulation, universal surveillance, and data regulation, and concluding in which AGI regime (how many FLOP, how much hardware required) each option is valid.
True, I guess we’re not in significant disagreement here.
Thanks!