Cooperative Development (CD) is favored when alignment is easy and timelines are longer. [...]
Strategic Advantage (SA) is more favored when alignment is easy but timelines are short (under 5 years).
I somewhat disagree with this. CD is favored only when the probability that alignment is easy is extremely high. A moratorium is better given even a modest probability that alignment is hard, because the downside of misalignment is so much larger than the downside of a moratorium.[1] The same goes for SA: it’s only favored when you are extremely confident about both alignment and timelines.
[1] Unless you believe a moratorium has a reasonable probability of permanently preventing friendly AI from being developed.
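To make that asymmetry concrete, here’s a minimal expected-value sketch in Python. Every payoff and probability below is a placeholder assumption of mine, chosen only to illustrate the shape of the argument, not a value from the article:

```python
# Illustrative payoffs in arbitrary units; all numbers are assumptions.
U_ALIGNED_TAI = 100       # we build TAI and alignment works out
U_MISALIGNED = -100_000   # misalignment catastrophe: vastly worse than any delay
U_MORATORIUM = -10        # a successful moratorium merely delays TAI's benefits

def ev_cd(p_easy: float) -> float:
    """Cooperative Development: TAI gets built; the outcome hinges on alignment."""
    return p_easy * U_ALIGNED_TAI + (1 - p_easy) * U_MISALIGNED

for p_easy in (0.99, 0.999, 0.9999):
    print(f"P(alignment easy)={p_easy}: "
          f"EV(CD)={ev_cd(p_easy):+,.1f}  EV(moratorium)={U_MORATORIUM:+,.1f}")
# With these payoffs the break-even is around P(easy) = 0.999: CD only wins
# once you are *extremely* confident that alignment is easy.
```

The specific numbers don’t matter; the point is that when one outcome is orders of magnitude worse than the others, the break-even confidence gets pushed very close to 1.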
Also, I don’t feel that this article adequately addresses a major downside of SA: it accelerates an arms race. SA is favored only when alignment is easy with high probability, you’re confident that you will win the arms race, and you’re confident that it’s better for you to win than for the other guy.[1] And even then, we’re talking about a specific kind of alignment, where an “aligned” AI doesn’t necessarily behave ethically; it just does what its creator intends.
[1] How likely is a US-controlled (or, more accurately, Sam Altman/Dario Amodei/Mark Zuckerberg-controlled) AGI to usher in a global utopia? How likely is a China-controlled AGI to do the same? I think people are too quick to take it for granted that the former probability is larger than the latter.
We do discuss this in the article and tried to convey that it is a very significant downside of SA. All three plans have enormous downsides, though, so the fact that a plan poses massive risks is not by itself disqualifying. The key is understanding when those risks might be worth taking given the alternatives.
CD might be too weak if TAI is offense-dominant regardless of regulations or cooperative partnerships, and could result in a misuse or misalignment catastrophe.
If GM fails, it might blow any chance of producing protective TAI and hand the lead to the most reckless actors.
SA might directly provoke a world war or produce unaligned AGI ahead of schedule.
SA is favored when alignment is easy or moderately difficult (e.g., at the level where interpretability probes, scalable oversight, etc. help) with high probability, and you expect to win the arms race. But it doesn’t require you to be the “best”. The key isn’t whether US control is better than Chinese control, but whether centralized development under any actor is preferable to widespread proliferation of TAI capabilities to potentially malicious actors.
Regarding whether the US (remember, under SA there’s assumed to be extensive government oversight) is better than the CCP: I think the answer is yes, and I talk a bit more about why here. I don’t consider US AI control being better than Chinese AI control to be the most important argument in favor of SA, however. That fact alone doesn’t remotely justify SA: you also need easy/moderate alignment, and you need good evidence that an arms race is likely unavoidable regardless of what we recommend.
I think a moratorium is basically intractable short of a totalitarian world government cracking down on all personal computers.
Unless you mean just a moratorium on large training runs, in which case I think it buys a minor delay at best and creates counterproductive pressure on researchers to focus heavily on diverse small-scale algorithmic-efficiency experiments.
I don’t think controlling compute would be qualitatively harder than controlling, say, pseudoephedrine.
(I think it would be harder, but not qualitatively harder—the same sorts of strategies would work.)
I agree that some amount of control is possible.
But if we end up in a future scenario where the offense-defense balance of bioweapons remains similar to today’s, then the equivalent of a single dose of pseudoephedrine slipping past regulation and getting turned into methamphetamine could mean the majority of humanity being wiped out.
Pseudoephedrine is regulated, yes, but not so strongly that literally none slips past enforcement. With stakes this high, a mostly effective enforcement scheme doesn’t cut it.
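To put rough numbers on that (the leak rate and scale below are invented for illustration): if a single leak is enough for catastrophe, even a 99.99%-effective enforcement scheme fails almost surely at scale.

```python
# If one leaked unit is enough for catastrophe, failure compounds across
# every unit enforcement must cover. Both numbers are assumptions.
leak_rate = 1e-4    # assumed chance any single controlled unit slips through
units = 100_000     # assumed number of units/actors under enforcement

p_at_least_one_leak = 1 - (1 - leak_rate) ** units
print(f"{p_at_least_one_leak:.5f}")  # ~0.99995: near-certain failure overall
```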
That’s only true if a single GPU (or a small number of GPUs) is sufficient to build a superintelligence, right? I expect it to take many years to go from “it’s possible to build superintelligence with a huge multi-billion-dollar project” to “it’s possible to build superintelligence on a few consumer GPUs”. (Unless, of course, someone builds a superintelligence which then figures out how to make GPUs many orders of magnitude cheaper, but at that point it’s moot.)
Sadly, no. It doesn’t take superintelligence to be deadly. Even current open-weight LLMs, like Llama 3 70B, know quite a lot about genetic engineering. The combination of a clever, malicious human and an LLM able to offer help and advice is sufficient.
Furthermore, there is the consideration of a “seed AI”: one competent enough to keep improving itself rather than plateau. If a competent human is helping it and getting it unstuck, the bar is even lower. My prediction is that the bar for “seed AI” is lower than the bar for AGI.
Let me clarify an important point: The strategy preferences outlined in the paper are conditional statements—they describe what strategy is optimal given certainty about timeline and alignment difficulty scenarios. When we account for uncertainty and the asymmetric downside risks—where misalignment could be catastrophic—the calculation changes significantly. However, it’s not true that GM’s only downside is that it might delay the benefits of TAI.
It’s true that misalignment (or catastrophic misuse) has a much larger downside than a successful moratorium. But trying a moratorium, losing your lead, and then having someone else develop catastrophically misaligned AI, when you could have developed a defense against it by adopting CD or SA, has just as large a downside.
And GM has a lower chance of being adopted than CD or SA, so the downside to pushing for a moratorium is not necessarily lower.
Since a half-successful moratorium is the worst of all worlds (assuming alignment is feasible), because you lose your chance of developing defenses against unaligned or misused AGI, it’s not always true that the moratorium plan has fewer downsides than the others.
However, I agree with your core point: if we were to model this with full probability distributions over timelines and alignment difficulty, GM would likely be favored more heavily than our conditional analysis suggests, especially if we place significant probability on short timelines or hard alignment.
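As a sketch of what that fuller model might look like, here’s a toy version in Python. Every probability and payoff is a placeholder assumption, not a value from the paper, and GM’s payoff folds in the “half-successful moratorium” failure mode discussed above:

```python
# Score the three plans under a full distribution over worlds, rather than
# conditional on a single (timeline, difficulty) scenario. All numbers are
# placeholder assumptions for illustration.

# Joint distribution over (timeline, alignment difficulty).
P_WORLD = {
    ("short", "easy"): 0.15, ("short", "hard"): 0.25,
    ("long",  "easy"): 0.35, ("long",  "hard"): 0.25,
}

# Payoffs (arbitrary units) for CD and SA in each world.
U = {
    ("CD", "short", "easy"):  60, ("CD", "short", "hard"): -900,
    ("CD", "long",  "easy"): 100, ("CD", "long",  "hard"): -400,
    ("SA", "short", "easy"):  80, ("SA", "short", "hard"): -900,
    ("SA", "long",  "easy"):  40, ("SA", "long",  "hard"): -700,
}

# GM's payoff is itself an expectation over whether the moratorium holds.
# A half-successful moratorium (lead lost, no defensive TAI) is scored as
# badly as CD/SA failure, per the "worst of all worlds" point above.
P_GM_HOLDS = 0.5
for world in P_WORLD:
    u_holds = -20                                 # delay cost only
    u_fails = -900 if world[1] == "hard" else 50  # reckless actors build TAI anyway
    U[("GM", *world)] = P_GM_HOLDS * u_holds + (1 - P_GM_HOLDS) * u_fails

for strategy in ("CD", "SA", "GM"):
    ev = sum(p * U[(strategy, *world)] for world, p in P_WORLD.items())
    print(f"EV({strategy}) = {ev:+.1f}")
```

Under these particular numbers GM comes out ahead (EV of roughly -222, vs. -281 for CD and -374 for SA), but the gap narrows once the failure mode of a half-successful moratorium is priced in, which is exactly the trade-off above.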