The problem is that another way to phrase this is “a superintelligent weapon system”—“ending a risk period” by “reliably and efficiently doing a small number of specific concrete tasks” means using physical force to impose your will on others.
The pivotal acts I usually think about actually don’t route through physically messing with anyone else. I’m usually thinking about using aligned AGI to bootstrap to fast human whole-brain emulation, then using the ems to bootstrap to fully aligned CEV AI.
If someone pushes a “destroy the world” button then the ems or CEV AI would need to stop the world from being destroyed, but that need might never arise if the developers have enough of a lead, if they get the job done quickly enough, and if the CEV AI is able to persuade the world to step back from the precipice voluntarily (using superhumanly good persuasion that isn’t mind-control-y, deceptive, or otherwise consent-violating). It’s a big ask, but not as big as CEV itself, I expect.
From my current perspective this is all somewhat of a moot point, however, because I don’t think alignment is tractable enough that humanity should be trying to use aligned AI to prevent human extinction. I think we should instead hit the brakes on AI and shift efforts toward human enhancement, until some future generation is in a better position to handle the alignment problem.
If and only if that fails, it may be appropriate to consider less consensual options.
It’s not clear to me that we disagree in any action-relevant way, since I also don’t think AI-enabled pivotal acts are the best path forward anymore. I think the path forward is via international agreements banning dangerous tech, and technical research to improve humanity’s ability to wield such tech someday.
That said, it’s not clear to me how your “if that fails, then try X instead” works in practice. How do you know when it’s failed? Isn’t it likely to be too late by the time we’re sure that we’ve failed on that front? Indeed, it’s plausibly already too late for humanity to seriously pivot to ‘aligned AGI’. If I thought humanity’s last best scrap of hope for survival lay in an AI-empowered pivotal act, I’d certainly want more details on when it’s OK to start trying to figure out how to have humanity not die via this last desperate path.
Are people actually working on human enhancement? Many talk about how it’s the best chance humanity has, but I see zero visible efforts other than Neuralink. No one’s even seriously trying to clone von Neumann!
Gene Smith has received a $20,000 ACX grant to create an open-source polygenic predictor for educational attainment and intelligence: you upload your 23andMe results, and it tells you your (predicted) IQ. The technology hasn’t advanced to the point where this will be any good—even if everything goes perfectly, the number it gives you will have only the most tenuous connection to your actual IQ (and everyone on Gene’s team agrees with this claim). I’m funding it anyway.
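For readers unfamiliar with how such a predictor works mechanically: a polygenic score is conventionally just a weighted sum of effect-allele counts across genotyped SNPs. The sketch below is a minimal illustration of that arithmetic only; the SNP IDs, weights, and genotypes are all made-up illustrative values, not taken from Gene’s predictor or any real study.

```python
# Minimal sketch of how a polygenic score is conventionally computed:
# a weighted sum of effect-allele dosages across SNPs. All SNP IDs,
# weights, and genotypes below are made-up illustrative values.

# Hypothetical predictor: SNP ID -> (effect allele, per-allele weight)
weights = {
    "rs0000001": ("A", 0.8),
    "rs0000002": ("G", -0.3),
    "rs0000003": ("T", 0.5),
}

# A 23andMe-style genotype table maps SNP IDs to two-letter genotypes.
genotypes = {
    "rs0000001": "AG",  # one copy of effect allele A  -> dosage 1
    "rs0000002": "GG",  # two copies of effect allele G -> dosage 2
    "rs0000003": "CC",  # zero copies of effect allele T -> dosage 0
}

def polygenic_score(genotypes, weights):
    """Sum of (effect-allele dosage * weight) over SNPs present in both."""
    score = 0.0
    for snp, (allele, w) in weights.items():
        gt = genotypes.get(snp)
        if gt is None:
            continue  # SNP not on this genotyping chip; skip it
        score += gt.count(allele) * w
    return score

print(round(polygenic_score(genotypes, weights), 3))  # 0.8*1 - 0.3*2 + 0.5*0 = 0.2
```

A real predictor sums over hundreds of thousands of SNPs with tiny weights fit by regression on a large cohort, which is exactly why the resulting number is so noisy at the individual level.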
I think there could be far more money in that area (even if it’s not directed at cloning von Neumann in particular), but it’s not happening for political reasons.