ryan_greenblatt comments on The Checklist: What Succeeding at AI Safety Will Involve

ryan_greenblatt 5 Sep 2024 17:15 UTC
LW: 4 AF: 3
2
AF
Is your perspective something like:

With (properly motivated) qualitatively wildly superhuman AI, you can end the Acute Risk Period using means which aren’t massive crimes despite not collaborating with the US government. This likely involves novel social technology. More minimally, if you did have a sufficiently aligned AI of this power level, you could just get it to work on ending the Acute Risk Period in a basically legal and non-norms-violating way. (Where e.g. super persuasion would clearly violate norms.)

I think that even having the ability to easily take over the world as a private actor is pretty norm-violating. I’m unsure about the claim that if you put this aside, there is a way to end the acute risk period (edit: without US government collaboration and) without needing truly insanely smart AIs. I suppose that if you go smart enough this is possible, though pre-existing norms also just get more confusing in the regime where you can steer the world to whatever outcome you want.

So overall, I’m not sure I disagree with this perspective exactly. I think the overriding consideration for me is that this seems like a crazy and risky proposal at multiple levels.

To be clear, you are explicitly not endorsing this as a plan nor claiming this is Anthropic’s plan.
- RobertM 6 Sep 2024 5:56 UTC
  LW: 4 AF: 2
  0
  AF Parent
  > Is your perspective something like:
  Something like that, though I’m much less sure about “non-norms-violating”, because many possible solutions seem like they’d involve something qualitatively new (and therefore de-facto norm-violating, like nearly all new technology). Maybe a very superhuman TAI could arrange matters such that things just seem to randomly end up going well rather than badly, without introducing any new^[1] social or material technology, but that does seem quite a bit harder.
  I’m pretty uncertain about whether, if something like that ended up looking norm-violating, it’d be norm-violating like Uber was^[2], or like super-persuasion. That question seems very contingent on empirical questions that I think we don’t have much insight into right now.
  > I’m unsure about the claim that if you put this aside, there is a way to end the acute risk period without needing truly insanely smart AIs.
  I didn’t mean to make the claim that there’s a way to end the acute risk period without needing truly insanely smart AIs (if you put aside centrally-illegal methods); rather, that an AI would probably need to be relatively low on the “smarter than humans” scale to need to resort to methods that were obviously illegal to end the acute risk period.
  1. ^
    In ways that are obvious to humans.
  2. ^
    Minus the part where Uber was pretty obviously illegal in many places where it operated.