ryan_greenblatt comments on Daniel Kokotajlo’s Shortform

ryan_greenblatt 22 Feb 2024 0:08 UTC
5 points
0
[Nitpick]

we’d have a rogue ASI on our hands

FWIW it doesn’t seem obvious to me that it wouldn’t be sufficiently corrigible by default.

I’d be at about 25% that if you end up with an ASI by accident, you’ll notice before it ends up going rogue. These aren’t great odds, of course.
- Daniel Kokotajlo 22 Feb 2024 0:52 UTC
  2 points
  0
  I guess I was including that under “hopefully it would have internalized enough human ethics that things would be OK” but yeah I guess that was unclear and maybe misleading.
  - ryan_greenblatt 22 Feb 2024 2:01 UTC
    2 points
    0
    Yeah, I guess corrigibility might not require any human ethics. It might just be that the AI doesn’t care about seizing power (or about anything, really), or something similar.