I guess I was including that under “hopefully it would have internalized enough human ethics that things would be OK” but yeah I guess that was unclear and maybe misleading.
Yeah, I guess corrigibility might not require any human ethics. It might just be that the AI doesn’t care about seizing power (or about anything, really), or something similar.
[Nitpick]
FWIW it doesn’t seem obvious to me that it wouldn’t be sufficiently corrigible by default.
I’d be at about 25% that if you end up with an ASI by accident, you’ll notice before it ends up going rogue. These aren’t great odds of course.