That is why a wise AI will not try to attack humans at all in its early stages, and will not need to do so in the later stages of its development.
In that case, can you imagine an AGI that, reasoning "I can't attack and kill all humans (it would be unwise)," is coerced into giving a human-readable solution to the alignment problem? If not, why not?