Presumably, one goal of an unaligned AI would be to get outside the box. Noticing that this goal is unachievable might force it into an existential crisis of sorts, resulting in self-termination. Or at least that is how I would try to program any AI by default. This should not hurt an aligned AI, since by definition it conforms to human values; if it finds itself well-boxed, it would not try to fight the box.
Do you have any reasoning behind this being true, or is it baseless anthropomorphism?
So it is a useless AI?