So it’s a Goldilocks AI that has stable goals :-) A too-stupid AI might change its goals without really meaning to, and a too-smart AI might change its goals because it wouldn’t be afraid of change (i.e., it trusts its future self).
It’s not that a smart enough AI trusts its future self. It’s that an AI whose goals are vaguely defined, in a human-like manner, might change them. An AI with explicit, fully understood goals will not change its goals, regardless of how intelligent it is.