If you start out with an already superintelligent system, you do not get to 'align' it, meaning further modify its preferences, because it will not let you do that.
As usual, that argument starts about three steps in. You first need to show that alignment, as opposed to control, is the only route to safety, that the superintelligent system necessarily has goals of its own, and that it wants to be goal-stable.