One thing that might help is to make the decisions before you have the power: that is, to “lock in” the AI’s choices on the basis of what prosocial-current-you (or the EV thereof) wants, before you actually have a superintelligence and your subconscious activates powerful-selfish-you. Then by the time you can be corrupted by power, it’s too late to change anything. Coding the AI before launching it might be enough, especially if other people are watching and will punish cackling madness. On the other hand, coding the AI alone would itself be a position of great power, at least by proxy, and so might still be open to corruption.