If I have to overpower or negotiate with it to get something I might validly want, we’re back to corrigibility.
Not really, because an AI optimizing for your empowerment actually wants to give you more options/power/choice; that's not something you need to negotiate for, it's just what the AI wants to do. In fact, one of the most plausible outcomes after uploading is that it concludes that handing all of its computational resources over to humans is the most human-empowering use of that compute, and that it therefore no longer has a reason to exist.
Human values/utilities are complex and also non-stationary; they drift and change over time. Any error in modeling them therefore compounds, and if you handle that uncertainty correctly you end up with a maximum-entropy distribution over future utility functions. Optimizing for empowerment is equivalent to optimizing for that maximum-entropy utility distribution, at least for a wide class of values/utilities.
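A minimal sketch of that equivalence claim, under heavy simplifying assumptions: the toy deterministic MDP (the `TRANSITIONS` table), the two-step `HORIZON`, and the i.i.d. uniform utility draws are all invented for illustration, and log-of-reachable-states is the usual simplification of k-step empowerment for deterministic dynamics. The point is just that states with more reachable options also score higher in expectation when the utility function is drawn from a max-entropy distribution.

```python
import math
import random

# Toy deterministic MDP: six states, two actions each.
# The transition table is made up purely for illustration.
TRANSITIONS = {
    0: [1, 2],   # state 0 branches widely (high empowerment)
    1: [3, 4],
    2: [4, 5],
    3: [3, 3],   # states 3..5 are absorbing (low empowerment)
    4: [4, 4],
    5: [5, 5],
}
STATES = list(TRANSITIONS)
HORIZON = 2               # number of steps the agent gets to act
N_UTILITY_SAMPLES = 5000  # draws from the max-entropy utility distribution


def reachable(state, steps):
    """Set of states reachable from `state` in exactly `steps` actions."""
    frontier = {state}
    for _ in range(steps):
        frontier = {TRANSITIONS[s][a] for s in frontier for a in (0, 1)}
    return frontier


def empowerment(state, steps):
    """log2(number of distinct reachable end states): the standard
    simplification of k-step empowerment for deterministic dynamics."""
    return math.log2(len(reachable(state, steps)))


def expected_attainable_utility(state, steps, n_samples):
    """Average, over randomly sampled utility functions, of the best
    utility the agent can reach from `state` within `steps` actions."""
    ends = reachable(state, steps)
    total = 0.0
    for _ in range(n_samples):
        # Max-entropy stand-in: each state's utility is an i.i.d. uniform draw.
        u = {s: random.random() for s in STATES}
        total += max(u[s] for s in ends)
    return total / n_samples


if __name__ == "__main__":
    for s in STATES:
        emp = empowerment(s, HORIZON)
        eau = expected_attainable_utility(s, HORIZON, N_UTILITY_SAMPLES)
        print(f"state {s}: empowerment={emp:.2f} bits, "
              f"E[best attainable utility]={eau:.3f}")
```

Running this, state 0 (three reachable end states) gets roughly 0.75 expected attainable utility, states 1 and 2 (two reachable) roughly 0.67, and the absorbing states 0.5, so the ranking by empowerment and the ranking by expected utility under random utilities coincide in this toy case.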