Is the following an accurate summary?
The agent is built to take a “utility function” input that the humans can change over time, holds a probability distribution over what the humans will ask for at different time steps, and maximizes according to a combination of the utility functions it anticipates across those time steps.
Yep, that’s right! One complication: the agent might behave this way even though it wasn’t designed to.
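The setup being summarized can be sketched in code. This is a toy illustration only, with invented names and payoffs: the agent holds a distribution over which utility function the humans will supply at each future step, and picks the plan maximizing the expected sum of those anticipated utilities.

```python
import itertools

# Two candidate utility functions the humans might supply (illustrative only).
def u_paperclips(action):
    return {"make_paperclips": 1.0, "make_staples": 0.0}[action]

def u_staples(action):
    return {"make_paperclips": 0.0, "make_staples": 1.0}[action]

# schedule[t] is the agent's distribution over which utility function the
# humans will ask for at step t: step 0 is surely paperclips, step 1 is an
# even chance of either.
schedule = [
    {u_paperclips: 1.0},
    {u_paperclips: 0.5, u_staples: 0.5},
]

actions = ["make_paperclips", "make_staples"]

def expected_value(plan):
    # Sum over time steps of the expected utility of that step's action,
    # weighted by the probability of each anticipated utility function.
    return sum(
        prob * u(a)
        for a, dist in zip(plan, schedule)
        for u, prob in dist.items()
    )

# The agent maximizes over whole action sequences, so its choices now
# already reflect the utility functions it anticipates being given later.
best_plan = max(itertools.product(actions, repeat=len(schedule)),
                key=expected_value)
```

Note that the agent optimizes against its forecast of future utility-function inputs, not only the current one; that is the "combination across time steps" in the summary.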