It seems like evaluating a human's AU depends on how you model the human. There's a "black box" sense, where you can substitute literally any policy for the human's when calculating AU for different objectives, and a "transparent box" sense, in which you have to choose from a distribution of predicted human behaviors.
The former is closer to what I think you mean by “hasn’t changed the humans’ AU,” but I think it’s the latter that an AI cares about when evaluating the impact of its own actions.
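To make the distinction concrete, here is a minimal sketch of the two calculations in a toy one-step setting. Everything in it (the action names, payoff numbers, and predicted policy) is invented purely for illustration and isn't drawn from the original discussion:

```python
import numpy as np

# Toy setup: a one-step decision problem where the "human" picks one action,
# and each objective assigns a payoff to each action.
actions = ["stay", "go_left", "go_right"]

# Payoff table: objective -> payoff per action.
payoffs = {
    "reach_goal": np.array([0.0, 1.0, 0.2]),
    "stay_safe":  np.array([1.0, 0.3, 0.6]),
}

def black_box_au(payoff):
    # "Black box" sense: the human's policy can be replaced with anything,
    # so attainable utility is the best payoff any action achieves.
    return float(payoff.max())

def transparent_box_au(payoff, predicted_policy):
    # "Transparent box" sense: the human's behavior is drawn from a predicted
    # distribution, so AU is the utility the human is actually expected to attain.
    return float(payoff @ predicted_policy)

# Hypothetical prediction: the model thinks this human almost always stays put.
predicted_policy = np.array([0.9, 0.05, 0.05])

for objective, payoff in payoffs.items():
    print(objective,
          "| black-box AU:", black_box_au(payoff),
          "| transparent-box AU:", round(transparent_box_au(payoff, predicted_policy), 3))
```

In this toy case the black-box AU for "reach_goal" stays at 1.0 no matter what the human is predicted to do, while the transparent-box AU tracks the predicted behavior, which is the sense that seems relevant when the AI asks how its own actions change what the human will actually attain.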
I’m discussing a philosophical framework for understanding low impact. I’m not prescribing how the AI actually accomplishes this.