The correct decision under EDT is the action A that maximizes ∑_i P(O_i | A) U(O_i, A), where the O_i are the possible outcomes and U(O_i, A) is the utility of outcome O_i given action A. For this agent, the utility is not a function of O_i and A, so EDT cannot be applied.
I’m using EDT to mean the agent that calculates expected utility conditioned on each statement of the form “I take action A” and then chooses the action for which the expected utility is highest. I’m not sure what you mean by saying the utility is not a function of O_i: isn’t “how much money me and my copies earn” a function of the outcome?
(In your formulation I don’t know what P(⋅ | A) means, given that A is an action and not an event, but if I interpret it as “probability given that I take action A” then it looks like it’s basically what I’m doing?)
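For what it’s worth, both descriptions amount to the same computation. Here is a minimal sketch of that rule; the action names, outcomes, probabilities, and utilities are made-up placeholders, not anything from this discussion:

```python
# Minimal sketch of the EDT rule described above: pick the action A that
# maximizes sum_i P(O_i | A) * U(O_i, A). All numbers here are hypothetical.
P = {
    "one-box": {"rich": 0.99, "poor": 0.01},   # P(outcome | action)
    "two-box": {"rich": 0.01, "poor": 0.99},
}
U = {
    ("rich", "one-box"): 1_000_000, ("poor", "one-box"): 0,
    ("rich", "two-box"): 1_001_000, ("poor", "two-box"): 1_000,
}

def edt_choice(P, U):
    # Expected utility of each action, conditioning on "I take action A".
    eu = {a: sum(P[a][o] * U[(o, a)] for o in P[a]) for a in P}
    return max(eu, key=eu.get)

print(edt_choice(P, U))  # -> "one-box" with these made-up numbers
```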
The “me and my copies” that this agent bases its utility on are split across possible worlds with different outcomes. EDT requires a function that maps an action and an outcome to a utility value, and no such function exists for this agent.
Edit: as an example, what is the utility of this agent winning $1000 in a game where they don’t know the chance of winning? They don’t even know what their own utility is, because their utility doesn’t depend only on the outcome. If you credibly tell them afterward that they were nearly certain to win, they value the same $1000 far more highly than if you tell them there was a 1 in a million chance that they would win.
For this sort of agent that values nonexistent and causally-disconnected people, we need some different class of decision theory altogether, and I’m not sure it can even be made rationally consistent.
I’m not sure what this is, but it’s not EDT.
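To make the $1000 example above concrete, here is a toy sketch of a valuation of the kind being described, where the worth of the winnings depends on the announced chance of winning and so cannot be written as a function U(O_i, A) of outcome and action alone. The weighting rule and numbers are hypothetical, just to show the shape of the problem:

```python
# Toy sketch (hypothetical weighting rule): the agent values its winnings in
# proportion to the measure of copies across possible worlds who also win,
# which here is taken to scale with the announced probability of winning.
def copy_weighted_value(winnings, win_probability):
    return winnings * win_probability

# The same $1000 outcome is valued very differently depending on the odds:
print(copy_weighted_value(1000, 0.999))  # "nearly certain to win"  -> ~999
print(copy_weighted_value(1000, 1e-6))   # "1 in a million chance"  -> 0.001
```

Since win_probability is not part of the outcome O_i, no function of (O_i, A) reproduces this valuation, which is the point being made above.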