dxu comments on Invulnerable Incomplete Preferences: A Formal Statement

dxu 31 Oct 2023 16:33 UTC
2 points

In your example, DSM permits the agent to end up with either A+ or B. Neither is strictly dominated, and neither has become mandatory for the agent to choose over the other. The agent won’t have reason to push probability mass from one towards the other.

But it sounds like the agent’s initial choice between A and B is forced, yes? (Otherwise, it wouldn’t be the case that the agent is permitted to end up with either A+ or B, but not A.) So the presence of A+ within a particular continuation of the decision tree influences the agent’s choice at the initial node, in a way that causes it to reliably choose one incomparable option over another.
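To make the "forced initial choice" point concrete, here is a minimal sketch. It is not the post's formal definition of DSM; it assumes that the only strict preference is A+ over A, that B is incomparable to both, and that an initial move is permitted only if it can still reach some outcome that no reachable outcome strictly beats. The names `STRICTLY_BETTER`, `TREE`, `undominated`, and `permitted_initial_moves` are my own, chosen for illustration.

```python
# Assumed strict preference relation: A+ is strictly better than A.
# B is incomparable to both A and A+ (no pair involving B appears here).
STRICTLY_BETTER = {("A+", "A")}

# Hypothetical encoding of the tree: initial move -> terminal outcomes reachable after it.
TREE = {
    "take_A": ["A"],
    "take_B": ["B", "A+"],  # B can be kept, or later traded for A+
}

def undominated(outcomes):
    """Outcomes that no other outcome in the set strictly beats."""
    return {x for x in outcomes
            if not any((y, x) in STRICTLY_BETTER for y in outcomes)}

def permitted_initial_moves(tree):
    """Initial moves that can still lead to an undominated terminal outcome."""
    reachable = {o for outs in tree.values() for o in outs}
    allowed = undominated(reachable)
    return {move for move, outs in tree.items() if allowed & set(outs)}

print(undominated({"A", "A+", "B"}))
print(permitted_initial_moves(TREE))
```

Under these assumptions the permitted endpoints are A+ and B, and the only initial move that reaches either of them is the one that takes B, even though A and B are incomparable when considered on their own.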

Further thoughts: under the original framing, instead of choosing between A and B (while knowing that B can later be traded for A+), the agent instead chooses whether to go “up” or “down” to receive (respectively) A, or a further choice between A+ and B. It occurs to me that you might be using this representation to argue for a qualitative difference in the behavior produced, but if so, I’m not sure how much I buy into it.

For concreteness, suppose the agent starts out with A, and notices a series of trades which first involves trading A for B, and then B for A+. It seems to me that if I frame the problem like this, the structure of the resulting tree should be isomorphic to that of the decision problem I described, but not necessarily to that of the “up”/“down” version, at least not if you consider that version to play a key role in DSM’s recommendation.

(In particular, my frame is sensitive to which state the agent is initialized in: if it is given B to start, then it has no particular incentive to want to trade that for either A or A+, and so faces no incentive to trade at all. If you initialize the agent with A or B at random, and institute the rule that it doesn’t trade by default, then the agent will end up with A+ when initialized with A, and B when initialized with B—which feels a little similar to what you said about DSM allowing both A+ and B as permissible options.)
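The initialization-sensitive frame can be sketched as follows. This is a toy model under stated assumptions: the only available trades are A for B and B for A+, the agent holds its current item by default, and it follows a trade sequence only when that sequence ends in something strictly preferred to what it started with. `TRADES`, `reachable`, and `final_holding` are hypothetical names.

```python
# Assumed strict preference relation: only A+ over A; B is incomparable to both.
STRICTLY_BETTER = {("A+", "A")}

# Hypothetical trade graph: the agent can swap A for B, and B for A+.
TRADES = {"A": ["B"], "B": ["A+"], "A+": []}

def reachable(start):
    """All holdings obtainable from `start` through some sequence of trades."""
    seen, stack = set(), [start]
    while stack:
        item = stack.pop()
        for nxt in TRADES[item]:
            if nxt not in seen and nxt != start:
                seen.add(nxt)
                stack.append(nxt)
    return seen

def final_holding(start):
    """No-trade default: move only if some reachable holding strictly beats `start`."""
    better = [x for x in sorted(reachable(start)) if (x, start) in STRICTLY_BETTER]
    return better[0] if better else start

for start in ("A", "B"):
    print(start, "->", final_holding(start))
```

With these assumptions, an agent initialized with A trades through B to reach A+, while an agent initialized with B stays put.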

It sounds like you want to make it so that the agent’s initial state isn’t taken into account—in fact, it sounds like you want to assign values only to terminal nodes in the tree, take the subset of those terminal nodes which have maximal utility within a particular incomparability class, and choose arbitrarily among those. My frame, then, would be equivalent to using the agent’s initial state as a tiebreaker: whichever terminal node shares an incomparability class with the agent’s initial state will be the one the agent chooses to steer towards.

...in which case, assuming I got the above correct, I think I stand by my initial claim that this will lead to behavior which, while not necessarily “trammeling” by your definition, is definitely consequentialist in the worrying sense: an agent initialized in the “shutdown button not pressed” state will perform whatever intermediate steps are needed to navigate to the maximal-utility “shutdown button not pressed” state it can foresee, including actions which prevent the shutdown button from being pressed.