The “damage” from shooting your own foot is defined in terms of the utility number.
Say I pick a dominated strategy that nets me 2 apples while the dominating strategy nets me 3 apples. If, on another level of modelling, I know that the first 2 apples are clean and the 3 apples in the dominating arrangement have worms, I might be happy to be dominated. Apple-level damage is okay (while nutrition-level damage might not be). All deductive results are tautologies, but “if you can’t model the agent as trying to achieve goal X, then it’s inefficient at achieving X” seems very far from “incoherent agents are stupid”.
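To make the two-level point concrete, here is a rough sketch (the numbers and scoring functions are my own, purely illustrative): a strategy can lose when scored by raw apple count and still win when scored by what I actually care about, say nutrition from worm-free apples.

```python
# Toy illustration of the two-level point (names and numbers are invented):
# a strategy dominated on apple count can dominate on nutrition.

def apple_count(apples):
    return len(apples)

def nutrition(apples):
    # Assume, for the sake of the example, that a wormy apple is worthless.
    return sum(1 for a in apples if a == "clean")

dominated_strategy  = ["clean", "clean"]            # 2 clean apples
dominating_strategy = ["wormy", "wormy", "wormy"]   # 3 wormy apples

print(apple_count(dominated_strategy) < apple_count(dominating_strategy))  # True: worse on apples
print(nutrition(dominated_strategy) > nutrition(dominating_strategy))      # True: better on nutrition
```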
If some of the apples are clean and others have worms, then that is modeled in your preference ordering: you prefer clean apples to wormy ones, perhaps at some exchange rate, etc. We then stipulate that all the apples are clean (or all are wormy, or all have an equal chance of being clean vs. wormy, etc.), and the analysis proceeds as before.
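For instance (the exchange rate and numbers here are invented for illustration), folding the clean/wormy distinction into a single ordering might look like this:

```python
# A toy version of "fold the clean/wormy distinction into the preferences"
# (the exchange rate is made up for illustration):

WORMY_RATE = 0.1  # say a wormy apple is worth a tenth of a clean one

def utility(n_clean, n_wormy):
    return n_clean + WORMY_RATE * n_wormy

print(utility(2, 0))  # 2.0 -> two clean apples
print(utility(0, 3))  # 0.3 -> three wormy apples; the richer ordering prefers the former
```

Once the distinction lives inside the ordering, the dominance argument is run against this richer ordering rather than against raw apple count.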
That said, your general point is worth exploring. If we suppose, as Eliezer says, that
Alice … prefers having more fruit to less fruit, ceteris paribus, for each category of fruit
… and if we further suppose that her preferences are intransitive, then we conclude that Alice’s strategy is strictly dominated by some other strategy.
That is—Alice’s strategy is strictly dominated in terms of apples (or fruit in general). It can’t be dominated in utility, of course, because we cannot construct a utility function from Alice’s preferences (on account of their intransitivity)!
Well, and so what? Is this bad according to Alice’s own preferences? Can we show this? How would we do that? By asking Alice whether she prefers the outcome (5 apples and 1 orange) to the initial state (8 apples and 1 orange)? But what good is that? If Alice’s preferences are circular, then it’s entirely possible (in fact, it’s true) that the outcome (5 apples and 1 orange) both dominates, and is dominated by, the initial state (8 apples and 1 orange).
(More accurately, that’s true if we’re permitted to say that if strategy X dominates Y, and Y dominates Z, then X dominates Z. It’s not possible for an agent to prefer X to Y and, simultaneously, Y to X, however intransitive their preferences are, if they still obey the completeness axiom. Of course, if an agent’s preferences are intransitive and incomplete, then it can prefer X to Y, and also Y to X.)
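To spell out that parenthetical with a toy example of my own (not from Eliezer’s post): take a three-bundle preference cycle and close the “is preferred to” relation under transitivity. Both directions of “dominance” then hold between the initial state and the outcome.

```python
from itertools import product

# A cyclic strict-preference relation over three bundles (the middle
# bundle is hypothetical), plus the transitive closure of "is preferred
# to". With the cycle in place, every bundle ends up both "dominating"
# and "dominated by" every other one.

prefers = {
    ("8 apples + 1 orange", "7 apples + 2 oranges"),
    ("7 apples + 2 oranges", "5 apples + 1 orange"),
    ("5 apples + 1 orange", "8 apples + 1 orange"),
}

def transitive_closure(relation):
    closure = set(relation)
    changed = True
    while changed:
        changed = False
        for (a, b), (c, d) in product(list(closure), repeat=2):
            if b == c and (a, d) not in closure:
                closure.add((a, d))
                changed = True
    return closure

dominates = transitive_closure(prefers)

x, y = "8 apples + 1 orange", "5 apples + 1 orange"
print((x, y) in dominates)  # True: the initial state "dominates" the outcome
print((y, x) in dominates)  # True as well: the outcome "dominates" the initial state
```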
The point is this: it’s not so easy to show that an agent’s strategy is sub-optimal according to its own preferences if those preferences violate the axioms. We can gesture at some intuitive considerations like “well, that’s obviously stupid”, but these amount to little more than the fact that we find the violated axioms intuitively attractive in the given case.
I was thinking of another agent judging my strategies and making a well-supported argument for why I am wrong. If someone said “you were suboptimal on the fruit front, I fixed that mistake for you” and I arrived at a table with 2 wormy apples, I would be annoyed/pissed. I am assuming that the other agent can’t evaluate their cleanness; it’s all just fruit to him. Moreover, it might be that wormy apples are rare, and from observing my trading activity it might be inductively well supported that I seem to value “fruit maximisation” a great deal (nutrition maximisation with clean fruit just is fruit maximisation). And it might be important to understand that he didn’t mean to cause wormy apples (he isn’t even capable of meaning that), but his actions might in fact have caused them.
In the case where wormy apples are frequent, the fruit-maximiser hypothesis is violated clearly enough that he knows he is on shaky ground modelling me as a fruit maximiser. Some very unskilled traders might confuse one type of fruit with another and be inconsistent simply because they can’t keep their fruit categories straight. At some middle level of skill, “fruit-maximiser-ness” peaks, and anyone who doesn’t understand things beyond that point will conflate those who have yet to reach fruit maximisation with those who are past it. Expecting superintelligent things to be consistent kind of assumes that if a metric ever becomes a good goal, then higher ability levels will never be weaker on that metric: that maximisation strictly grows and never decreases with ability, for all submetrics.