For a Caprice-Rule-abiding agent to avoid pursuing dominated strategies in single-sweetening money-pumps, that agent must be non-myopic: specifically, it must recognise that trading in A for B and then B for A+ is an available sequence of trades. And you might think that this is where my proposal falls down: actual agents will sometimes be myopic, so actual agents can’t always use the Caprice Rule to avoid pursuing dominated strategies, so actual agents are incentivised to avoid pursuing dominated strategies by instead probabilistically precommitting to take certain trades in ways that make their preferences complete (as you suggest).
That’s almost the counterargument that I’d give, but importantly not quite. The problem with the Caprice Rule is not that the agent needs to be non-myopic, but that the agent needs to know in advance which trades will be available. The agent can be non-myopic—i.e. have a model of future trades and optimize for future state—but still not know which trades it will actually have an opportunity to make. E.g. in the pizza example, when David and I are offered the chance to trade mushroom for anchovy, we don’t yet know whether we’ll have an opportunity to trade anchovy for pepperoni later on.
More general point: I think relying on decision trees as our main model of the agents’ “environment” does not match the real world well, especially when using relatively small/simple trees. It seems to me that things like the Caprice rule are mostly exploiting ways in which decision trees are a poor model of realistic environments.
The assumption that we know in advance which trades will be available is one aspect of the problem, which could in principle be handled by adding random choice nodes to the trees.
Another place where I suspect this is relevant (though I haven’t pinned it down yet): the argument in the post has a corner case when the probability of being offered some trade is zero. In that case, the agent will be indifferent between the completion and its original preferences, because the completion will just add a preference which will never actually be traded upon. I suspect that most of your examples are doing a similar thing—it’s telling that, in all your counterexamples, the agent is indifferent between original preferences and the completion; it doesn’t actively prefer the incomplete preferences. (Unless I’m missing something, in which case please correct me!) That makes me think that the small decision trees implicitly contain a lot of assumptions that various trades have zero probability of happening, which is load-bearing for your counterexamples. In a larger world, with a lot more opportunities to trade between various things, I’d expect that sort of issue to be much less relevant.
The problem with the Caprice Rule is not that the agent needs to be non-myopic, but that the agent needs to know in advance which trades will be available. The agent can be non-myopic—i.e. have a model of future trades and optimize for future state—but still not know which trades it will actually have an opportunity to make.
It’s easy to extend the Caprice Rule to this kind of case. Suppose we have an agent that’s uncertain whether – conditional on trading mushroom (A) for anchovy (B) – it will later have the chance to trade in anchovy (B) for pepperoni (A+). Suppose in its model the probabilities are 50-50.
Then our agent with a model of future trades can consider what it would choose conditional on finding itself in node 2: it can decide with what probability p it would choose A+, with the remaining probability 1-p going to B. Then, since choosing B at node 1 has a 0.5 probability of taking the agent to node 2 and a 0.5 probability of taking the agent to node 3, the agent can regard the choice of B at node 1 as the lottery 0.5p(A+)+(1-0.5p)(B) (since, conditional on choosing B at node 1, the agent will end up with A+ with probability 0.5p and end up with B otherwise).
So for an agent with a model of future trades, the choice at node 1 is a choice between A and 0.5p(A+)+(1-0.5p)(B). What we’ve specified about the agent’s preferences over the outcomes A, B, and A+ doesn’t pin down what its preferences will be between A and 0.5p(A+)+(1-0.5p)(B) but either way the Caprice-Rule-abiding agent will not pursue a dominated strategy. If it strictly prefers one of A and 0.5p(A+)+(1-0.5p)(B) to the other, it will reliably choose its preferred option. If it has no preference, neither choice will constitute a dominated strategy.
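To make the bookkeeping concrete, here’s a minimal sketch in Python (the names are mine, not anything from the discussion) of how an agent folds its predicted node-2 behaviour into the lottery it faces at node 1:

```python
from fractions import Fraction

def lottery_at_node_1(p, chance_to_node_2=Fraction(1, 2)):
    """Lottery over final outcomes from choosing B at node 1, given the agent
    predicts it would pick A+ at node 2 with probability p."""
    p_a_plus = chance_to_node_2 * p  # reach node 2, then trade B for A+
    return {"A+": p_a_plus, "B": 1 - p_a_plus}

# If the agent would always take the sweetening at node 2 (p = 1), then
# choosing B at node 1 is the lottery 0.5(A+) + 0.5(B):
assert lottery_at_node_1(Fraction(1)) == {"A+": Fraction(1, 2), "B": Fraction(1, 2)}
```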
And this point generalises to arbitrarily complex/realistic decision trees, with more choice-nodes, more chance-nodes, and more options. Agents with a model of future trades can use their model to predict what they’d do conditional on reaching each possible choice-node, and then use those predictions to determine the nature of the options available to them at earlier choice-nodes. The agent’s model might be defective in various ways (e.g. by getting some probabilities wrong, or by failing to predict that some sequences of trades will be available) but that won’t spur the agent to change its preferences, because the dilemma from my previous comment recurs: if the agent is aware that some lottery is available, it won’t choose any dispreferred lottery; if the agent is unaware that some lottery is available and chooses a dispreferred lottery, the agent’s lack of awareness means it won’t be spurred by this fact to change its preferences. To get over this dilemma, you still need the ‘non-myopic optimiser deciding the preferences of a myopic agent’ setting, and my previous points apply: results from that setting don’t vindicate coherence arguments, and we humans as non-myopic optimisers could decide to create artificial agents with incomplete preferences.
If it has no preference, neither choice will constitute a dominated strategy.
I think this statement doesn’t make sense. If it has no preference between choices at node 1, then it has some chance of choosing outcome A. But if it does so, then that strategy is dominated by the strategy that always chooses the top branch, and chooses A+ if it can. This is because 50% of the time, it will get a final outcome of A when the dominating strategy gets A+, and otherwise the two strategies give incomparable outcomes.
I’m assuming a strategy is dominated when some other strategy gives a final outcome that is incomparable to or > its own, in the partial order of preferences, for all possible settings of the random variables (and strictly > for at least one setting). Maybe my definition is wrong? But it seems like this is the definition I want.
We say that a strategy is dominated iff it leads to a lottery that is dispreferred to the lottery led to by some other available strategy. So if the lottery 0.5p(A+)+(1-0.5p)(B) isn’t preferred to the lottery A, then the strategy of choosing A isn’t dominated by the strategy of choosing 0.5p(A+)+(1-0.5p)(B). And if 0.5p(A+)+(1-0.5p)(B) is preferred to A, then the Caprice-rule-abiding agent will choose 0.5p(A+)+(1-0.5p)(B).
You might think that agents must prefer lottery 0.5p(A+)+(1-0.5p)(B) to lottery A, for any A, A+, and B and for any p>0. That thought is compatible with my point above. But also, I don’t think the thought is true:
Think about your own preferences.
Let A be some career as an accountant, A+ be that career as an accountant with an extra $1 salary, and B be some career as a musician. Let p be small. Then you might reasonably lack a preference between 0.5p(A+)+(1-0.5p)(B) and A. That’s not instrumentally irrational.
Think about incomplete preferences on the model of imprecise exchange rates.
Here’s a simple example of the IER model. You care about two things: love and money. Each career gets a real-valued love score and a real-valued money score. Your exchange rate for love and money is imprecise, running from 0.4 to 0.6. On one proto-exchange-rate, love gets a weight of 0.4 and money gets a weight of 0.6; on another proto-exchange-rate, love gets a weight of 0.6 and money gets a weight of 0.4. You weakly prefer one career to another iff it gets at least as high an overall score on both proto-exchange-rates. If one career gets a higher score on one proto-exchange-rate and the other gets a higher score on the other proto-exchange-rate, you have a preferential gap between the two careers. Let A’s <love, money> score be <0, 10>, A+’s score be <0, 11>, and B’s score be <10, 0>. A+ is preferred to A, because 0.4(0)+0.6(11) is greater than 0.4(0)+0.6(10), and 0.6(0)+0.4(11) is greater than 0.6(0)+0.4(10). But the agent lacks a preference between A+ and B, because 0.4(0)+0.6(11) is greater than 0.4(10)+0.6(0), while 0.6(0)+0.4(11) is less than 0.6(10)+0.4(0). And the agent lacks a preference between A and B for the same sort of reason.
To keep things simple, let p=0.2, so your choice is between 0.1(A+)+0.9(B) and A. The expected <love, money> score of the former is <9, 1.1>. The expected <love, money> score of the latter is <0, 10>. You lack a preference between them, because 0.6(9)+0.4(1.1) is greater than 0.6(0)+0.4(10), and 0.4(0)+0.6(10) is greater than 0.4(9)+0.6(1.1).
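Here’s a quick sketch of that arithmetic (Python; the function names are mine, and I’m evaluating the lottery by its expected <love, money> scores, as above):

```python
RATES = [(0.4, 0.6), (0.6, 0.4)]  # the two proto-exchange-rates (love, money weights)

def overall(score, rate):
    love, money = score
    w_love, w_money = rate
    return w_love * love + w_money * money

def weakly_prefers(x, y):
    # weak preference: at least as high an overall score on both proto-exchange-rates
    return all(overall(x, r) >= overall(y, r) for r in RATES)

def pref_gap(x, y):
    return not weakly_prefers(x, y) and not weakly_prefers(y, x)

A, A_plus, B = (0, 10), (0, 11), (10, 0)  # <love, money> scores
assert weakly_prefers(A_plus, A) and not weakly_prefers(A, A_plus)  # A+ preferred to A
assert pref_gap(A_plus, B) and pref_gap(A, B)                       # gaps with B

mix = (0.1 * 0 + 0.9 * 10, 0.1 * 11 + 0.9 * 0)  # 0.1(A+)+0.9(B) scores <9, 1.1>
assert pref_gap(mix, A)  # no preference either way, as claimed
```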
The general principle that you appeal to (If X is weakly preferred to or pref-gapped with Y in every state of nature, and X is strictly preferred to Y in some state of nature, then the agent must prefer X to Y) implies that rational preferences can be cyclic. B must be preferred to p(B-)+(1-p)(A+), which must be preferred to A, which must be preferred to p(A-)+(1-p)(B+), which must be preferred to B.
It seems we define dominance differently. I believe I’m defining it in a similar way to “uniformly better” here. [Edit: previously I put a screenshot from that paper in this comment, but translating from there adds a lot of potential for miscommunication, so I’m replacing it with my own explanation in the next paragraph, which is more tailored to this context.]
A strategy outputs a decision, given a decision tree with random nodes. With a strategy plus a record of the outcomes of all the random nodes, we can work out the final outcome reached by that strategy (assuming for now that the strategy is deterministic). Let’s write this as Outcome(strategy, environment_random_seed). Now I think that we should consider a strategy s to dominate another strategy s* if, for all possible environment_random_seeds, Outcome(s, seed) ≥ Outcome(s*, seed), and for some seed*, Outcome(s, seed*) > Outcome(s*, seed*). (We can extend this to stochastic strategies, but I want to avoid that unless you think it’s necessary, because it will reduce clarity.)
In other words, a strategy is better if it always turns out to do “equally” well or better than the other strategy, no matter the state of nature. By this definition, a strategy that chooses A at the first node will be dominated.
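For concreteness, here’s a direct transcription of that definition as a check (a sketch in Python; `outcome` and `weakly_prefers` are stand-ins for the decision tree and the preference partial order):

```python
def dominates(s, s_star, seeds, outcome, weakly_prefers):
    """s dominates s* iff Outcome(s, seed) >= Outcome(s*, seed) for every seed,
    and Outcome(s, seed*) > Outcome(s*, seed*) for at least one seed*
    (deterministic strategies only)."""
    strict_somewhere = False
    for seed in seeds:
        x, y = outcome(s, seed), outcome(s_star, seed)
        if not weakly_prefers(x, y):
            return False  # some seed where s's outcome is not >= s*'s
        if not weakly_prefers(y, x):
            strict_somewhere = True  # x >= y but not y >= x, i.e. strictly better
    return strict_somewhere
```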
Relating this to your response:
We say that a strategy is dominated iff it leads to a lottery that is dispreferred to the lottery led to by some other available strategy. So if the lottery 0.5p(A+)+(1-0.5p)(B) isn’t preferred to the lottery A, then the strategy of choosing A isn’t dominated by the strategy of choosing 0.5p(A+)+(1-0.5p)(B). And if 0.5p(A+)+(1-0.5p)(B) is preferred to A, then the Caprice-rule-abiding agent will choose 0.5p(A+)+(1-0.5p)(B).
I don’t like that you’ve created a new lottery at the chance node, cutting off the rest of the decision tree from there. The new lottery wasn’t in the initial preferences. The decision about whether to go to that chance node should be derived from the final outcomes, not from some newly created terminal preference about that chance node. Your dominance definition depends on this newly created terminal preference, which isn’t a definition that is relevant to what I’m interested in.
I’ll try to back up and summarize my motivation, because I expect any disagreement is coming from there. My understanding of the point of the decision tree is that it represents the possible paths to get to a final outcome. We have some preference partial order over final outcomes. We have some way of ranking strategies (dominance). What we want out of this is to derive results about the decisions the agent must make in the intermediate stage, before getting to a final outcome.
If it has arbitrary preferences about non-final states, then its behavior is entirely unconstrained and we cannot derive any results about its decisions in the intermediate state.
So we should only use a definition of dominance that depends on final outcomes. Then any strategy that doesn’t always choose B at decision node 1 will be dominated by a strategy that does, according to the original preference partial order.
(I’ll respond to the other parts of your response in another comment, because it seems important to keep the central crux debate in one thread without cluttering it with side-tracks).
Things are confusing because there are lots of different dominance relations that people talk about. There’s a dominance relation on strategies, and there are (multiple) dominance relations on lotteries.
Here are the definitions I’m working with.
A strategy is a plan about which options to pick at each choice-node in a decision-tree.
Strategies yield lotteries (rather than final outcomes) when the plan involves passing through a chance-node. For example, consider the decision-tree below:
A strategy specifies what option the agent would pick at choice-node 1, what option the agent would pick at choice-node 2, and what option the agent would pick at choice-node 3.
Suppose that the agent’s strategy is {Pick B at choice-node 1, Pick A+ at choice-node 2, Pick B at choice-node 3}. This strategy doesn’t yield a final outcome, because the agent doesn’t get to decide what happens at the chance-node. Instead, the strategy yields the lottery 0.5(A+)+0.5(B). This just says that: if the agent executes the strategy, then there’s a 0.5 probability that they end up with final outcome A+ and a 0.5 probability that they end up with final outcome B.
The dominance relation on strategies has to refer to the lotteries yielded by strategies, rather than the final outcomes yielded by strategies, because strategies don’t yield final outcomes when the agent passes through a chance-node.[1] So we define the dominance relation on strategies as follows:
Strategy Dominance (relation)
A strategy S is dominated by a strategy S’ iff S yields a lottery X that is strictly dispreferred to the lottery X’ yielded by S’.
Now for the dominance relations on lotteries.[2] One is:
Statewise Dominance (relation)
Lottery X statewise-dominates lottery Y iff, in each state [environment_random_seed], X yields a final outcome weakly preferred to the final outcome yielded by Y, and in some state [environment_random_seed], X yields a final outcome strictly preferred to the final outcome yielded by Y.
Another is:
Statewise Pseudodominance (relation)
Lottery X statewise-pseudodominates lottery Y iff, in each state [environment_random_seed], X yields a final outcome weakly preferred to or pref-gapped with the final outcome yielded by Y, and in some state [environment_random_seed], X yields a final outcome strictly preferred to the final outcome yielded by Y.
The lottery A (that yields final outcome A for sure) is statewise-pseudodominated by the lottery 0.5(A+)+0.5(B), but it isn’t statewise-dominated by 0.5(A+)+0.5(B). That’s because the agent has a preferential gap between the final outcomes A and B.
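To make the contrast concrete, here’s a minimal sketch (Python; the names are mine) with each lottery represented as a map from states to final outcomes:

```python
STRICT = {("A+", "A")}  # A+ strictly preferred to A; B pref-gapped with both

def weakly_prefers(x, y):
    return x == y or (x, y) in STRICT

def strictly_prefers(x, y):
    return weakly_prefers(x, y) and not weakly_prefers(y, x)

def pref_gapped(x, y):
    return not weakly_prefers(x, y) and not weakly_prefers(y, x)

def statewise_dominates(X, Y):
    return (all(weakly_prefers(X[s], Y[s]) for s in X)
            and any(strictly_prefers(X[s], Y[s]) for s in X))

def statewise_pseudodominates(X, Y):
    return (all(weakly_prefers(X[s], Y[s]) or pref_gapped(X[s], Y[s]) for s in X)
            and any(strictly_prefers(X[s], Y[s]) for s in X))

X = {"heads": "A+", "tails": "B"}  # the lottery 0.5(A+)+0.5(B)
Y = {"heads": "A", "tails": "A"}   # the lottery A for sure
assert statewise_pseudodominates(X, Y) and not statewise_dominates(X, Y)
```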
Advanced agents with incomplete preferences over final outcomes will plausibly satisfy the Statewise Dominance Principle:
Statewise Dominance Principle
If lottery X statewise-dominates lottery Y, then the agent strictly prefers X to Y.
And that’s because agents that violate the Statewise Dominance Principle are ‘shooting themselves in the foot’ in the relevant sense. If the agent executes a strategy that yields a statewise-dominated lottery, then there’s another available strategy that—in each state—gives a final outcome that is at least as good in every respect that the agent cares about, and—in some state—gives a final outcome that is better in some respect that the agent cares about.
But advanced agents with incomplete preferences over final outcomes plausibly won’t satisfy the Statewise Pseudodominance Principle:
Statewise Pseudodominance Principle
If lottery X statewise-pseudodominates lottery Y, then the agent strictly prefers X to Y.
And that’s for the reasons that I gave in my comment above. Condensing:
A statewise-pseudodominated lottery can be such that, in some state, that lottery is better than all other available lotteries in some respect that the agent cares about.
The statewise pseudodominance relation is cyclic, so the Statewise Pseudodominance Principle would lead to cyclic preferences.
You say:
The decision about whether to go to that chance node should be derived from the final outcomes, not from some newly created terminal preference about that chance node.
But:
- The decision can also depend on the probabilities of those final outcomes.
- The decision is constrained by preferences over final outcomes and probabilities of those final outcomes. I’m supposing that the agent’s preferences over lotteries depend only on these lotteries’ possible final outcomes and their probabilities. I’m not supposing that the agent has newly created terminal preferences/arbitrary preferences about non-final states.
There are stochastic versions of each of these relations, which ignore how states line up across lotteries and instead talk about probabilities of outcomes. I think everything I say below is also true for the stochastic versions.
[Edit: I think I misinterpreted EJT in a way that invalidates some of this comment, see downthread comment clarifying this].
That is really helpful, thanks. I had been making a mistake, in that I thought that there was an argument from just “the agent thinks it’s possible the agent will run into a money pump” that concluded “the agent should complete that preference in advance”. But I was thinking sloppily and accidentally sometimes equivocating between pref-gaps and indifference. So I don’t think this argument works by itself, but I think it might be made to work with an additional assumption.
One intuition that I find convincing is that if I found myself at outcome A in the single sweetening money pump, I would regret having not made it to A+. This intuition seems to hold even if I imagine A and B to be of incomparable value.
In order to avoid this regret, I would try to become the sort of agent that never found itself in that position. I can see that if I always follow the Caprice rule, then it’s a little weird to regret not getting A+, because that isn’t a counterfactually available option (counterfacting on decision 1). But this feels like I’m being cheated. I think the reason that it feels like I’m being cheated is that I feel like getting to A+ should be a counterfactually available option.
One way to make it a counterfactually available option in the thought experiment is to introduce another choice before choice 1 in the decision tree. The new choice (0) is the choice about whether to maintain the same decision algorithm (call this incomplete), or to complete the preferential gap between A and B (call this complete).
I think the choice complete statewise dominates incomplete. This is because the choice incomplete results in the lottery {B: q(1−p), A+: qp, A: (1−q)} for q<1.[1] However, the choice complete results in the lottery {B: (1−p), A+: p, A: 0}.
Do you disagree with this? I think this allows us to create a money pump, by charging the agent $ϵ for the option to complete its own preferences.
The statewise pseudodominance relation is cyclic, so the Statewise Pseudodominance Principle would lead to cyclic preferences.
This still seems wrong to me, because I see lotteries as objects whose purpose is to summarize random variables and outcomes. So it’s weird to compare lotteries that depend on the same random variables (they are correlated) as if they are independent. This seems like a sidetrack though, and it’s plausible to me that I’m just confused about your definitions here.
Letting p be the probability that the agent chooses A+ at decision 2, and (1−p) the probability the agent chooses B there (following your comment above). And q is defined similarly, for choice 1.
I made a mistake again. As described above, complete only pseudodominates incomplete.
But this is easily patched with the trick described in the OP. So we need the choice complete to make two changes to the downstream decisions. First, change decision 1 to always choose up (as before); second, change the distribution of decision 2 to {A+: 1−q(1−p), B: q(1−p)}, because this keeps the probability of B constant. Fixed diagram:
Now the lottery for complete is {B: q(1−p), A+: 1−q(1−p), A:0}, and the lottery for incomplete is {B: q(1−p), A+: pq, A:(1−q)}. So overall, there is a pure shift of probability from A to A+. [Edit 23/7: hilariously, I still had the probabilities wrong, so fixed them, again].
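A quick symbolic check of that pure-shift claim (a sketch, assuming sympy is available):

```python
import sympy as sp

p, q = sp.symbols("p q")

complete   = {"B": q * (1 - p), "A+": 1 - q * (1 - p), "A": 0}
incomplete = {"B": q * (1 - p), "A+": p * q,           "A": 1 - q}

assert sp.simplify(complete["B"] - incomplete["B"]) == 0              # P(B) unchanged
assert sp.simplify(complete["A+"] - incomplete["A+"] - (1 - q)) == 0  # A+ gains 1-q
assert sp.simplify(incomplete["A"] - complete["A"] - (1 - q)) == 0    # A loses 1-q
```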
I think the above money pump works, if the agent sometimes chooses the A path, but I was incorrect in thinking that the caprice rule sometimes chooses the A path.
I misinterpreted one of EJT’s comments as saying it might choose the A path. The last couple of days I’ve been reading through some of the sources he linked to in the original “there are no coherence theorems” post and one of them (Gustafsson) made me realize I was interpreting him incorrectly, by simplifying the decision tree in a way that doesn’t make sense. I only realized this yesterday.
Now I think that the caprice rule is essentially equivalent to updatelessness. If I understand correctly, it would be equivalent to 1. choosing the best policy by ranking policies in the partial order over outcomes (randomizing over multiple maxima), then 2. implementing that policy without further consideration. And this makes it immune to money pumps and renders any self-modification pointless. It also makes it behaviorally indistinguishable from an agent with complete preferences, as far as I can tell. The same updatelessness trick seems to apply to all money pump arguments. It’s what Scott uses in this post to avoid the independence money pump.
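If I’m reading that right, step 1 might be sketched like this (Python; `lottery_of` and `strictly_prefers` stand in for evaluating the tree and for the partial order over lotteries):

```python
import random

def choose_policy(policies, lottery_of, strictly_prefers):
    """Step 1: keep the policies whose lotteries no alternative's lottery is
    strictly preferred to (the maxima of the partial order), then randomize
    among those maxima."""
    maximal = [s for s in policies
               if not any(strictly_prefers(lottery_of(t), lottery_of(s))
                          for t in policies)]
    return random.choice(maximal)

# Step 2 would then be: implement the chosen policy without reconsideration.
```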
So currently I’m thinking updatelessness removes most of the justification for the VNM axioms (including transitivity!). But I’m confused because updateless policies still must satisfy local properties like “doesn’t waste resources unless it helps achieve the goal”, which is intuitively what the money pump arguments represent. So there must be some way to recover properties like this. Maybe via John’s approach here. But I’m only maybe 80% sure of my new understanding, I’m still trying to work through it all.
It looks to me like the “updatelessness trick” you describe (essentially, behaving as though certain non-local branches of the decision tree are still counterfactually relevant even though they are not — although note that I currently don’t see an obvious way to use that to avoid the usual money pump against intransitivity) recovers most of the behavior we’d see under VNM anyway; and so I don’t think I understand your confusion re: VNM axioms.
E.g. can you give me a case in which (a) we have an agent that exhibits preferences against whose naive implementation there exists some kind of money pump (not necessarily a repeatable one), (b) the agent can implement the updatelessness trick in order to avoid the money pump without modifying their preferences, and yet (c) the agent is not then representable as having modified their preferences in the relevant way?
Good point.
What I meant by “updatelessness removes most of the justification” is the reason given here at the very beginning of “Against Resolute Choice”. In order to make a money pump that leads the agent in a circle, the agent has to continue accepting trades around a full preference loop. But if it has decided on the entire plan beforehand, it will just do any plan that involves <1 trip around the preference loop. (Although it’s unclear how it would settle on such a plan; maybe just stopping its search after a given time.) It won’t (I think?) choose any plan that does multiple loops, because they are strictly worse.
After choosing this plan though, I think it is representable as VNM rational, as you say. And I’m not sure what to do with this. It does seem important.
However, I think Scott’s argument here satisfies (a) (b) and (c). I think the independence axiom might be special in this respect, because the money pump for independence is exploiting an update on new information.
I don’t think agents that avoid the money pump for cyclicity are representable as satisfying VNM, at least holding fixed the objects of preference (as we should). Resolute choosers with cyclic preferences will reliably choose B over A- at node 3, but they’ll reliably choose A- over B if choosing between these options ex nihilo. That’s not VNM representable, because it requires that the utility of A- be greater than the utility of B and that the utility of B be greater than the utility of A-.
It also makes it behaviorally indistinguishable from an agent with complete preferences, as far as I can tell.
That’s not right. As I say in another comment:
And an agent abiding by the Caprice Rule can’t be represented as maximising utility, because its preferences are incomplete. In cases where the available trades aren’t arranged in some way that constitutes a money-pump, the agent can prefer (/reliably choose) A+ over A, and yet lack any preference between (/stochastically choose between) A+ and B, and lack any preference between (/stochastically choose between) A and B. Those patterns of preference/behaviour are allowed by the Caprice Rule.
Or consider another example. The agent trades A for B, then B for A, then declines to trade A for B+. That’s compatible with the Caprice rule, but not with complete preferences.
Or consider the pattern of behaviour that (I elsewhere argue) can make agents with incomplete preferences shutdownable. Agents abiding by the Caprice rule can refuse to pay costs to shift probability mass between A and B, and refuse to pay costs to shift probability mass between A and B+. Agents with complete preferences can’t do that.
The same updatelessness trick seems to apply to all money pump arguments.
[I’m going to use the phrase ‘resolute choice’ rather than ‘updatelessness.’ That seems like a more informative and less misleading description of the relevant phenomenon: making a plan and sticking to it. You can stick to a plan even if you update your beliefs. Also, in the posts on UDT, ‘updatelessness’ seems to refer to something importantly distinct from just making a plan and sticking to it.]
That’s right, but the drawbacks of resolute choice depend on the money pump to which you apply it. As Gustafsson notes, if an agent uses resolute choice to avoid the money pump for cyclic preferences, that agent has to choose against their strict preferences at some point. For example, they have to choose B at node 3 in the money pump below, even though—were they facing that choice ex nihilo—they’d prefer to choose A-.
There’s no such drawback for agents with incomplete preferences using resolute choice. As I note in this post, agents with incomplete preferences using resolute choice need never choose against their strict preferences. The agent’s past plan only has to serve as a tiebreaker: forcing a particular choice between options between which they’d otherwise lack a preference. For example, they have to choose B at node 2 in the money pump below. Were they facing that choice ex nihilo, they’d lack a preference between B and A-.
(sidetrack comment, this is not the main argument thread)
Think about your own preferences.
Let A be some career as an accountant, A+ be that career as an accountant with an extra $1 salary, and B be some career as a musician. Let p be small. Then you might reasonably lack a preference between 0.5p(A+)+(1-0.5p)(B) and A. That’s not instrumentally irrational.
I find this example unconvincing, because any agent that has finite precision in their preference representation will have preferences that are a tiny bit incomplete in this manner. As such, a version of myself that could more precisely represent the value-to-me of different options would be uniformly better than myself, by my own preferences. But the cost is small here. The amount of money I’m leaving on the table is usually small, relative to the price of representing and computing more fine-grained preferences.
I think it’s really important to recognize the places where toy models can only approximately reflect reality, and this is one of them. But it doesn’t reduce the force of the dominance argument. The fact that humans (or any bounded agent) can’t have exactly complete preferences doesn’t mean that it’s impossible for them to be better by their own lights.
I appreciate you writing out this more concrete example, but that’s not where the disagreement lies. I understand partially ordered preferences. I didn’t read the paper though. I think it’s great to study or build agents with partially ordered preferences, if it helps get other useful properties. It just seems to me that they will inherently leave money on the table. In some situations this is well worth it, so that’s fine.
The general principle that you appeal to (If X is weakly preferred to or pref-gapped with Y in every state of nature, and X is strictly preferred to Y in some state of nature, then the agent must prefer X to Y) implies that rational preferences can be cyclic. B must be preferred to p(B-)+(1-p)(A+), which must be preferred to A, which must be preferred to p(A-)+(1-p)(B+), which must be preferred to B.
No, hopefully the definition in my other comment makes this clear. I believe you’re switching the state of nature for each comparison, in order to construct this cycle.
There could be agents that only have incomplete preferences because they haven’t bothered to figure out the correct completion. But there could also be agents with incomplete preferences for which there is no correct completion. The question is whether these agents are pressured by money-pump arguments to settle on some completion.
I understand partially ordered preferences.
Yes, apologies. I wrote that explanation in the spirit of ‘You probably understand this, but just in case...’. I find it useful to give a fair bit of background context, partly to jog my own memory, partly as a just-in-case, partly in case I want to link comments to people in future.
I believe you’re switching the state of nature for each comparison, in order to construct this cycle.
I don’t think this is true. You can line up states of nature in any way you like.