It seems like you deliberately picked completeness because that’s where Dutch book arguments are least compelling, and that you’d agree with the more usual Dutch book arguments.
But I think even the Dutch book for completeness makes some sense. You just have to separate “how the agent internally represents its preferences” from “what it looks like the agent is doing.” You describe an agent that dodges the money-pump by simply acting consistently with past choices. Internally this agent has an incomplete representation of preferences, plus a memory. But externally it looks like this agent is acting like it assigns equal value to whatever indifferent things it thought of choosing between first. If humans don’t get to control the order this agent considers options, or if we let it run for a long time and it’s already experienced the things humans might try to present to it from then on, then it will look like it’s acting according to complete preferences.
Great points. Thinking about these kinds of worries is my next project, and I’m still trying to figure out my view on them.
I don’t know if you’re still working on this, but if you don’t already know the literature on choice-supportive bias and similar processes that occur in humans, they look to me a lot like heuristics that probably harden a human agent into being “more coherent” over time (especially in proximity to other ways of updating value estimation processes), and likely have an adaptive role in improving (regularizing?) instrumental value estimates.
Your essay seemed consistent with the claim that “in the past, as verifiable by substantial scholarship, no one ever proved exactly X”, but as far as I noticed it never actually showed “X is provably false”?
And, indeed, maybe you can prove it one way or the other for some X, where X might be (as you seem to claim) “naive coherence is impossible” or maybe where some X’ or X″ are “sophisticated coherence is approached by algorithm L as t goes to infinity” (or whatever)?
For my money, the thing to do here might be to focus on Value-of-Information, since VoI seems to me like a super super super important concept, and potentially a way to bridge questions of choice and knowledge and costly information gathering actions.
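For readers who haven’t met VoI before, here is a rough numerical illustration of the idea the comment is pointing at (the payoffs and probabilities below are invented for the example): the value of information is the gap between the best you can do after a costless, perfect observation and the best you can do acting on the prior alone.

```python
# Toy Value-of-Information calculation with invented numbers:
# how much should an agent pay for a perfect rain forecast?

p_rain = 0.3
payoff = {  # payoff[action][state]
    "umbrella":    {"rain": 1.0, "dry": 0.2},
    "no_umbrella": {"rain": 0.0, "dry": 1.0},
}

def expected(action, p):
    """Expected payoff of an action given probability p of rain."""
    return p * payoff[action]["rain"] + (1 - p) * payoff[action]["dry"]

# Best expected payoff acting on the prior alone.
ev_without = max(expected(a, p_rain) for a in payoff)

# With a perfect forecast, pick the best action in each state,
# weighting states by their prior probability.
ev_with = (p_rain * max(payoff[a]["rain"] for a in payoff)
           + (1 - p_rain) * max(payoff[a]["dry"] for a in payoff))

voi = ev_with - ev_without  # the most the forecast is worth paying for
print(round(voi, 3))  # -> 0.3
```

Here the forecast is worth up to 0.3 units: without it, the agent’s best bet is skipping the umbrella (expected 0.7), while perfect information yields 1.0 in every state.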
Thanks! I’ll have a think about choice-supportive bias and how it applies.
I think it is provably false that any agent not representable as an expected-utility-maximizer is liable to pursue dominated strategies. Agents with incomplete preferences aren’t representable as expected-utility-maximizers, and they can make themselves immune from pursuing dominated strategies by acting in accordance with the following policy: ‘if I previously turned down some option X, I will not choose any option that I strictly disprefer to X.’
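The quoted policy is concrete enough to sketch in code. Below is a toy model of the single-souring money pump; the option names and the single strict preference are my own illustrative setup, not taken from the post.

```python
# Toy model of the memory-based policy quoted above: "if I previously
# turned down some option X, I will not choose any option that I
# strictly disprefer to X."

# Strict preferences as (better, worse) pairs: A- is a soured version
# of A; B is incomparable to both A and A-.
STRICT_PREFS = {("A", "A-")}

def strictly_worse(option, other):
    """True if `option` is strictly dispreferred to `other`."""
    return (other, option) in STRICT_PREFS

def allowed(option, turned_down):
    """The policy: never choose an option strictly dispreferred to
    anything previously turned down."""
    return not any(strictly_worse(option, x) for x in turned_down)

# Single-souring money pump. Node 1: take A or continue.
# Node 2: take B or A-.
turned_down = set()

# Suppose the agent passes on A at node 1 (permitted: B is
# incomparable to A, so this violates no strict preference).
turned_down.add("A")

# At node 2 the policy forbids A- (strictly worse than the turned-down
# A), so the agent takes B and never holds the dominated outcome A-.
node2_choices = [o for o in ("B", "A-") if allowed(o, turned_down)]
print(node2_choices)  # -> ['B']
```

Note the resource the policy quietly consumes: the `turned_down` set must persist across choices, which is the memory cost raised later in the thread.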
I don’t know about you, but I’m actually OK with dithering a bit, going in circles, and doing things where mere entropy can “make me notice regret based on syntactically detectable behavioral signs” (that is, without even active adversarial optimization pressure, like the kind somewhat inevitably generated in predator-prey contexts).
For example, in my twenties I formed an intent, and managed to adhere to the habit somewhat often, of flipping a coin any time I noticed a decision where the cost of thinking it through explicitly was probably larger than the difference in value between the likely outcomes.
(Sometimes I flipped coins and then ignored the coin if I noticed I was sad with that result, as a way to cheaply generate that mental state of having an intuitive internally accessible preference without having to put things into words or do math. When I noticed that that stopped working very well, I switched to flipping a coin, then “if regret, flip again, and follow my head on heads, and follow the first coin on tails”. The double-flipping protocol seemed to help make ALL the first coins have “enough weight” for me to care about them sometimes, even when I always then stopped for a second to see if I was happy or sad or bored by the first coin flip. And of course I do such things much much much less now, and lately have begun to consider taking a personal vow to refuse to randomize, except towards enemies, for an experimental period of time.)
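The double-flip protocol is described only informally, so the sketch below is one reading of it (the function name and exact semantics are my interpretation): flip once; if the result triggers regret, flip again, following one’s own judgment on heads and the first coin on tails.

```python
# One possible reading of the double-flip protocol described above.
import random

def double_flip(heads_option, tails_option, regrets, own_judgment, rng=None):
    """regrets: predicate over the first result; own_judgment: fallback
    deliberate choice. Both are supplied by the caller."""
    rng = rng or random.Random()
    first = heads_option if rng.random() < 0.5 else tails_option
    if not regrets(first):
        return first  # no regret signal, the first coin stands
    # Regret detected: flip again. Heads -> deliberate judgment,
    # tails -> stick with the first coin anyway.
    second_is_heads = rng.random() < 0.5
    return own_judgment() if second_is_heads else first
```

The design point, as I read it, is that the second flip keeps the first coin binding often enough that its results stay motivationally “real,” while still letting a regret signal surface cheap preference information.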
The plans and the hopes here sort of naturally rely on “getting better at preferring things wisely over time”!
And the strategy relies pretty critically on having enough MEMORY to hope to build up data on various examples of different ways that similar situations went in the past, so as to learn from mistakes and thereby rarely “lack a velleity”, and to reduce the rate at which I justifiably regret past velleities or choices.
And a core reason I think that sentience and sapience reliably convergently evolve is almost exactly “to store and process memories to enable learning (including especially contextually sensitive instrumental preference learning) inside a single lifetime”.
(Credit assignment and citation note: Christof Koch was the first researcher who I heard explain surprising experimental data suggesting that “minds are common”, with stuff like bee learning, and eventually I stopped being surprised when I heard about low-key bee numeracy or another crazy mental power of cuttlefish. I didn’t know about EITHER of the experiments I just linked to, but I paused to find “the kinds of things one finds here if one actually looks”. That I found such links, for me, was a teensy surprise, and slightly contributes to my posterior belief in a claim roughly like “something like ‘coherence’ is convergently useful and is what our minds were originally built, by evolution, to approximately efficiently implement”.)
Basically, I encourage you, when you go try to prove that “Agents with incomplete preferences can make themselves immune from pursuing dominated strategies by following plan P” to consider the resource costs of those plans (like the cost in memory) and to ask whether those resources are being used optimally, or whether a different use of them could get better results faster.
Also… I expect that the proofs you attempt might actually succeed if you have “agents in isolation” or “agents surrounded only by agents that respect property rights” but to fail if you consider the case of adversarial action space selection in an environment of more than one agent (like where wolves seek undominated strategies for eating sheep, and no sheep is able to simply ‘turn down’ the option of being eaten by an arbitrarily smart wolf without itself doing something clever and potentially memory-or-VNM-perfection-demanding).
I do NOT think you will prove “in full generality, nothing like coherence is pragmatically necessary to avoid Dutch booking”, but I grant that I’m not sure about this! I have noticed from experience that my mathematical intuitions are actually fallible. That’s why real math is worth the elbow grease! &lt;3
That separation between internal preferences and external behaviour is already implicit in Dutch books. Decision theory is about external behaviour, not internal representations. It talks about what agents do, not how agents work inside. Within decision theory, a preference is a matter of what the system does or does not do in a given situation. When decision theorists talk about someone preferring pizza without pineapple, it’s about that person paying money to not have pineapple on their pizza in some range of situations, not some definition related to computations about pineapples and pizzas in that person’s brain.
Making a similar point from a different angle:
The OP claims that the policy “if I previously turned down some option X, I will not choose any option that I strictly disprefer to X” escapes the money pump but “never requires them to change or act against their preferences”.
But it’s not clear to me what conceptual difference there is supposed to be between “I will modify my action policy to hereafter always choose B over A-” and “I will modify my preferences to strictly prefer B over A-, removing the preference gap and bringing my preferences closer to completeness”.
Ah yep, apologies, I meant to say “never requires them to change or act against their strict preferences.”
Whether there’s a conceptual difference will depend on our definition of ‘preference.’ We could define ‘preference’ as follows: an agent prefers X to Y iff the agent reliably chooses X over Y. In that case, modifying the policy is equivalent to forming a preference.
But we could also define ‘preference’ so that it requires more than just reliable choosing. For example, we might also require that (when choosing between lotteries) the agent always take opportunities to shift probability mass away from Y and towards X.
On the latter definition, modifying the policy need not be equivalent to forming a preference, because it involves only the reliable choosing and not the shifting of probability mass.
And the latter definition might be more pertinent in this context, where our interest is in whether agents will be expected utility maximizers.
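The gap between the two definitions can be made concrete with a small sketch (the agent and its policies below are invented for illustration, not taken from the post): a policy can reliably choose B over A- in pure binary choices while never going out of its way to move probability mass toward B among lotteries.

```python
# Invented agent separating the two definitions of 'preference' above.

def choose_pure(options):
    """In a direct choice, always take B over A- (definition 1 holds)."""
    return "B" if "B" in options else options[0]

def choose_lottery(lotteries):
    """Among B/A- lotteries, just keep the first one offered, i.e.
    never shift probability mass toward B (definition 2 fails)."""
    return lotteries[0]

# Definition 1 (reliable choosing) is satisfied:
assert choose_pure(["A-", "B"]) == "B"

# Definition 2 (mass-shifting) is not: offered first a lottery with
# less mass on B, the agent keeps it rather than switching.
offered = [{"B": 0.4, "A-": 0.6}, {"B": 0.6, "A-": 0.4}]
picked = choose_lottery(offered)
print(picked["B"])  # -> 0.4: mass was not shifted toward B
```

This is exactly the wedge the next comment pokes at: a lottery-blind policy like `choose_lottery` is where a probabilistic money pump would look for leverage.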
But also, even if we go with the former definition, I think it matters a lot whether money-pumps compel rational agents to complete all their preferences up front, or whether money-pumps just compel agents to resolve preferential gaps over time, conditional on them coming to face choices that are arranged like a money-pump (and only completing their preferences if and once they’ve faced a sufficiently diverse range of choices). In particular, I think it matters in the context of the shutdown problem. I talk a bit more about this here.
If it doesn’t move probability mass, won’t it still be vulnerable to probabilistic money pumps? E.g., in the single-souring pump, you could just replace the choice between A- and B with a choice between two lotteries that have different mixtures of A- and B.
I have also left a reply to the comment you linked.
“You describe an agent that dodges the money-pump by simply acting consistently with past choices. Internally this agent has an incomplete representation of preferences, plus a memory. But externally it looks like this agent is acting like it assigns equal value to whatever indifferent things it thought of choosing between first.”

Not sure I follow this / agree. Seems to me that in the “Single-Souring Money Pump” case:
If the agent systematically goes down at node 1, all we learn is that the agent doesn’t strictly prefer [B or A-] to A.
If the agent systematically goes up at node 1 and down at node 2, all we learn is that the agent doesn’t strictly prefer [A or A-] to B.
So this doesn’t tell us what the agent would do if they were faced with just a choice between A and B, or A- and B. We can’t conclude “equal value” here.
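The underdetermination argument above can be sketched as a short enumeration (the relation labels are my own shorthand): a single systematic path through the pump leaves several candidate A-vs-B relations standing, so observed behaviour does not pin down “equal value.”

```python
# A single observed path through the single-souring money pump
# underdetermines the agent's pairwise preferences.

candidates = [
    "A > B",
    "B > A",
    "A ~ B (indifference)",
    "A ? B (preference gap)",
]

def consistent_with_taking_A(relation):
    # Going down at node 1 (taking A) only rules out strictly
    # preferring B to A; everything else remains possible.
    return relation != "B > A"

survivors = [r for r in candidates if consistent_with_taking_A(r)]
print(survivors)  # three relations survive, so "equal value" doesn't follow
```

In particular, both indifference and a genuine preference gap survive the observation, which is the comment’s point: the behaviour is compatible with incompleteness, not only with completed, equal-valued preferences.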