I don’t know if you’re still working on this, but if you don’t already know of the literature on choice-supportive bias and similar processes in humans, they look to me a lot like heuristics that probably harden a human agent into being “more coherent” over time (especially in proximity to other ways of updating value-estimation processes), and they likely play an adaptive role in improving (regularizing?) instrumental value estimates.
Your essay seemed consistent with the claim that “in the past, as verifiable by substantial scholarship, no one ever proved exactly X”, but it never actually showed “X is provably false”, as far as I noticed?
And, indeed, maybe you can prove it one way or the other for some X, where X might be (as you seem to claim) “naive coherence is impossible”, or maybe where some X′ or X″ is “sophisticated coherence is approached by algorithm L as t goes to infinity” (or whatever)?
For my money, the thing to do here might be to focus on Value of Information (VoI), since it seems to me like a super super super important concept, and potentially a way to bridge questions of choice, knowledge, and costly information-gathering actions.
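To gesture at what I mean concretely, here is a toy sketch (the rain/umbrella framing and the numbers are mine, purely illustrative, not from your essay): the value of perfect information is the gap between the best you can do choosing *after* observing the state and the best you can do choosing on your prior, which is exactly the ceiling on what a costly information-gathering action is worth paying for.

```python
# Toy Value-of-Information calculation under illustrative, made-up numbers.
p_rain = 0.3                                    # prior over the uncertain state
utility = {                                     # utility[action][state]
    "umbrella":    {"rain": 5,   "dry": 3},
    "no_umbrella": {"rain": -10, "dry": 6},
}

def expected_utility(action, p):
    u = utility[action]
    return p * u["rain"] + (1 - p) * u["dry"]

# Best you can do acting on the prior alone.
value_without_info = max(expected_utility(a, p_rain) for a in utility)

# Best you can do if a free, perfect observation reveals the state before you act.
value_with_info = (
    p_rain * max(utility[a]["rain"] for a in utility)
    + (1 - p_rain) * max(utility[a]["dry"] for a in utility)
)

voi = value_with_info - value_without_info
print(voi)  # ~2.1 here: pay for the observation only if it costs less than this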
Thanks! I’ll have a think about choice-supportive bias and how it applies.
I think it is provably false that any agent not representable as an expected-utility-maximizer is liable to pursue dominated strategies. Agents with incomplete preferences aren’t representable as expected-utility-maximizers, and they can make themselves immune from pursuing dominated strategies by acting in accordance with the following policy: ‘if I previously turned down some option X, I will not choose any option that I strictly disprefer to X.’
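To make that policy concrete, here is a minimal sketch (the option names and the `prefers` relation are toy illustrations, not anything from the post): an agent that remembers which options it has turned down, and refuses anything strictly dispreferred to one of them, can’t be led through the standard money pump for incomplete preferences.

```python
# `prefers(a, b)` encodes a strict partial order: True means a is strictly
# preferred to b; if neither direction holds, the options are incomparable.
def prefers(a, b):
    # Toy relation: A is strictly preferred to A-; B is incomparable to both.
    return (a, b) == ("A", "A-")

class CautiousAgent:
    def __init__(self, start):
        self.holding = start
        self.turned_down = []  # memory of every option previously passed up

    def offer(self, option):
        """Decide whether to trade what we're holding for `option`."""
        # The policy: never choose an option strictly dispreferred to
        # something we already turned down.
        if any(prefers(past, option) for past in self.turned_down):
            self.turned_down.append(option)
            return
        # Never trade strictly down; trading up or sideways (incomparable)
        # is permitted, and for this demo we always take such trades.
        if prefers(self.holding, option):
            self.turned_down.append(option)
            return
        self.turned_down.append(self.holding)
        self.holding = option

agent = CautiousAgent("A")
agent.offer("B")    # A and B are incomparable: agent trades, turning down A
agent.offer("A-")   # refused, because A- is strictly dispreferred to A
assert agent.holding == "B"   # the classic money pump never completes
```

One thing the sketch makes visible is that the policy is memoryful: the `turned_down` list only grows as options are refused or abandoned.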
I don’t know about you, but I’m actually OK dithering a bit, going in circles, and doing things where mere entropy (not even active adversarial optimization pressure, like that which is somewhat inevitably generated in predator-prey contexts) can “make me notice regret based on syntactically detectable behavioral signs”.
For example, in my twenties I formed an intent, and managed to adhere to the habit somewhat often, of flipping a coin any time I noticed a decision where the cost of thinking it through explicitly was probably larger than the difference in value between the likely outcomes.
(Sometimes I flipped coins and then ignored the coin if I noticed I was sad with that result, as a way to cheaply generate that mental state of having an intuitive, internally accessible preference without having to put things into words or do math. When I noticed that this stopped working very well, I switched to flipping a coin, then “if regret, flip again, and follow my head on heads, and follow the first coin on tails”. The double-flipping protocol seemed to help make ALL the first coins have “enough weight” for me to care about them sometimes, even though I always then stopped for a second to see whether I was happy or sad or bored by the first coin flip. And of course I do such things much much much less now, and lately have begun to consider taking a personal vow to refuse to randomize, except towards enemies, for an experimental period of time.)
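(If it helps, here is the double-flipping protocol as a toy Python sketch, with `gut_check` standing in for the pause where I notice whether I’m sad about the first coin; the function names are just mine, for illustration.)

```python
import random

def double_flip(option_a, option_b, gut_check):
    """Toy sketch of the double-flipping protocol described above.

    `gut_check(result)` returns the option I find myself wishing for after
    seeing the first coin, or None if I notice no regret either way.
    """
    first = random.choice([option_a, option_b])
    wished_for = gut_check(first)      # pause: am I sad with this result?
    if wished_for is None:
        return first                   # no regret: the first coin stands
    # Regret detected: flip again.  Heads -> follow my head (the wished-for
    # option); tails -> follow the first coin anyway, so the first flip
    # always keeps "enough weight" to matter.
    return wished_for if random.random() < 0.5 else first

# e.g. double_flip("tea", "coffee", lambda got: "coffee" if got == "tea" else None)
```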
The plans and the hopes here sort of naturally rely on “getting better at preferring things wisely over time”!
And the strategy relies pretty critically on having enough MEMORY to build up data on the various ways that similar situations went in the past, so as to learn from mistakes, thereby rarely “lack a velleity”, and reduce the rate at which I justifiably regret past velleities or choices.
And a core reason I think that sentience and sapience reliably convergently evolve is almost exactly “to store and process memories to enable learning (including especially contextually sensitive instrumental preference learning) inside a single lifetime”.
(Credit assignment and citation note: Christof Koch was the first researcher I heard explain surprising experimental data suggesting that “minds are common”, with stuff like bee learning, and eventually I stopped being surprised when I heard about low-key bee numeracy or another crazy mental power of cuttlefish. I didn’t know about EITHER of the experiments I just linked to, but I paused to find “the kinds of things one finds here if one actually looks”. That I found such links was, for me, a teensy surprise, and slightly contributes to my posterior belief in a claim roughly like “something like ‘coherence’ is convergently useful and is what our minds were originally built, by evolution, to approximately efficiently implement”.)
Basically, I encourage you, when you go to try to prove that “Agents with incomplete preferences can make themselves immune from pursuing dominated strategies by following plan P”, to consider the resource costs of those plans (like the cost in memory), and to ask whether those resources are being used optimally, or whether a different use of them could get better results faster.
Also… I expect that the proofs you attempt might actually succeed for “agents in isolation”, or for “agents surrounded only by agents that respect property rights”, but fail once you consider adversarial action-space selection in an environment with more than one agent (like where wolves seek undominated strategies for eating sheep, and no sheep can simply ‘turn down’ the option of being eaten by an arbitrarily smart wolf without itself doing something clever and potentially memory-or-VNM-perfection-demanding).
I do NOT think you will prove “in full generality, nothing like coherence is pragmatically necessary to avoid Dutch booking”, but I grant that I’m not sure about this! I have noticed from experience that my mathematical intuitions are actually fallible. That’s why real math is worth the elbow grease! <3
Great points. Thinking about these kinds of worries is my next project, and I’m still trying to figure out my view on them.