we need not assume there are “worlds” at all. … In mathematics, it brings to mind pointless topology.
I don’t think the motivation for this is quite the same as the motivation for pointless topology, which is designed to mimic classical topology in a way that Jeffrey-Bolker-style decision theory does not mimic VNM-style decision theory. In pointless topology, a continuous map of locales X→Y is given by a function going the other way: from the lattice of open sets of Y to the lattice of open sets of X. So a similar thing here would be to treat a utility function as a function from some lattice of subsets of R (the Borel subsets, for instance) to the lattice of events.
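To spell out the parallel (a sketch; the decision-theoretic half of the notation is mine): a map of locales goes one way while its defining frame homomorphism goes the other, and the proposed “pointless” utility function sits on the frame side of that correspondence:

$$ f \colon X \to Y \quad\rightsquigarrow\quad f^{*} \colon \mathcal{O}(Y) \to \mathcal{O}(X), \qquad f^{*}(V) = f^{-1}(V), $$

$$ U \colon \text{worlds} \to \mathbb{R} \quad\rightsquigarrow\quad U^{*} \colon \mathcal{B}(\mathbb{R}) \to \mathcal{E}, \qquad U^{*}(B) = \text{the event that utility lands in } B, $$

with the second map asked to preserve whatever lattice structure the events carry (finite meets and countable joins, say), just as a frame homomorphism preserves finite meets and arbitrary joins.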
My understanding of the Jeffrey-Bolker framework is that its primary difference from the VNM framework is not its pointlessness, but the fact that it comes with a prior probability distribution over outcomes, which can only be updated by conditioning on events (i.e. updating on evidence that has probability 1 in some worlds and probability 0 in the rest). VNM does not start out with a prior, and allows any probability distribution over outcomes to be compared to any other, and Jeffrey-Bolker only allows comparison of probability distributions obtained by conditioning the prior on an event. Of course, this interpretation requires a fair amount of reading between the lines, since the Jeffrey-Bolker axioms make no explicit mention of any probability distribution, but I don’t see any other reasonable way to interpret them. If asked which of two events is better, I will often be unable to answer without further information, since the events may contain worlds of widely varying utility. Associating each event with the fixed prior conditioned on that event gives me the additional information needed to answer the question, and I don’t see what else could supply it; I sketch this reading below. Starting with a prior that gets conditioned on events that correspond to the agent’s actions seems to build in evidential decision theory as an assumption, which makes me suspicious of it.
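To make that reading explicit (my own gloss; the axioms themselves never mention a probability measure P or a utility function U): the value of an event is the expected utility of the prior conditioned on that event,

$$ V(A) \;=\; \mathbb{E}_{P}\!\left[\,U \mid A\,\right], $$

which in particular forces Jeffrey’s averaging condition for disjoint events:

$$ V(A \vee B) \;=\; \frac{P(A)\,V(A) + P(B)\,V(B)}{P(A) + P(B)} \qquad \text{when } A \wedge B = \bot. $$

On this reading, comparing two events is only well-defined because the fixed prior P supplies the weights.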
In the Jeffrey-Bolker treatment, a world is just a maximally specific event: an event which describes everything completely. But there is no requirement that maximally-specific events exist.
This can be resolved by defining worlds to be minimal non-zero elements of the completion of the Boolean algebra of events, rather than minimal non-zero events. This is what you seemed to be implicitly doing later with the infinite bitstrings example, where the events were clopen subsets of Cantor space (i.e. sets of infinite bitstrings such that membership in the set only depends on finitely many bits), and this Boolean algebra has no minimal non-zero elements (maximally-specific events), but the minimal non-zero elements of its completion correspond to infinite bitstrings, as desired.
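Concretely, in the bitstring example (a quick sketch in standard cylinder-set notation, which the post doesn’t spell out): the basic events are the cylinders

$$ [\sigma] \;=\; \{\, x \in 2^{\omega} : x \text{ extends the finite string } \sigma \,\}, $$

and every non-empty clopen event is a finite union of these. No such event is minimal, because any cylinder splits into two smaller non-empty events by pinning down one more bit:

$$ [\sigma] \;=\; [\sigma 0] \,\cup\, [\sigma 1]. $$

So there are no maximally-specific events, even though the individual bitstrings x ∈ 2^ω are not themselves events.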
Of course, this interpretation requires a fair amount of reading between the lines, since the Jeffrey-Bolker axioms make no explicit mention of any probability distribution, but I don’t see any other reasonable way to interpret them,
Part of the point of the JB axioms is that probability is constructed together with utility in the representation theorem, in contrast to VNM, which constructs utility via the representation theorem, but takes probability as basic.
This makes Savage a better comparison point, since the Savage axioms are more similar to the VNM framework while also trying to construct probability and utility together with one representation theorem.
VNM does not start out with a prior, and allows any probability distribution over outcomes to be compared to any other, and Jeffrey-Bolker only allows comparison of probability distributions obtained by conditioning the prior on an event.
As a representation theorem, this makes VNM weaker and JB stronger: VNM requires stronger assumptions (it requires that the preference structure include information about all these probability-distribution comparisons), whereas JB only requires preference comparison of events which the agent sees as real possibilities. A similar remark can be made of Savage.
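Schematically (my gloss on the comparison, not the exact statements of the theorems): VNM asks for a preference ordering on all lotteries over outcomes, with the probabilities inside each lottery taken as given, while JB asks only for an ordering on non-null events:

$$ \text{VNM:}\quad \succeq\ \text{ on } \Delta(O), \qquad p \succeq q \iff \sum_{o} p(o)\,U(o) \;\ge\; \sum_{o} q(o)\,U(o), $$

$$ \text{JB:}\quad \succeq\ \text{ on } \mathcal{E}\setminus\{\bot\}, \qquad A \succeq B \iff V(A) \;\ge\; V(B), $$

where U is what the VNM theorem constructs (the lottery probabilities p, q being given), while the JB theorem constructs P and V together, with V(A∨B) a P-weighted average of V(A) and V(B) for disjoint A and B.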
Starting with a prior that gets conditioned on events that correspond to the agent’s actions seems to build in evidential decision theory as an assumption, which makes me suspicious of it.
Right, that’s fair. Although: James Joyce, the big CDT advocate, is quite the Jeffrey-Bolker fan! See “Why We Still Need the Logic of Decision” for his reasons.
I don’t think the motivation for this is quite the same as the motivation for pointless topology, which is designed to mimic classical topology in a way that Jeffrey-Bolker-style decision theory does not mimic VNM-style decision theory. [...] So a similar thing here would be to treat a utility function as a function from some lattice of subsets of R (the Borel subsets, for instance) to the lattice of events.
Doesn’t pointless topology allow for some distinctions which aren’t meaningful in pointful topology, though? (I’m not really very familiar, I’m just going off of something I’ve heard.)
Isn’t the approach you mention pretty close to JB? You’re not modeling the VNM/Savage thing of arbitrary gambles; you’re just assigning values (and probabilities) to events, like in JB.
Setting aside VNM and Savage and JB, and considering the most common approach in practice—use the Kolmogorov axioms of probability, and treat utility as a random variable—it seems like the pointless analogue would be close to what you say.
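Spelling out what I mean by the pointless analogue (a sketch; the notation is mine): in the standard setup, a utility random variable U from the sample space to R is only ever consulted through its preimages and the resulting pushforward distribution, so the pointless version keeps just those:

$$ U^{*} \colon \mathcal{B}(\mathbb{R}) \to \mathcal{F}, \qquad \mu_{U}(B) \;=\; P\!\left(U^{*}(B)\right), \qquad \mathbb{E}[U] \;=\; \int_{\mathbb{R}} x \; d\mu_{U}(x). $$

A σ-homomorphism from the Borel sets of the reals into the event algebra is all that expected utility ever uses, and that is essentially the map you describe.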
This can be resolved by defining worlds to be minimal non-zero elements of the completion of the Boolean algebra of events, rather than minimal non-zero events.
Yeah. The question remains, though: should we think of utility as a function of these minimal elements of the completion? Or not? The computability issue I raise is, to me, suggestive of the negative.
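A toy version of the sort of computability worry I have in mind (not the exact example from the post): define a utility on infinite bitstrings by

$$ U(x) \;=\; \begin{cases} 1 & \text{if } x_{n} = 1 \text{ for some } n,\\ 0 & \text{otherwise.} \end{cases} $$

This is a perfectly well-defined function of the minimal elements, but no finite prefix of the bitstring ever certifies the value 0, so U is not computable as a function of worlds, even though its value is settled (and trivially computable) on any event that forces some particular bit to be 1.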
This makes Savage a better comparison point, since the Savage axioms are more similar to the VNM framework while also trying to construct probability and utility together with one representation theorem.
Sure, I guess I just always talk about VNM instead of Savage because I never bothered to learn how Savage’s version works. Perhaps I should.
As a representation theorem, this makes VNM weaker and JB stronger: VNM requires stronger assumptions (it requires that the preference structure include information about all these probability-distribution comparisons), whereas JB only requires preference comparison of events which the agent sees as real possibilities.
This might be true if we were idealized agents who do Bayesian updating perfectly, without any computational limitations, but as it is, it seems to me that the assumption of a fixed prior is unreasonably demanding. People sometimes update probabilities based purely on further thought, rather than on empirical evidence, and a framework in which there is a fixed prior that gets conditioned on events, and which banishes discussion of any other probability distributions, would seem to have trouble handling this.
Doesn’t pointless topology allow for some distinctions which aren’t meaningful in pointful topology, though?
Sure, for instance, there are many distinct locales that have no points (only one of which is the empty locale), whereas there is only one ordinary topological space with no points.
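One standard example (sketching from memory, so treat the details as a reconstruction): take the frame of Borel subsets of [0,1] modulo Lebesgue-null sets. A point of this locale would be a completely prime filter F. For each n, the dyadic intervals of length 2^{-n} cover [0,1], so F would have to contain a nested sequence of intervals I_n with

$$ \lambda(I_{n}) = 2^{-n}, \qquad \bigvee_{n} \lnot I_{n} \;=\; \top \quad \text{(since the } I_{n} \text{ shrink to a null set),} $$

and complete primeness would then put some ¬I_n in F alongside I_n, which is impossible. So this locale is non-trivial but has no points at all.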
Isn’t the approach you mention pretty close to JB? You’re not modeling the VNM/Savage thing of arbitrary gambles; you’re just assigning values (and probabilities) to events, like in JB.
Assuming you’re referring to “So a similar thing here would be to treat a utility function as a function from some lattice of subsets of R (the Borel subsets, for instance) to the lattice of events”, no. In JB, the set of events is the domain of the utility function, and in what I said, it is the codomain.
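In symbols (my shorthand for the two options): JB’s value function and the map I was describing run in opposite directions,

$$ \text{JB:}\quad V \colon \mathcal{E}\setminus\{\bot\} \to \mathbb{R}, \qquad\qquad \text{what I said:}\quad U^{*} \colon \mathcal{B}(\mathbb{R}) \to \mathcal{E}. $$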