We can’t tell we’re in the all-zero universe by examining any finite number of bits.
What does it mean for the all-zero universe to be infinite, as opposed to not being infinite? Finite universes have a finite number of bits of information describing them. (This doesn't actually negate the point that uncomputable utility functions exist; it merely means that utility functions which care whether they are in a mostly-empty vs. perfectly-empty universe are a weak example.)
These preferences are required to be coherent with breaking things up into sums, so U(E) = [U(E∧A)⋅P(E∧A) + U(E∧¬A)⋅P(E∧¬A)] / P(E), but we do not define one from the other.
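A toy numerical check may make the constraint concrete (all numbers below are invented purely for illustration; the point is only what it means for the two sides to agree, not that either side is defined from the other):

```python
# Toy check of the coherence constraint
#   U(E) = [U(E∧A)·P(E∧A) + U(E∧¬A)·P(E∧¬A)] / P(E)
# All numbers are hypothetical.

P_E_and_A = 0.2       # P(E ∧ A)
P_E_and_notA = 0.3    # P(E ∧ ¬A)
P_E = P_E_and_A + P_E_and_notA          # P(E) = 0.5

U_E_and_A = 10.0      # U(E ∧ A)
U_E_and_notA = 2.0    # U(E ∧ ¬A)

# The value that coherence requires U(E) to agree with:
U_E = (U_E_and_A * P_E_and_A + U_E_and_notA * P_E_and_notA) / P_E
print(U_E)  # 5.2
```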
What happens if the author/definer of U(E) is wrong about the probabilities? If U(E) is not defined from, nor defined by, the value of its sums, what bad stuff happens if they aren't equal? Consider a dyslexic telekinetic at a roulette table who places a chip on 6 but thinks he placed it on 9. Proposition A is "I will win if the ball lands in the '9' cup" (or "I have bet on 9", or any similar proposition), and event E is that agent exercising their telekinesis to cause the ball to land in the 9 cup. (I put decisions and actions in the hypothetical to avoid a passive agent.)
Is that agent merely *mistaken* about the value of U(E), as a result of their error about P(A) propagated through the appropriate math? Does correcting that error produce a major change in their utility *measurement* (not the utility function itself, but its computed value)? Is it considered safe for an agent to justify cascading major changes in utility measurement over many (literally all?) events after updating a single probability?
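To put hypothetical numbers on the roulette case (the payoffs and credences below are invented for illustration), the computed U(E) swings when P(A) is corrected, even though the underlying utility function never changed:

```python
# Hypothetical numbers for the dyslexic-telekinetic example.
# E: the agent telekinetically puts the ball in the 9 cup.
# A: "I have bet on 9."
payoff_win = 35.0     # utility if the agent's bet wins
payoff_lose = -1.0    # utility if the agent's bet loses

def computed_U_of_E(p_A_given_E):
    """U(E) as computed from the agent's credence that their chip is on 9."""
    return payoff_win * p_A_given_E + payoff_lose * (1 - p_A_given_E)

print(computed_U_of_E(0.99))  # mistaken belief (chip on 9): ~34.64
print(computed_U_of_E(0.01))  # corrected belief (chip on 6): ~-0.64
```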
An instantiated entity (one that exists in a world) can only know of events E that are either observations it makes or decisions it makes. I see flaws both with an agent that sets forth actions it believes sufficient to bring about a desired outcome and then feels satisfied that the job is done, and with an agent that seeks spoofable observations of that desired outcome (in particular, the dynamic where an agent seeks evidence that tends to confirm the desirable event E, because that evidence makes it happy, and avoids evidence against E, because that evidence makes it sad).
What happens if the author/definer of U(E) is wrong about the probabilities? If U(E) is not defined from, nor defined by, the value of its sums, what bad stuff happens if they aren’t equal?
Ultimately, I am advocating a logical-induction-like treatment of this kind of thing.
Initial values are based on a kind of “prior”—a distribution of money across traders.
Values are initially inconsistent (indeed, they're always somewhat inconsistent), but become more consistent over time as a result of traders correcting inconsistencies. The traders who are better at this get more money, while the chronically inconsistent traders lose money and eventually no longer have influence.
Evidence of all sorts can come into the system at any time. The system might suddenly get information about the utility of some hypothetical example, or a logical proposition about utility, or whatever. It can be arbitrarily difficult to connect this evidence to practical cases. However, the traders work to reduce inconsistencies throughout the whole system, and therefore evidence gets propagated more or less as well as it can be.
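As a drastically simplified sketch of that picture (this is not the actual logical-induction construction; the traders, wealths, quotes, and update rule below are all invented for illustration), wealth plays the role of the prior, and the consensus value moves toward traders whose quotes later prove more accurate:

```python
# Drastically simplified sketch: wealth-weighted consensus over traders'
# quoted values, with wealth flowing toward traders whose quotes turn out
# closer to later evidence. Everything here is hypothetical.

traders = {
    "optimist":  {"wealth": 1.0, "quote": 9.0},
    "pessimist": {"wealth": 1.0, "quote": 2.0},
}

def consensus(traders):
    total_wealth = sum(t["wealth"] for t in traders.values())
    return sum(t["wealth"] * t["quote"] for t in traders.values()) / total_wealth

print(consensus(traders))  # initial consensus: 5.5

evidence = 3.0  # some later observation pins the quantity down
for t in traders.values():
    # One crude choice of update: wealth shrinks with squared error.
    t["wealth"] *= 1.0 / (1.0 + (t["quote"] - evidence) ** 2)

print(consensus(traders))  # consensus moves toward the better-calibrated trader (~2.4)
```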
There is at least one major step that I was not aware of between the things I think I understand and a market that has currency and traders.
I understand how a market of traders can result in a consensus evaluation of probability, because there is a *correct* evaluation of the probability of a proposition. How does a market of traders result in a consensus evaluation of the utility of an event? If two traders disagree about whether to pull the lever, how is it determined which one gets the currency?
The mechanism is the same in both cases:
Shares in the event are bought and sold on the market. The share will pay out $1 if the event is true. The share can also be shorted, in which case the shorter gets $1 if the event turns out false. The overall price equilibrates to a probability for the event.
There are several ways to handle utility. One way is to make bets about whether the utility will fall in particular ranges. Another way is for the market to directly contain shares of utility which can be purchased (and shorted). These pay out $U, whatever the utility actually turns out to be—traders give it an actual price by speculating on what the eventual value will be. In either case, we would then assign expected utility to events via conditional betting.
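A minimal sketch of the payout rules just described (the function names, positions, and prices are hypothetical; this is not a full market implementation):

```python
# Minimal sketch of the two kinds of shares described above.

def event_share_payout(event_true, position):
    """An event share pays $1 if the event is true; a short position pays $1 if it is false."""
    if position == "long":
        return 1.0 if event_true else 0.0
    else:  # short position
        return 0.0 if event_true else 1.0

def utility_share_payout(realized_U, position):
    """A utility share pays $U, whatever the utility turns out to be; a short owes $U."""
    return realized_U if position == "long" else -realized_U

# The price at which buyers and shorters are both willing to trade plays the
# role of the market's probability (event shares) or expected utility (U shares).
print(event_share_payout(True, "long"))    # 1.0
print(utility_share_payout(3.7, "long"))   # 3.7
```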
If we want to do reward-learning in a setup like this, the (discounted) rewards can be incremental payouts of the U shares. But note that even if there is no feedback of any kind (i.e., the shares of U never actually pay out), the shares equilibrate to a subjective value on the market—like collector's items. The market still forces the changes in value over time, and the conditional beliefs about that value, to become increasingly coherent. This corresponds to fully subjective utility with no outside feedback.
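For the reward-learning variant, a tiny sketch of what "incremental payouts" could mean (the discount factor and rewards are invented for illustration):

```python
# Each reward r_t arriving at step t pays gamma**t per U share.
gamma = 0.9
rewards = [1.0, 0.0, 2.0]  # hypothetical reward feedback over three steps

payout_per_U_share = sum((gamma ** t) * r for t, r in enumerate(rewards))
print(payout_per_U_share)  # 1.0 + 0.0 + 0.81*2.0 = 2.62
```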
If two traders disagree about whether to pull the lever, how is it determined which one gets the currency?
They make bets about what happens if the lever is or isn't pulled (including conditional buys/sells of shares of utility). These bets will be evaluated as normal. In this setup we only get feedback on whichever action actually happens—but this may still be enough data to learn under certain assumptions (which I hope to discuss in a future post). We can also consider more exotic settings in which we do get feedback on both cases even though only one happens; this could be feasible through human feedback about counterfactuals. (I also hope to discuss this alternative in a future post.)
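A sketch of how such a conditional bet could settle (the prices and outcomes are hypothetical): a bet conditional on an action is simply voided if that action never happens, which is why only the action actually taken generates feedback.

```python
# Conditional purchase of a U share: "if the lever is pulled, buy at 4.0".
def settle_conditional_buy(condition_happened, realized_U, price):
    if not condition_happened:
        return 0.0                 # condition failed: the bet is voided
    return realized_U - price      # profit (or loss) on the purchased U share

print(settle_conditional_buy(True, 6.0, 4.0))   # lever pulled, U turned out 6.0 -> +2.0
print(settle_conditional_buy(False, 6.0, 4.0))  # lever not pulled -> voided, 0.0
```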
Suppose the utility trading commission discovers that a trader used forbidden methods to short a utility bet (e.g. insider trading, coercing other traders, exploiting a flaw in the marketplace), and takes action to confiscate the illicit gains.
What actions transfer utility away from the target? (In systems that pay out money, their bank account is debited; in systems that use a blockchain, transactions are added or rolled back manually.) What does it mean to take utility from a trader directly?
What does it mean for the all-zero universe to be infinite, as opposed to not being infinite? Finite universes have a finite number of bits of information describing them. (This doesn't actually negate the point that uncomputable utility functions exist; it merely means that utility functions which care whether they are in a mostly-empty vs. perfectly-empty universe are a weak example.)
What it means here is precisely that it is described by an infinite number of bits—specifically, an infinite number of zeros!
Granted, we could try to reorganize the way we describe the universe so that we have a short code for that world, rather than an infinitely long one. This becomes a fairly subtle issue. I will say a couple of things:
First, it seems to me like the reductionist may want to object to such a reorganization. In the reductive view, it is important that there is a special description of the universe, in which we have isolated the actual basic facts of reality—things resembling particle position and momentum, or what-have-you.
Second, I challenge you to propose a description language which (a) makes the procrastination example computable, (b) maps all worlds onto a description, and (c) does not create any invalid input tapes.
For example, I can make a modified universe-description in which the first bit is ‘1’ if the button ever gets pressed. The rest of the description remains as before, placing a ‘1’ at time-steps when the button is pressed (but offset by one place, to allow for the extra initial bit). So seeing ‘0’ right away tells me I’m in the button-never-pressed world; it now has a 1-bit description, rather than an infinite-bit description. HOWEVER, this description language includes a description which does not correspond to any world, and is therefore invalid: the string which starts with ‘1’ but then contains only zeros forever.
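A small sketch of that modified description language (the encoder below is hypothetical and truncates the infinite bit-string purely for illustration) makes the invalid string visible:

```python
# First bit: '1' iff the button is ever pressed. Remaining bits: the press
# record per time-step, offset by one place. Truncated to finitely many steps
# for illustration; real descriptions are infinite.

def describe(presses):
    ever_pressed = any(presses)
    return ("1" if ever_pressed else "0") + "".join("1" if p else "0" for p in presses)

print(describe([False, False, False]))  # '0000': the leading '0' already identifies
                                        # the button-never-pressed world
print(describe([False, True, False]))   # '1010': pressed at the second step

# The string '1' followed by nothing but zeros forever is well-formed in this
# language, but no press history produces it: an invalid input tape.
```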
This issue has a variety of potential replies/implications—I’m not saying the situation is clear. I didn’t get into this kind of thing in the post because it seems like there are just too many things to say about it, with no totally clear path.
In what manner does the universe described by an infinite string of zeroes differ from the universe described by the empty string?