Before I’ve observed anything, there seems to be no reason to believe that I’m more likely to be in one world than another, but we can’t let all their weights be equal.
We can’t? Why not? Estimating the probability of two heads on two coinflips as 25% is giving existence in worlds with heads-heads, heads-tails, tails-heads, and tails-tails equal weight. The same is true of a more complicated proposition like “There is a low probability that Bigfoot exists”: if we give every possible arrangement of objects/atoms/information equal weight and then rule out the ones that don’t result in the evidence we’ve observed, few of the remaining worlds contain Bigfoot.
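The two-coinflip estimate above can be sketched as a direct count over equally weighted worlds. This is only a toy illustration: the names `worlds`, `probability`, and the example evidence ("at least one head was seen") are assumptions added here, not part of the original comment.

```python
from fractions import Fraction
from itertools import product

# Each "world" is one complete outcome of the two flips; all get equal weight.
worlds = list(product("HT", repeat=2))
weight = Fraction(1, len(worlds))

def probability(event):
    """Sum the equal weights of the worlds in which `event` holds."""
    return sum(weight for w in worlds if event(w))

p_two_heads = probability(lambda w: w == ("H", "H"))

# Hypothetical evidence: at least one head was observed.
# Conditioning = discard inconsistent worlds, then renormalise over the rest.
consistent = [w for w in worlds if "H" in w]
p_two_heads_given_evidence = Fraction(
    sum(1 for w in consistent if w == ("H", "H")), len(consistent)
)

print(p_two_heads, p_two_heads_given_evidence)  # 1/4 1/3
```

Ruling out worlds never changes the relative weights of the ones that remain; it only shrinks the set being counted.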
Theoretically, it’s not infinite because of the granularity of time/space, speed of light, and so on.
Practically, we can get around this because we only care about a tiny fraction of the possible variation in arrangements of the universe. In a coin flip, we only care about whether a coin is heads-up or tails-up, not the energy state of every subatomic particle in the coin.
This matters in the case of a biased coin—let’s say biased towards heads 66%. This, I think, is what Wei meant when he said we couldn’t just give equal weights to all possible universes—the ones where the coin lands on heads and the ones where it lands on tails. But I think “universes where the coin lands on heads” and “universes where the coin lands on tails” are unnatural categories.
Consider how the probability of winning the lottery isn’t 0.5, as it would be if we chose with equal weight between the two alternatives “I win” and “I don’t win”. Those are unnatural categories; instead we need to choose with equal weight between “I win”, “John Q. Smith of Little Rock, Arkansas, wins”, “Mary Brown of San Antonio, Texas, wins” and so on for millions of other people. The unnatural category “I don’t win” contains millions of more natural categories.
So on the biased coin flip, the categories “the coin lands heads” and “the coin lands tails” each contain a bunch of lower-level events about collisions of air molecules and coin molecules and the amounts of force one can use to flip a coin, and two-thirds of those events are in the “coin lands heads” category. But among those lower-level events, you choose with equal weight.
True, beneath these lower-level categories about collisions of air molecules, there are probably even lower things like vibrations of superstrings or bits in the world-simulation or whatever the lowest level of reality is, but as long as these behave mathematically I don’t see why they prevent us from basing a theory of probability on the effects of low level conditions.
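A minimal sketch of the biased-coin argument above, assuming a made-up toy model with just three equally weighted low-level conditions (the labels `m1`, `m2`, `m3` are placeholders for whatever the real low-level events are):

```python
from fractions import Fraction

# Hypothetical toy model: three equally weighted low-level conditions
# (stand-ins for flip force, air collisions, ...), two of which end in heads.
microstates = {"m1": "heads", "m2": "heads", "m3": "tails"}
weight = Fraction(1, len(microstates))

def coarse_probability(outcome):
    """Add up the equal low-level weights that fall into a coarse category."""
    return sum(weight for result in microstates.values() if result == outcome)

p_heads = coarse_probability("heads")
p_tails = coarse_probability("tails")

print(p_heads, p_tails)  # 2/3 1/3
```

The coarse categories end up with unequal weights even though every low-level event was weighted equally, which is the point of the lottery comparison.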
Theoretically, it’s not infinite because of the granularity of time/space, speed of light, and so on.
These initial weights are supposed to be assigned before taking into account anything you have observed. But even now (under the second interpretation in my list) you can’t be sure that the world you’re in is finite. So, suppose there is one possible world for each integer in the set of all integers, or one possible world for each set in the class of all sets. How could one assign equal weight to all possible worlds, and have the weights add up to 1?
Practically, we can get around this because we only care about a tiny fraction of the possible variation in arrangements of the universe. In a coin flip, we only care about whether a coin is heads-up or tails-up, not the energy state of every subatomic particle in the coin.
I don’t think that gets around the problem, because there is an infinite number of possible worlds where the energy state of nearly every subatomic particle encodes some valuable information.
How could one assign equal weight to all possible worlds, and have the weights add up to 1?
By the same method we use in calculus. Instead of summing over the possible worlds, we integrate over them (which is, loosely, an infinite sum of infinitesimally small values). For the explicit construction, any basic calculus book is enough.
My understanding is that it’s possible to have a uniform distribution over a finite set, or an interval of the reals, but not over all integers, or all reals, which is why I said in the sentence before the one you quoted, “suppose there is one possible world for each integer in the set of all integers.”
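For the integer case, the obstacle can be written out in one line, assuming the usual requirement that probabilities be countably additive (that assumption is standard, but it is added here rather than stated in the thread). Give every integer the same weight $c \ge 0$:

```latex
\sum_{n \in \mathbb{Z}} P(\{n\}) \;=\; \sum_{n \in \mathbb{Z}} c \;=\;
\begin{cases}
0      & \text{if } c = 0, \\
\infty & \text{if } c > 0,
\end{cases}
\qquad \text{so the total is never } 1 .
```

A finite set avoids this because $c = 1/N$ works, and an interval of the reals avoids it because the weight sits on intervals rather than on individual points.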
There is a 1:1 mapping between “the set of reals in [0,1]” and “the set of all reals”. So take your uniform distribution on [0,1] and put it through such a mapping… and the result is non-uniform. Which pretty much kills the idea of “uniform ⇔ each element has the same probability as each other”.
There is no such thing as a continuous distribution on a set alone; it has to be on a metric space. Even if you make a metric space out of the set of all possible universes, that doesn’t give you a universal prior, because you have to choose what metric it should be uniform with respect to.
(Can you have a uniform “continuous” distribution without a continuum? The rationals in [0,1]?)
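The claim that a uniform distribution stops being uniform once it is pushed through a 1:1 mapping can be checked with a small calculation. The sketch below assumes one particular mapping, `x = tan(pi * (u - 1/2))`, which sends the open interval (0, 1) onto the whole real line; the choice of map and the function names are illustrative assumptions, and the two endpoints are ignored since they carry zero probability.

```python
import math

def to_real_line(u):
    """Map u in (0, 1) to the real line; a bijection on the open interval."""
    return math.tan(math.pi * (u - 0.5))

def to_unit_interval(x):
    """Inverse map; also the CDF of X = to_real_line(U) for uniform U."""
    return 0.5 + math.atan(x) / math.pi

def prob_interval(a, b):
    """P(a < X <= b) when X = to_real_line(U) and U is uniform on (0, 1)."""
    return to_unit_interval(b) - to_unit_interval(a)

p_near_zero = prob_interval(-1.0, 1.0)   # length-2 interval around 0
p_far_out = prob_interval(99.0, 101.0)   # length-2 interval far from 0

print(p_near_zero, p_far_out)
```

Two intervals of the same length receive very different probabilities (one half versus well under a thousandth), so "equal weight" depends on which coordinates the distribution is declared uniform in.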
Since there is a 1:1 mapping between the set of all reals and the unit interval, we can just use the unit interval and define a uniform distribution there. Whatever distribution you choose, we can map it into the unit interval, as Pengvado said.
In the case of the set of all integers I’m not completely certain. But I’d look at the set of computable reals, which we can use for much of mathematics. Normal calculus can be done with just the computable reals (the set of all numbers for which there is an algorithm that provides any requested decimal digit in finite time). So basically we have a mapping from the computable reals on the unit interval into the set of all integers.
Another question: is the uniform distribution the entropy-maximising distribution when we consider the set of all integers?
From a physical standpoint, why are you interested in countably infinite probability distributions? If we assume discrete physical laws we’d have a finite number of possible worlds; on the other hand, if we assume continuous laws we’d have an uncountably infinite number, which can be mapped into the unit interval.
Off the top of my head, I can imagine a set of discrete worlds of all sizes, which would be countably infinite. What other kinds of worlds could there be where this would be relevant?
Theoretically, it’s not infinite because of the granularity of time/space, speed of light, and so on.
(Nitpick: Spacetime isn’t quantized AFAIK in standard physics, and then there are still continuous quantum amplitudes.)
This, I think, is what Wei meant when he said we couldn’t just give equal weights to all possible universes—the ones where the coin lands on heads and the ones where it lands on tails. But I think “universes where the coin lands on heads” and “universes where the coin lands on tails” are unnatural categories.
I thought Wei was talking about single worlds (whatever those may be), not sets of worlds. Applied to sets of worlds, this seems correct.
Yvain covered the finiteness point well, but I think the “infinitely many possible arrangements” part needs a little elaboration.
In any continuous probability distribution we have infinitely many (actually uncountably infinitely many) possibilities, and this makes the probability of any single outcome 0. That is why, in the case of continuous distributions, we talk about the probability of the outcome lying in a certain interval (a collection of infinitely many arrangements).
So instead of counting the individual arrangements, we calculate integrals over some set of arrangements. Having infinitely many arrangements is no hindrance to applying probability theory. Actually, if we can assume a continuous distribution, it makes some things much easier.
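As a worked instance of the interval idea above, take the simplest case, a uniform distribution on $[0, 1]$ with density $f(x) = 1$ (this specific example is an assumption chosen for illustration):

```latex
P(X = a) = \int_a^a 1 \, dx = 0,
\qquad
P(a \le X \le b) = \int_a^b 1 \, dx = b - a
\quad (0 \le a \le b \le 1).
```

Every single point has probability zero, yet the probabilities of intervals are perfectly well defined and the whole interval has probability $1$.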
It does work. Actually, if we’re using the integers (there are as many integers as rationals, so we don’t need to treat the latter set separately), we get the good old discrete probability distribution, where we have either a finite number of possibilities or at most a countable infinity of possibilities, e.g. the set of all integers.
The real numbers are a strictly larger set than the integers, so in a continuous distribution we have, in a sense, more possibilities than in a countably infinite discrete distribution.
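To make the discrete case concrete, here is a sketch of one valid probability distribution over all the integers. The weights `(1/3) * 2**(-|n|)` are a hypothetical choice made for this example; note that they are deliberately not uniform, since a countably infinite set cannot carry equal weights that sum to 1.

```python
from fractions import Fraction

def weight(n):
    """Hypothetical non-uniform weight for integer n: (1/3) * 2**(-|n|)."""
    return Fraction(1, 3) / 2 ** abs(n)

def partial_sum(limit):
    """Total weight of the integers from -limit to +limit."""
    return sum(weight(n) for n in range(-limit, limit + 1))

# The missing mass halves each time the limit grows by one,
# so the weights sum to exactly 1 in the limit.
shortfall_10 = 1 - partial_sum(10)
shortfall_20 = 1 - partial_sum(20)

print(shortfall_10, shortfall_20)  # 1/1536 1/1572864
```

The partial sums equal `1 - (2/3) * 2**(-limit)`, so the total over all integers converges to 1 even though there are infinitely many outcomes.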
Without an arbitrary upper bound on complexity, there are infinitely many possible arrangements.
Good point. Does this work over all infinite sets, though? Integers? Rationals?