0 and 1 aren’t probabilities

Alok Singh, 1 Jan 2023 0:09 UTC

2 points

Eliezer wrote about this a while ago too.

Now for a “radically elementary” take^[1].

To simplify, focus only on the case of 0; the case of 1 follows by symmetry. (All values here are non-negative, though with Clifford algebras we could extend to negative probabilities too without much difficulty.)

Instead of 0,^[2] take a positive infinitesimal $ε \sim 0$. Then we have a quantity that’s strictly bigger than 0, yet smaller than every positive standard number.

What events are 0 probability?

In the naive setup, these would normally be 0 probability:

Events that are physically possible, but you have no idea how. A lot of the far future falls into this. Also a lot of things beyond your (current) ken.
Events that would normally be given probability exactly 0, like sampling a given point on a unit interval, uniformly at random.
Things you’re “sure” won’t happen.

In this setup, there’s one answer: events that are actually impossible, absurdities, the empty event. Everything else has at least infinitesimal probability. The dart on the interval would have infinitesimal probability, say $ε$ . Now take the event of hitting the number 2. It’s not even in the sample space. So it’s actually impossible and gets probability 0.

What’s the point of this?

What does it get us over what naive numbers may give?

We get a grading.

Example: flipping a coin.

3 outcomes: heads/tails/it lands on its edge. We assign $ε$ to it landing on the edge, because it’s possible but was out of scope for the usual coin flip experiment. I got this example by first axiomatically assuming that something is impossible (a third outcome for coin flips), then realizing that it is actually physically possible even though I have no idea how (the $ε$ term). Say that I then realize that the wind being a certain way makes landing on the edge less likely, but the wind acting that way is itself even more unlikely, say a probability of $ε^{2}$. Then I can update my overall “lands on edge” probability to $ε - ε^{2}$, and the other 2 to $\frac{1 - (ε - ε^{2})}{2} \sim \frac{1}{2}$.
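
As a sanity check on the coin example, here is a minimal Python sketch that treats $ε$ as a formal symbol and confirms that the three probabilities sum to exactly 1. The representation (a dict mapping exponents of $ε$ to coefficients) and the helper names `add` and `scale` are assumptions made for illustration, not anything from the hyperreal literature.

```python
from fractions import Fraction

# A quantity is stored as a polynomial in ε: {exponent: coefficient}.
# For example, ε - ε² is {1: 1, 2: -1}. (Illustrative representation only.)

def add(p, q):
    """Add two polynomials in ε, dropping terms whose coefficient is zero."""
    out = dict(p)
    for k, c in q.items():
        out[k] = out.get(k, 0) + c
    return {k: c for k, c in out.items() if c != 0}

def scale(p, c):
    """Multiply a polynomial in ε by an ordinary number c."""
    return {k: c * v for k, v in p.items() if c * v != 0}

one = {0: Fraction(1)}
edge = {1: Fraction(1), 2: Fraction(-1)}                   # ε - ε²
heads = scale(add(one, scale(edge, -1)), Fraction(1, 2))   # (1 - (ε - ε²)) / 2
tails = heads

total = add(add(heads, tails), edge)
print(heads)          # coefficients of 1, ε, ε²: 1/2, -1/2, 1/2
print(total == one)   # True: the three outcomes sum to exactly 1
```

The leading coefficient of `heads` is 1/2, which is the sense in which each side is $\sim \frac{1}{2}$.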

Notice that the infinitesimals are placeholders, more valuable for their exponent and coefficient than anything else. If we’re “sure” about something, we could give it a value $1 - ε$. Even better than surety is that someone on 4chan told you, for which you could guess a value of $1 - ε^{2}$. The exact value of $ε$ is less important than the grading it induces, a grading that gets a lot of its value from $ε$ being inaccessible to standard numbers by standard operations. Each level of inaccessibility is like a level of confidence.
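
One way to make the grading concrete is a small Python sketch that compares quantities written as polynomials in a positive infinitesimal $ε$. The sign of such a polynomial is decided by its lowest-order nonzero term, so no standard coefficient can lift $ε^{2}$ past $ε$. The dict representation and the names `sign` and `less_than` are hypothetical, chosen only for this illustration.

```python
def sign(p):
    """Sign of a polynomial in a positive infinitesimal ε.

    p maps exponents of ε to standard coefficients; the lowest-order
    nonzero term dominates every higher-order term.
    """
    nonzero = {k: c for k, c in p.items() if c != 0}
    if not nonzero:
        return 0
    lowest = min(nonzero)
    return 1 if nonzero[lowest] > 0 else -1

def less_than(p, q):
    """True if p < q, decided by the sign of q - p."""
    diff = dict(q)
    for k, c in p.items():
        diff[k] = diff.get(k, 0) - c
    return sign(diff) > 0

one = {0: 1}
sure = {0: 1, 1: -1}    # 1 - ε
rumor = {0: 1, 2: -1}   # 1 - ε²

print(less_than(sure, rumor))            # True: 1 - ε < 1 - ε²
print(less_than(rumor, one))             # True: 1 - ε² < 1
print(less_than({2: 10**9}, {1: 1}))     # True: a billion ε² is still below ε
```

The last line is the "inaccessible by standard operations" point: multiplying by any standard number never moves a quantity up a level.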

To get useful results, we rely on spillover. For a formal explanation, see Robert Goldblatt’s Lectures on The Hyperreals.

Overspill/overflow
Underspill/underflow
Principle of permanence

The basic idea of these principles is “as above ⇔ so below”. Say some internal function has all infinitesimals in its range. Then it must have non-infinitesimals in its range too, since the set of all infinitesimals is known to be external, and images of internal functions over internal sets are internal. This is an example of overspill. Infinitesimal behavior has spilled over into the appreciable domain. Similar results hold for all levels, big and small.

Intuition: “Big standard” ~ “hyperfinite”.

Reason: if $n$ is standard, so is $n + 1$ . So there’s no biggest standard number (you know this already). If $H$ is hyperfinite, then so is $H - 1$ . So the right end of the telescope picture, with the dots, identifies $n$ and $H$ .

Similarly for “arbitrarily large infinitesimal” ~ “arbitrarily small but standard and nonzero”. See this pic:

By spillover, we can substitute in small BUT STANDARD values wherever a term is supposed to be infinitesimal (or large but standard whenever we see a hyperfinite term) and get a standard answer out. Our whole system of reasoning gracefully degrades into its own finite approximation. Take the coin example. It doesn’t matter that the chance of landing on the edge isn’t infinitesimal but appreciable. If we set $ε := 0.001$ , then the chance of landing on the edge is still $ε - ε^{2} = 0.001 - {0.001}^{2} = 0.000999$ . The chance of heads/tails is $0.4995005$ , which makes sense. For much more in this vein, see this paper.
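
The substitution can be checked with exact arithmetic. The following is a minimal Python sketch; the function name `coin_probabilities` is made up for this illustration, and the formulas are just the ones from the coin example.

```python
from fractions import Fraction

def coin_probabilities(eps):
    """Return (heads, tails, edge) for a given standard value of ε."""
    edge = eps - eps**2      # ε - ε²
    side = (1 - edge) / 2    # remaining mass split evenly between heads and tails
    return side, side, edge

heads, tails, edge = coin_probabilities(Fraction(1, 1000))
print(float(edge))               # 0.000999
print(float(heads))              # 0.4995005
print(heads + tails + edge == 1) # True

# Shrinking ε pushes heads/tails toward 1/2 without changing the structure.
for eps in (Fraction(1, 10), Fraction(1, 1000), Fraction(1, 10**6)):
    side, _, e = coin_probabilities(eps)
    print(eps, float(side), float(e))
```

Using `Fraction` instead of floats keeps the sum exactly 1 for any rational choice of $ε$.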

^[1] Radically elementary in the sense of the hyperreals. As in a lot of my recent posts, I don’t explain anything about the hyperreals here, since that beats being paralyzed by topsorting a whole textbook =(.
^[2] 0 is the only number that’s both standard and infinitesimal, and is the smallest non-negative infinitesimal. So 0 is still special and shouldn’t be treated lightly by probability. If anything, infinitesimals reveal just how special 0 is. But we can sidestep the issue with this infinite ladder of levels.



Shmi 1 Jan 2023 0:28 UTC
4 points
0
Uh, ℏ has a specific meaning and units, why would you use it. Also, what you consider an “elementary take” is not what most readers here would. Your writing style kind of obscures the interesting bullet points you make about probability-zero events.
Dagon 1 Jan 2023 2:32 UTC
2 points
0
What is the probability that the ratio of a circle’s diameter to its circumference in a euclidean plane is 12?
- Alok Singh 1 Jan 2023 2:35 UTC
  1 point
  1
  I’ll just say $ε^{10000000000}$ for now. Basically wrapping up “is all this an elaborate simulation designed to convince me that pi = 12”
Slider 1 Jan 2023 0:48 UTC
2 points
0
So the different infinity levels are actually fields, and since we do not know how far apart they are, we cannot mix them?
That probabilities are always comparable is a heavily used property, so taking that away is not trivial at all.
When you “substitute in” standard values in order to get an appreciables-only field, something that was $1 - ε$ can drop below 0.9. This change of order seems really disruptive, although being able to determine where it happens might be a plus.
$1 - ε$ and $0 + ε$ already exist as “almost surely” and “almost never”. Those who require more than real precision do use them but do not like to express them as numbers. For example, a dart is almost surely going to land favoring some side of the target. The only way it doesn’t is the dead-center bullseye, but since that is a set of measure 0 (the infinity-averse crowd’s translation of “infinitesimal”), that is fine.
There is the question of hitting the horizontal or vertical axis of the target. Since lines are in a way lexicographically smaller than areas, treating areas, lines and points as three levels of infinity might be a bit more expressive than designating that areas get reals and everything else is 0. I do not know how an infinity-averse person would dance around that. What I do know is that if you compare areas to areas, lines to lines, and points to points, a single real measure is sufficient. So where the use case looms, just sectioning the activity into 3 less formal sections lets the formalism within each section go brrr in the usual way.
I have been intrigued by surreal probabilities before. Mostly, the new things they could provide do not accomplish much outside of theorizing about themselves. There is also the issue that infinitesimal doubt cannot be dispelled by finite evidence. And no finite repetition of an infinitesimal chance can result in an appreciable total probability.