Mathematicians routinely use “infinite” to mean “infinite in magnitude”. For example, the concept “The natural numbers” is infinite in magnitude, but I have picked it out using only 19 ascii characters. From a computer science perspective, it is a finite concept—finite in information content, the number of bits necessary to point it out.
Each of the objects in the set of the Peano integers is finite. The set of Peano integers, considered as a whole, is infinite in magnitude, but finite in information content.
Mathematician’s routine speech sometimes sounds as if a generic real number is a small thing, something that you could pick up and move around. In fact, a generic real number (since it’s an element of an uncountable set) is infinite in information content—they’re huge, and impossible to encounter, much less pick up.
Lowenheim-Skolem allows you to transform proofs that, on a straightforward reading, claim to be manipulating generic elements of uncountable sets (picking up and moving around real numbers for example), with proofs that claim to be manipulating elements of countable sets—that is, objects that are finite in information content.
In that tranformation, you will probably introduce “objects” which are something like “the double-struck N”, and those objects certainly still satisfy internal predicates like “InfiniteInMagnitude(the double-struck N)”.
However, you’re never forced to believe that mathematicians are routinely doing impossible things—you can always take a formalist stance, pointing out that mathematicians are actually manipulating symbols, which are small, finite-in-information-content things.
However, you’re never forced to believe that mathematicians are routinely doing impossible things—you can always take a formalist stance, pointing out that mathematicians are actually manipulating symbols, which are small, finite-in-information-content things.
So, given this, what exactly is your complaint? You started off criticizing Eliezer (and whomever else) for saying “The integers are countable, but the real number line is uncountable”—I suppose on the grounds that everything in the physical universe is countable, or something. (You weren’t exactly clear.) But now you point out (correctly) that there is a perfectly good interpretation of this statement which in no way depends on there being an uncountable number of physical things anywhere, or otherwise violates your (not-exactly-well-defined) philosophy. So haven’t you just defeated yourself?
I have a knee-jerk response, railing against uncountable sets in general and the real numbers in particular; it’s not pretty and I know how to control it better now.
I’m fairly confident that for your purposes you could live with the computable numbers (that is: those numbers whose decimal expansion can be computed by ), and as long as you didn’t need anything stronger than integration amenable to quadrature, you’d be just fine.
There are people who take this route, but I can’t think of any off the top of my head. Knuth once stated that he’d like to write a calculus book roughly following this path, but, well, he’s got other things on his mind.
EDIT: I should point out also that the computable numbers are countable (by the usual Godel encoding of whatever machine is rattling off the digits for you), and that for all practical intents and purposes they’re probably equivalent to whatever calculus-related mischief is in play at the moment.
There’s some weirdnesses down that route—for example, it turns out that you can’t distinguish zero from nonzero, so the step function is actually uncomputable.
My contrarian claim is that everyone could live with the nameable numbers—that is, the numbers that can be pointed out using a finite number of books to describe them. People who really strongly care about the uncountability of the reals have a hard time coming up with a concrete example of what they’d miss.
My contrarian claim is that everyone could live with the nameable numbers
I don’t understand. Those also seem to fall prey to
it turns out that you can’t distinguish zero from nonzero, so the step function is actually uncomputable.
Also,
People who really strongly care about the uncountability of the reals have a hard time coming up with a concrete example of what they’d miss.
Lebesgue measure theory, Gal(C/R) = Z/2Z, and some pathological examples in the history of differential geometry without which the current definition of a manifold would have been much more difficult to ascertain.
Off the top of my head. There are certainly other things I would miss.
A mistake. I was thinking of C as the so-called “generic complex numbers.” You’re right that if you replace C with the algebraic closure of whatever countable model’s been dreamed up, then C = R[i] and that’s it.
Admittedly I’m only conjecturing that Gal(C/K) will be different for some K countable, but I think there’s good evidence in favor of it. After all, if K is the algebraic closure of Q, then Gal(C/K) is gigantic. It doesn’t seem likely that one could “fix” the other “degrees of freedom” with only countably many irrationals.
Those are theories, which are not generally lost if you switch the underlying definitions aptly—and they are sometimes improved (if the new definitions are better, or if the switch demonstrates an abstraction that was not previously known).
People can’t pick out specific examples of numbers that are lost by switching to using nameable numbers, they can only gesture at broad classes of numbers, like “0.10147829..., choosing subsequent digits according to no specific rule”. If you can describe a specific example (using Lebesgue measure theory if you like), then that description is a name for that number.
I’m not sure what a “nameable number” is. Whatever countable naming scheme you invent, I can “name” a number that’s outside it by the usual diagonal trick: it differs from your first nameable number in the first digit, and so on. (Note this doesn’t require choice, the procedure may be deterministic.) Switching from reals to nameable numbers seems to require adding more complexity than I’m comfortable with. Also, I enjoy having a notion of Martin-Löf random sequences and random reals, which doesn’t play nice with nameability.
By “nameable number” he seems to just mean a definable number—in general an object is called “definable” if there is some first-order property that it and only it satisfies. (Obviously, this dependson just what the surrouding theory is. Sounds like he means “definable in ZFC”.) The set of all definable objects is countable, for obvious reasons.
With this definition, your diagonal trick actually doesn’t work (which is good, because otherwise we’d have a paradox): Definability isn’t a notion expressible in the theory itself, only in the metatheory. Hence if you attempt to “define” something in terms of the set of all definable numbers, the result is only, uh, “metadefinable”. (I gave myself a real headache once over the idea of the “first undefinable ordinal”; thanks to JoshuaZ for pointing out to me why this isn’t a paradox.)
EDIT: I should point out, using definable numbers seems kind of awful, because they’re defined (sorry, metadefined :P ) in terms of logic-stuff that depends on the surrounding theory. Computable numbers, though more restrictive, might behave a little better, I expect...
EDIT Apr 30: Oops! Obviously definability depends only on the ambient language, not the actual ambient theory… that makes it rather less awful than I suggested.
The set of all definable objects is countable, for obvious reasons.
With this definition, your diagonal trick actually doesn’t work (which is good, because otherwise we’d have a paradox): Definability isn’t a notion expressible in the theory itself, only in the metatheory. Hence if you attempt to “define” something in terms of the set of all definable numbers, the result is only, uh, “metadefinable”.
We could similarly argue that the definable objects should be thought of as “meta-countable” rather than countable, right? The reals-implied-by-a-theory would always be uncountable-in-the-theory. (I’m tempted to imagine a world in which this ended the argument between constructivists and classicists, but realistically, one side or the other would end up feeling uneasy about such a compromise… more likely, both.)
I think you’re confusing levels here. When I spoke of “the surrounding theory” above, I didn’t mean the, uh, actual ambient theory. (Sorry about that—I may have gotten a little mixed up myself) And indeed, like I said, definability only depends on the language, not the theory. Well—of course it still depends on the actual ambient theory. But working internal to that (which I was doing), it only depends on the language. And then one can talk about the metalanguage, staying internal to the same ambient theory, etc… (mind you, all this is assuming that the ambient theory is powerful enough to talk about this sort of thing).
So at no point was I intending to vary the actual ambient theory, like you seem to be talking about.
Warning: I don’t quite understand just how logicians think of these things and so may be confused myself.
You’re correct to point out that I’m being too vague, and I’m making mistakes speaking as if nameable numbers constitute a set or a single alternative to the reals or the rationals.
However, I’ve been a consumer of theorems and proofs that casually use real numbers as if they’re lightweight objects. There is considerable effort involved to parse the underlying concepts out of the theorems and proofs, and re-formalize them using something reasonable (completions of the natural numbers under various operations like additive inverse, multiplicative inverse, square root, roots of polynomials in general, roots of differential equations in general). Those are all different sets of nameable numbers, and they’re all countable.
I would prefer that mathematicians routinely perceived “the reals” as a peculiar construction, and instead of throwing it in routinely when working on concepts in geometry or symmetry as the standard tool to modeling positions and distances, thought about what properties they actually need to get the job they’re doing done.
I know of one mathematician who thinks the real numbers are a peculiar construction in the context of topology because of the pathological things you can do with them — continuous nowhere-differentiable curves, space-filling curves, and so on. That’s why she studies motivic/A1 homotopy theory instead of classical homotopy theory; only polynomial functions are allowed.
I would prefer that mathematicians routinely perceived “the reals” as a peculiar construction, and instead of throwing it in routinely when working on concepts in geometry or symmetry as the standard tool to modeling positions and distances, thought about what properties they actually need to get the job they’re doing done.
Why is it that mathematicians so love the idea of doing their work blindfolded and with their hands tied behind their backs? Someone invented the reals. They’re awesome things. And people invented all sorts of techniques you can use the reals for. Make the most of it! Leave proving stuff about when reals are useful to and how such a peculiar construction can be derived and angsting about how deep and convoluted the basis must be to specialists in angsting about how deep and convoluted the basis for using reals is.
It’s the same as programmers insisting on introducing abstractions decoupling their code from the framework and libraries that they’re using; modularity to prevent dependency hell.
It’s the same as programmers insisting on introducing abstractions decoupling their code from the framework and libraries that they’re using; modularity to prevent dependency hell.
I think it is the same as programmers choosing to use languages with built in support for floating point calculations and importing standard math and stats libraries as appropriate. This is an alternative to rolling your own math functions to model your calculations based off integers or bytes.
Are you suggesting that modeling real numbers with floating point is a good practice?
Yes, it is a standard practice, and it may be the best compromise available for a programmer or a team on a limited time budget, but the enormous differences between real numbers and floating point numbers mean that everything that was true upstream in math-land regarding real numbers, becomes suspect and have to be re-checked, or transformed into corresponding not-quite-the-same theorems and proofs.
If we (downstream consumers of mathematics) could get mathematicians to use realistic primitives, then we could put a lot of numerical analysts out of work (free up their labor to do something more valuable).
Do you think some constructivist representation of numbers can do better than IEEE floats at removing the need for numerical analysis in most engineering tasks, while still staying fast enough? I’m veeeeery skeptical. It would be a huge breakthrough if it were true.
Yes, that’s my position. In fact, if you had hardware support for an intuitionistic / constructivist representation (intervals, perhaps), my bet would be that the circuits would be simpler than the floating-point hardware implementing the IEEE standard now.
I’m not an expert in the field, but it seems to me that intervals require strictly more complexity than IEEE floats (because you still need to do floating-point arithmetic on the endpoints) and will be unusable in many practical problems because they will get too wide. At least that’s the impression I got from reading a lot of Kahan. Or do you have some more clever scheme in mind?
Yes, if you have to embrace the same messy compromises, then I am mistaken. My belief (which is founded on studying logic and mathematics in college, and then software development after college) is that better foundations, with effort, show through to better implementations.
Are you suggesting that modeling real numbers with floating point is a good practice?
Good, certainly. Not a universally optimal practice though. There are times when unlimited precision is preferable, despite the higher computational overhead. There are libraries for that too.
So you would prefer that, instead of having one all-purpose number system that we can embed just about any kind of number we like into (not to mention do all kinds of other things with), we had a collection of distinct number systems, each used for some different ad-hoc purpose? How would this be an improvement?
You might consider the fact that, once upon a time, people actually started with the natural numbers—and then, over the ages, gradually felt the need to expand the system of numbers further and further until they ended up with the standard objects of modern mathematics, such as the real number system.
This was not a historical accident. Each new kind of number corresponds to a new kind of operation people wanted to do, that they couldn’t do with existing numbers. If you want to do subtraction, you need negative numbers; if you want to do division, you need rationals; and if you want to take limits of Cauchy sequences, then you need real numbers.
I don’t understand why this should cause computer-programming types any anxiety. A real number is not some kind of mysterious magical entity; it is just the result of applying an operation, call it “lim”, to an object called a “sequence” (a_n).
Real numbers are used because people want to be able to take limits (the usefulness of doing which was established decisively starting in the 17th century). So long as you allow the taking of limits, you are going to be working with the real numbers, or something equivalent. Yes, you could try to examine every particular limit that anyone has ever taken, and put them all into a special class (or several special classes), but that would be ad-hoc, ugly, and completely unnecessary.
I think you’re being a bit uncharitable here. You’ve just moved the infinitude/”mysterious magicalness” from talking about real numbers to talking about sequences of rational numbers, and it is in fact possible to classify sequences as definable vs. undefinable, as well as computable vs. uncomputable. (Though definability personally seems a bit ad-hoc to me, seeing as it depends on the ambient theory.) I don’t think it’s really extraordinary to claim that an undefinable or uncomputable sequence is a bit mysterious and possibly somehow unreal.
EDIT Apr 30: Oops! Obviously definability depends only on the ambient language, not the actual ambient theory… that makes it rather less ad-hoc than I suggested.
EDIT: I should probably add, though, that this whole argument seems mostly pointless. Aside from where cousin_it and Jonicholas got to talking about how to represent numbers in computers—and that seems to have hardly anything to do with actual real numbers seeing as those can’t be represented in computers—it seems to be basically just Jonicholas saying “I don’t like the reals for common constructivist reasons” and other people saying “Regardless, they’re valid objects of study”. Is there more to it than that?
I think you’re being a bit uncharitable here. You’ve just moved the infinitude/”mysterious magicalness” from talking about real numbers to talking about sequences of rational numbers
That was deliberate. (How was it uncharitable?)
I don’t think it’s really extraordinary to claim that an undefinable or uncomputable sequence is a bit mysterious and possibly somehow unreal.
It may not be extraordinary, but it’s still a confusion. A confusion that was resolved a century ago, when set theory was axiomatized, and the formalist view emerged. The Cantor/Kronecker debate is over: Cantor was right, Kronecker was wrong.
The source of this confusion seems to be a belief that correspondences between mathematical structures and the physical world are properties of the mathematical structures in question, rather than properties of the physical world. This is a kind of map/territory confusion.
My understanding is this thread was started, and to some extent kept rolling, by an unrelated thread, where I was behaving extremely hostile to EY, and several people went through all my back posts, looking for things to downvote. Patrick found something.
To put myself in the clear, I came across this old comment because I was looking through Doug S’s old posts (because I was idly curious). I replied to your comment because I’m ridiculously pedantic, a virtue in mathematics. I haven’t downvoted any of your comments and I harbor no feelings of antipathy towards you. Eliezer’s a big boy and he can take care of himself.
Now, back to the math debate! I don’t think it’s legitimate to conflate countable sets with sets with finite information content. Here are two counter examples. 1. The set of busy beaver numbers (a subset of the naturals). 2. The digits of Chaitin’s Omega [make an ordered pair of (digit, position) to represent these] (see http://en.wikipedia.org/wiki/Chaitin%27s_constant). It’s been proved that these sets can’t be constructed with any algorithm.
“The set of busy beaver numbers” communicates, points to, refers to, picks out, a concept in natural language that we can both talk about. It has finite information content (a finite sequence of ascii characters suffices to communicate it). An analogous sentence in a sufficiently formal language would still have finite information content.
Note that a description, even if it is finite, is not necessarily in the form that you might desire. Transforming the description “the sequence of the first ten natural numbers” into the format “{0, 1, 2, 3, 4, 5, 6, 7, 8, 9}” is easy, but the analogous transformation of “the first 10 busy beaver numbers” is very difficult if not impossible.
As nshepperd points out, an element of a countable set necessarily has finite information content (you can pick it out by providing an integer—“the fifth one” or whatever), while generic elements of uncountable sets cannot be picked out with finite information.
I think the finite information content comes from being an element of a countable set. Like every other real number, the digits of Chaitin’s constant themselves form a countable set (a sequence), while that set is a member of the uncountable R. Similarly, the busy beaver set is a subset of N, and drawn from the uncountable set 2^N.
Countable sets are useful (or rather, uncountable ones are inconvenient) because you can set up a normalized probability distribution over their contents. But… the set {Chaitin’s Constant} is countable (it has one element) but I still can’t get Omega’s digits. So there still seems to be a bit of mystery here.
Those are theories, which are not generally lost...
I really wish I had the time to explicitly write out the reasons why I believe these examples are compelling reasons to use the usual model of the real numbers. I tried, but I’ve already spent too long and I doubt they would convince you anyway.
People can’t pick out specific examples of numbers that are lost by switching to using nameable numbers,
So? Omega could obliterate 99% of the particles in the known universe, and I wouldn’t be able to name a particular one. If it turns out in the future that these nameable numbers have nice theoretic properties, sure. The effort to rebuild the usual theory doesn’t seem to be worth the benefit of getting rid of uncountability. (Or more precisely, one source of uncountability.)
I think I’ve spent enough time procrastinating on this topic. I don’t see it going anywhere productive.
Suppose someone played idly manipulating small objects, like bottlecaps, in the real world. Suppose they formed (inductively) some hypotheses, picked some axioms and a formal system, derived the observed truths as a consequence of the axioms, and went on to derive some predictions regarding particular long or unusual manipulations of bottlecaps.
If the proofs are correct, the conclusions are true regardless of the outcome of experiments. If you believe that mathematics is self-hosting; interesting and relevant and valuable in itself, that may be sufficient for you. However, you might alternatively take the position that contradictions with experiment would render the previous axioms, theorems and proofs less interesting because they are less relevant.
Generic real numbers, because of their infinite information content, are not good models of physical things (positions, distances, velocities, energies) that a casual consumer of mathematics might think they’re natural models of. If you built the real numbers from first-order ZFC axioms, then they do have (via Lowenheim-Skolem) some finite-information-content correspondences—however, those objects look like abstract syntax trees, ramified with details that act as obstacles to finding an analogous structure in the real world.
Of course, whether a number is definable or not depends on the surrounding theory. Stick to first-order theory of the reals and only algebraic numbers will be definable! Definable in ZF? Or what?
EDIT Apr 30: Oops! Obviously definability depends only on the ambient language, not the actual ambient theory… no difference here between ZF and ZFC...
Löwenheim-Skolem only applies to first-order theories. While there are models of the theory of real closed fields that are countable, referring to those models as “the real numbers” is somewhat misleading, because there isn’t only one of them (up to model-theoretic isomorphism).
Also, if you’re going to measure information content, you really need to fix a formal language first, or else “the number of bits needed to express X” is ill-defined.
Basically, learn model theory before trying to wield it.
Also, if you’re going to measure information content, you really need to fix a formal language first, or else “the number of bits needed to express X” is ill-defined.
Basically, learn model theory before trying to wield it.
I don’t know model theory, but isn’t the crucial detail here whether or not the number of bits needed to express X is finite or infinite? If so, then it seems we can handwave the specific formal language we’re using to describe X, in the same way that we can handwave what encoding for Turing Machines generally when talking about Kolmogorov complexity, even though to actually get a concrete integer K(S) representing the Kolmogorovo complexity of a string S requires us to use a fixed encoding of Turing Machines. In practice, we never actually care what the number K(S) is.
Let’s say I have a specific model of the real numbers in mind, and lets pretend “number of bits needed to describe X” means “log2 the length of the shortest theory that proves the existence of X.”
Fix a language L1 whose constants are the rational numbers and which otherwise is the language of linear orders. Then it takes a countable number of propositions to prove the existence of any given irrational number (i.e., exists x[1] such that x[1] < u[1], …, exists y[1] such that y[1] > v[1], …, x[1] = y[1], … x[1] = x[2], …, where the sequences u[n] and v[n] are strict upper and lower bounding sequences on the real number in question).
Now fix a language L2 whose constants are the real numbers. It now requires one proposition to prove the existence of any given irrational number (i.e., exists x such that x = c).
The difference between this ill-defined measure of information and Kolomogrov complexity is that Turing Machines are inherently countable, and the languages and models of model theory need not be.
(Disclaimer: paper-machine2011 knew far more about mathematical logic than paper-machine2016 does.)
The definition “length of the shortest program which minimizes (program length + runtime)” isn’t undecideable, although you could argue that that’s not what we normally mean by number of bits.
Sorry for what might be a silly question, but what do you mean by “generic real number”? In the sense of “one number picked at random from the set”, a “generic natural number” would also be a huge and impossible to encounter—almost all natural numbers would need more bits then are Plank volumes in the universe to represent—and it doesn’t seem that you’re trying to say that.
If you start selecting things at random, then you need a probability distribution. Many routinely used probability distributions over the natural numbers give you a nontrivial chance of being able to fit the number on your computer.
There are, of course, corresponding probability distributions over the reals (take a probability distribution over the natural numbers and give zero probability to anything else). However, the routinely used probability distributions on the reals give zero probability to the output being a natural number, a rational number, describable with a finite algebraic equation, or in fact, being able to fit the number on your computer.
One of the problems with real numbers is that if someone trying to do Bayesian analysis of a sensor that reads 2.000..., or 3.14159… using one of these real number distributions as their prior, cannot conclude that the quantity measured probably is 2 or pi, no matter how many digits of precision the sensor goes out to.
I get that the sensor thing was only an example, but still: it doesn’t seem like a real objection. I mean, you’re not going to have (or need) a sensor with infinity decimals of precision. (Or perhaps I’m not understanding you?)
In terms of “selecting things at random”, for any practical use I can think of you’ll be selecting things like intervals, not real numbers. I don’t quite see how that’s relevant to the formalism you use to reason about how and what you’re calculating.
I think there’s some big understanding gap here. Could you explain (or just point to some standard text), how does one reason about trivial things like areas of circles and volumes of spheres without using reals?
Perhaps you’ve confused the “pi has a decimal expansion goes on forever without seeming pattern” with “a generic real number has a decimal expansion that goes on forever without pattern”? Pi does have a finite representation, “Pi”. We use it all the time, and it’s just as precise as “2″ or “0.5”.
Specifically, you could start with the rationals, and complete it by including all solutions to differential equations. Pi would be included, and many other numbers, but you’d still only have a countable set—because every number would have one or more shortest definitions—finite information content.
If you had a probability distribution over such a set, it would naturally favor hypotheses with short definitions. If it started out including pi as a possibility, and you gathered sufficient sensor data consistent with pi (a finite amount), the probability distribution would give pi as the best hypothesis. This is reasonable behavior. You have to do non-obvious mucking around with your prior to get that sort of reasonableness with standard real-number probabilities.
As others have pointed out, any specific countable system of numbers (such as the “solutions to differential equations” that I mentioned) is susceptible to diagonalization—but I see no reason to “complete” the process, as if being susceptible to diagonalization were a terrible flaw above all others. All the entities that you’re actually manipulating (finite precision data and finite, stateable hypotheses like “exactly 2” and “exactly pi”) are finite-information-content, and “completing” the reals against diagonalization makes essentially all the reals infinite-information-content—a cure in my mind far worse than the disease.
(Note: I’m not arguing in this particular post, just asking clarifying questions, as you seem to have the issues much clearer in your mind than I do.)
1) It seems one can start with naturals, extend them to integers, then to rationals, then to whatever set results from including solutions to differential equations (does that have a standard name?). I imagine there are countably infinite many constructions like that, am I right? They seem to “divide” the numbers “finer” (I’d welcome a hint to more formal description of this), though they aren’t necessarily totally ordered in terms of how “fine” they are, and that the limit of this process after an infinity of extensions seems to be the reals. (Am I missing something important until here? In particular, we can reach the reals much faster, is there some important property in particular the countable extensions have in general, other than their result set being countable and their individual structure?)
2) Do you have other objections to real numbers that do not involve probabilities, probability distributions, and similar information theory concepts?
3) I don’t quite grok your π example. It seems to me that a finite amount of sensor data will always only be able to tell you it’s consistent with all values in the interval π±ε; if you’re using a sufficiently “dense” set, even just the rationals, you’ll have an infinity of values in that interval, while using the reals you’ll have an uncountable one. In the countable case you’ll have to have probabilities for the countable infinity of consistent values, which could result in π being the most probable one, and in the uncountable one you’ll need a probability distribution function, which could as well have π as the most probable. (In particular, I can’t see a reason why you couldn’t find a the probability distribution function that has exactly the same value as your probability function when applied to the values in your π-containing countably-infinite set and is “well-behaved” in some sense on the reals between them; but I’m likely to miss something here.)
I sort-of get that picking π in a countable set can be a finite-information operation and an infinite-information one in an uncountable set (though I’m not quite clear if or why that works on sets at least as “finely divided” as the rationals). But that seems to be a trick of picking the right countable set to contain the value you’re looking for:
If you started estimating π (let’s say, the ratio of diameter to circumference in an euclidean universe) with, say, just the rationals, you may or may not get a “most likely” hypothesis, but it wouldn’t be π; you’ll only estimate that one if you happened to start with a set that contained it. And if you use a set that contains π, there would always be some kind of other number that fits in a “finer”-but-countable set you aren’t using that you might need to estimate (assuming there’s a lot of such sets as I speculate in point 1 above).
Of course, using the reals doesn’t save you from that: you still need an infinite amount of information to find an arbitrary real. But by using probability distributions—even if you construct them by picking a probability function on a countable set and then extending it to the reals somehow—it forces you to think about the parts outside that countable set (i.e., other even “finer” countable sets). In a way, this feels like reminding you of things you didn’t think of.
1) Yes, there are countably many constructions of various kinds of numbers. The construction can presumably be written down, and strings are finite-information-content entities. Yes, they’re normally understood to form a set-theoretic lattice—the integers are a subset of the gaussian integers, and the integers are a subset of the rationals, and both the gaussian and rationals are a subset of the complex plane.
However, the reals are not in any well-defined sense “the” limit of that lattice—you could create a contrived argument that they are, but you could also create an argument that the natural limit is something else, either stopping sooner, or continuing further to include infinities and infinitiesimals or (salient to mathematicians) the complex plane.
Defenders of the reals as a natural concept will use the phrase “the complete ordered field”, but when you examine the definition of “complete” they are referencing, it uses a significant amount of set theory (and an ugly Dedekind cuts construction) to include everything that it wants to include, and exclude many other things that might seem to be included.
2) Yes. I think the reals are a history-laden concept; they were built in fear of set-theoretic and calculus paradoxes, and without awareness of the computational approach—information theory and Godel’s incompleteness. They are useful primarily in the way that C++ is useful to C++ programmers—as a thicket or swamp of complexity that defends the residents from attack. Any mathematician doing useful work in a statistical, calculus-related, or topological field who casually uses the reals will need someone else, a numerical analyst or computer scientist, to laboriously go through their work and take out the reals, replacing them with a computable (and countable) alternative notion—different notions for different results. Often, this effort is neglected, and people use IEEE floats where the mathematician said “real”, and get ridiculous results—or worse, dangerously reasonable results.
3) You’re right that the finite amount of sensor data will only say it is consistent with this interval. As you point out, if there are an uncountable set within that interval, then it’s entirely possible for there to be no single value that is a maximum of the probability distribution function. (That’s an excellent example of some of the ridiculous circumlocutions that come from using uncountable sets, when you actually want the system to come up with one or a few best hypotheses, each of which is stateable.)
Pi is always a finite-information entity. Everything nameable is. It doesn’t become an infinitely large in information content just because you consider it as an element of the reals.
Yes—if you use a probability distribution over the rationals as your prior, and the actual value is irrational, then you can get bad results. I think this is a serious problem, and we should think hard about what bayesian updating with misspecified models looks like (I know Cosma Shalizi has done some work on this), so that we have some idea what failure looks like. We should also think carefully about what we would consider to be a reasonable hypothesis, one that we might eventually come to rest on.
However, it’s a false fork to argue “We wouldn’t use the rationals therefore we should use the reals”. As I’ve been trying to say, the reals are a particular, large, complicated, and deeply historical construction, and we should not expect to encounter them “out in the world”.
Andrej Bauer has implemented actual real number arithmetic (not IEEE nonsense, or “computable reals” which are interesting, but not reals). Peano integers, in his (Ocaml-based) language, RZ, would probably be five or ten lines. (Commutative groups are 13 lines). In contrast, building from commutative groups up to defining reals as sequences of nested intervals takes five pages; see the appendix: http://math.andrej.com/wp-content/uploads/2007/04/rzreals.pdf
Regarding “reminding you of things you didn’t think of”, I think Cosma Shalizi and Andrew Gelman have convincingly argued that Bayesian philosophy/methodology is flawed—we don’t just pick a prior, collect data, do an update, and believe the results. If we were magical uncomputable beasties (Solomonoff induction), possibly that is what we would do. In the real world, there are other steps, including examining the data, including the residual errors, to see if it suggests hypotheses that weren’t included in the original prior. http://www.stat.columbia.edu/~gelman/research/unpublished/philosophy.pdf
Hi John! Thank you very much for taking the time to answer at such length. The links you included were also very interesting, thanks.
I think I got a bit of insight into the original issue (way up in the comments, when I interjected in your chat with Patrick).
With respect to the points closer in this thread, it’s become more like teaching than an actual discussion. I’m much too little educated in the subject, so I could contribute mostly with questions (many inevitably naïve) rather than insights. I’ll stop here then; though I am interested, I’m not interested enough right now to educate myself, so I won’t impose on your time any longer.
(That is, not unless you want to. I can continue if for some reason you’d take pleasure in educating me further.)
Mathematicians routinely use “infinite” to mean “infinite in magnitude”. For example, the concept “The natural numbers” is infinite in magnitude, but I have picked it out using only 19 ascii characters. From a computer science perspective, it is a finite concept—finite in information content, the number of bits necessary to point it out.
Each of the objects in the set of the Peano integers is finite. The set of Peano integers, considered as a whole, is infinite in magnitude, but finite in information content.
Mathematician’s routine speech sometimes sounds as if a generic real number is a small thing, something that you could pick up and move around. In fact, a generic real number (since it’s an element of an uncountable set) is infinite in information content—they’re huge, and impossible to encounter, much less pick up.
Lowenheim-Skolem allows you to transform proofs that, on a straightforward reading, claim to be manipulating generic elements of uncountable sets (picking up and moving around real numbers for example), with proofs that claim to be manipulating elements of countable sets—that is, objects that are finite in information content.
In that tranformation, you will probably introduce “objects” which are something like “the double-struck N”, and those objects certainly still satisfy internal predicates like “InfiniteInMagnitude(the double-struck N)”.
However, you’re never forced to believe that mathematicians are routinely doing impossible things—you can always take a formalist stance, pointing out that mathematicians are actually manipulating symbols, which are small, finite-in-information-content things.
So, given this, what exactly is your complaint? You started off criticizing Eliezer (and whomever else) for saying “The integers are countable, but the real number line is uncountable”—I suppose on the grounds that everything in the physical universe is countable, or something. (You weren’t exactly clear.) But now you point out (correctly) that there is a perfectly good interpretation of this statement which in no way depends on there being an uncountable number of physical things anywhere, or otherwise violates your (not-exactly-well-defined) philosophy. So haven’t you just defeated yourself?
I have a knee-jerk response, railing against uncountable sets in general and the real numbers in particular; it’s not pretty and I know how to control it better now.
I’m fairly confident that for your purposes you could live with the computable numbers (that is: those numbers whose decimal expansion can be computed by ), and as long as you didn’t need anything stronger than integration amenable to quadrature, you’d be just fine.
There are people who take this route, but I can’t think of any off the top of my head. Knuth once stated that he’d like to write a calculus book roughly following this path, but, well, he’s got other things on his mind.
EDIT: I should point out also that the computable numbers are countable (by the usual Godel encoding of whatever machine is rattling off the digits for you), and that for all practical intents and purposes they’re probably equivalent to whatever calculus-related mischief is in play at the moment.
There’s some weirdnesses down that route—for example, it turns out that you can’t distinguish zero from nonzero, so the step function is actually uncomputable.
My contrarian claim is that everyone could live with the nameable numbers—that is, the numbers that can be pointed out using a finite number of books to describe them. People who really strongly care about the uncountability of the reals have a hard time coming up with a concrete example of what they’d miss.
I don’t understand. Those also seem to fall prey to
Also,
Lebesgue measure theory, Gal(C/R) = Z/2Z, and some pathological examples in the history of differential geometry without which the current definition of a manifold would have been much more difficult to ascertain.
Off the top of my head. There are certainly other things I would miss.
I’m confused; this is true for any real closed field. What are you getting at with this?
A mistake. I was thinking of C as the so-called “generic complex numbers.” You’re right that if you replace C with the algebraic closure of whatever countable model’s been dreamed up, then C = R[i] and that’s it.
Admittedly I’m only conjecturing that Gal(C/K) will be different for some K countable, but I think there’s good evidence in favor of it. After all, if K is the algebraic closure of Q, then Gal(C/K) is gigantic. It doesn’t seem likely that one could “fix” the other “degrees of freedom” with only countably many irrationals.
Those are theories, which are not generally lost if you switch the underlying definitions aptly—and they are sometimes improved (if the new definitions are better, or if the switch demonstrates an abstraction that was not previously known).
People can’t pick out specific examples of numbers that are lost by switching to using nameable numbers, they can only gesture at broad classes of numbers, like “0.10147829..., choosing subsequent digits according to no specific rule”. If you can describe a specific example (using Lebesgue measure theory if you like), then that description is a name for that number.
I’m not sure what a “nameable number” is. Whatever countable naming scheme you invent, I can “name” a number that’s outside it by the usual diagonal trick: it differs from your first nameable number in the first digit, and so on. (Note this doesn’t require choice, the procedure may be deterministic.) Switching from reals to nameable numbers seems to require adding more complexity than I’m comfortable with. Also, I enjoy having a notion of Martin-Löf random sequences and random reals, which doesn’t play nice with nameability.
By “nameable number” he seems to just mean a definable number—in general an object is called “definable” if there is some first-order property that it and only it satisfies. (Obviously, this dependson just what the surrouding theory is. Sounds like he means “definable in ZFC”.) The set of all definable objects is countable, for obvious reasons.
With this definition, your diagonal trick actually doesn’t work (which is good, because otherwise we’d have a paradox): Definability isn’t a notion expressible in the theory itself, only in the metatheory. Hence if you attempt to “define” something in terms of the set of all definable numbers, the result is only, uh, “metadefinable”. (I gave myself a real headache once over the idea of the “first undefinable ordinal”; thanks to JoshuaZ for pointing out to me why this isn’t a paradox.)
EDIT: I should point out, using definable numbers seems kind of awful, because they’re defined (sorry, metadefined :P ) in terms of logic-stuff that depends on the surrounding theory. Computable numbers, though more restrictive, might behave a little better, I expect...
EDIT Apr 30: Oops! Obviously definability depends only on the ambient language, not the actual ambient theory… that makes it rather less awful than I suggested.
We could similarly argue that the definable objects should be thought of as “meta-countable” rather than countable, right? The reals-implied-by-a-theory would always be uncountable-in-the-theory. (I’m tempted to imagine a world in which this ended the argument between constructivists and classicists, but realistically, one side or the other would end up feeling uneasy about such a compromise… more likely, both.)
I think you’re confusing levels here. When I spoke of “the surrounding theory” above, I didn’t mean the, uh, actual ambient theory. (Sorry about that—I may have gotten a little mixed up myself) And indeed, like I said, definability only depends on the language, not the theory. Well—of course it still depends on the actual ambient theory. But working internal to that (which I was doing), it only depends on the language. And then one can talk about the metalanguage, staying internal to the same ambient theory, etc… (mind you, all this is assuming that the ambient theory is powerful enough to talk about this sort of thing).
So at no point was I intending to vary the actual ambient theory, like you seem to be talking about.
Warning: I don’t quite understand just how logicians think of these things and so may be confused myself.
You’re correct to point out that I’m being too vague, and I’m making mistakes speaking as if nameable numbers constitute a set or a single alternative to the reals or the rationals.
However, I’ve been a consumer of theorems and proofs that casually use real numbers as if they’re lightweight objects. There is considerable effort involved to parse the underlying concepts out of the theorems and proofs, and re-formalize them using something reasonable (completions of the natural numbers under various operations like additive inverse, multiplicative inverse, square root, roots of polynomials in general, roots of differential equations in general). Those are all different sets of nameable numbers, and they’re all countable.
I would prefer that mathematicians routinely perceived “the reals” as a peculiar construction, and instead of throwing it in routinely when working on concepts in geometry or symmetry as the standard tool to modeling positions and distances, thought about what properties they actually need to get the job they’re doing done.
I know of one mathematician who thinks the real numbers are a peculiar construction in the context of topology because of the pathological things you can do with them — continuous nowhere-differentiable curves, space-filling curves, and so on. That’s why she studies motivic/A1 homotopy theory instead of classical homotopy theory; only polynomial functions are allowed.
Why is it that mathematicians so love the idea of doing their work blindfolded and with their hands tied behind their backs? Someone invented the reals. They’re awesome things. And people invented all sorts of techniques you can use the reals for. Make the most of it! Leave proving stuff about when reals are useful to and how such a peculiar construction can be derived and angsting about how deep and convoluted the basis must be to specialists in angsting about how deep and convoluted the basis for using reals is.
It’s the same as programmers insisting on introducing abstractions decoupling their code from the framework and libraries that they’re using; modularity to prevent dependency hell.
I think it is the same as programmers choosing to use languages with built in support for floating point calculations and importing standard math and stats libraries as appropriate. This is an alternative to rolling your own math functions to model your calculations based off integers or bytes.
Your shot.
Are you suggesting that modeling real numbers with floating point is a good practice?
Yes, it is a standard practice, and it may be the best compromise available for a programmer or a team on a limited time budget, but the enormous differences between real numbers and floating point numbers mean that everything that was true upstream in math-land regarding real numbers, becomes suspect and have to be re-checked, or transformed into corresponding not-quite-the-same theorems and proofs.
If we (downstream consumers of mathematics) could get mathematicians to use realistic primitives, then we could put a lot of numerical analysts out of work (free up their labor to do something more valuable).
Do you think some constructivist representation of numbers can do better than IEEE floats at removing the need for numerical analysis in most engineering tasks, while still staying fast enough? I’m veeeeery skeptical. It would be a huge breakthrough if it were true.
Yes, that’s my position. In fact, if you had hardware support for an intuitionistic / constructivist representation (intervals, perhaps), my bet would be that the circuits would be simpler than the floating-point hardware implementing the IEEE standard now.
I’m not an expert in the field, but it seems to me that intervals require strictly more complexity than IEEE floats (because you still need to do floating-point arithmetic on the endpoints) and will be unusable in many practical problems because they will get too wide. At least that’s the impression I got from reading a lot of Kahan. Or do you have some more clever scheme in mind?
Yes, if you have to embrace the same messy compromises, then I am mistaken. My belief (which is founded on studying logic and mathematics in college, and then software development after college) is that better foundations, with effort, show through to better implementations.
Good, certainly. Not a universally optimal practice though. There are times when unlimited precision is preferable, despite the higher computational overhead. There are libraries for that too.
So you would prefer that, instead of having one all-purpose number system that we can embed just about any kind of number we like into (not to mention do all kinds of other things with), we had a collection of distinct number systems, each used for some different ad-hoc purpose? How would this be an improvement?
You might consider the fact that, once upon a time, people actually started with the natural numbers—and then, over the ages, gradually felt the need to expand the system of numbers further and further until they ended up with the standard objects of modern mathematics, such as the real number system.
This was not a historical accident. Each new kind of number corresponds to a new kind of operation people wanted to do, that they couldn’t do with existing numbers. If you want to do subtraction, you need negative numbers; if you want to do division, you need rationals; and if you want to take limits of Cauchy sequences, then you need real numbers.
I don’t understand why this should cause computer-programming types any anxiety. A real number is not some kind of mysterious magical entity; it is just the result of applying an operation, call it “lim”, to an object called a “sequence” (a_n).
Real numbers are used because people want to be able to take limits (the usefulness of doing which was established decisively starting in the 17th century). So long as you allow the taking of limits, you are going to be working with the real numbers, or something equivalent. Yes, you could try to examine every particular limit that anyone has ever taken, and put them all into a special class (or several special classes), but that would be ad-hoc, ugly, and completely unnecessary.
I think you’re being a bit uncharitable here. You’ve just moved the infinitude/”mysterious magicalness” from talking about real numbers to talking about sequences of rational numbers, and it is in fact possible to classify sequences as definable vs. undefinable, as well as computable vs. uncomputable. (Though definability personally seems a bit ad-hoc to me, seeing as it depends on the ambient theory.) I don’t think it’s really extraordinary to claim that an undefinable or uncomputable sequence is a bit mysterious and possibly somehow unreal.
EDIT Apr 30: Oops! Obviously definability depends only on the ambient language, not the actual ambient theory… that makes it rather less ad-hoc than I suggested.
EDIT: I should probably add, though, that this whole argument seems mostly pointless. Aside from where cousin_it and Jonicholas got to talking about how to represent numbers in computers—and that seems to have hardly anything to do with actual real numbers seeing as those can’t be represented in computers—it seems to be basically just Jonicholas saying “I don’t like the reals for common constructivist reasons” and other people saying “Regardless, they’re valid objects of study”. Is there more to it than that?
That was deliberate. (How was it uncharitable?)
It may not be extraordinary, but it’s still a confusion. A confusion that was resolved a century ago, when set theory was axiomatized, and the formalist view emerged. The Cantor/Kronecker debate is over: Cantor was right, Kronecker was wrong.
The source of this confusion seems to be a belief that correspondences between mathematical structures and the physical world are properties of the mathematical structures in question, rather than properties of the physical world. This is a kind of map/territory confusion.
A good point.
Sorry, uncharitable was the wrong word there. I meant you didn’t address the actual apparent problem. Your new comment does.
No, there’s nothing substantive beyond that.
My understanding is this thread was started, and to some extent kept rolling, by an unrelated thread, where I was behaving extremely hostile to EY, and several people went through all my back posts, looking for things to downvote. Patrick found something.
To put myself in the clear, I came across this old comment because I was looking through Doug S’s old posts (because I was idly curious). I replied to your comment because I’m ridiculously pedantic, a virtue in mathematics. I haven’t downvoted any of your comments and I harbor no feelings of antipathy towards you. Eliezer’s a big boy and he can take care of himself.
Now, back to the math debate! I don’t think it’s legitimate to conflate countable sets with sets with finite information content. Here are two counter examples. 1. The set of busy beaver numbers (a subset of the naturals). 2. The digits of Chaitin’s Omega [make an ordered pair of (digit, position) to represent these] (see http://en.wikipedia.org/wiki/Chaitin%27s_constant). It’s been proved that these sets can’t be constructed with any algorithm.
“The set of busy beaver numbers” communicates, points to, refers to, picks out, a concept in natural language that we can both talk about. It has finite information content (a finite sequence of ascii characters suffices to communicate it). An analogous sentence in a sufficiently formal language would still have finite information content.
Note that a description, even if it is finite, is not necessarily in the form that you might desire. Transforming the description “the sequence of the first ten natural numbers” into the format “{0, 1, 2, 3, 4, 5, 6, 7, 8, 9}” is easy, but the analogous transformation of “the first 10 busy beaver numbers” is very difficult if not impossible.
As nshepperd points out, an element of a countable set necessarily has finite information content (you can pick it out by providing an integer—“the fifth one” or whatever), while generic elements of uncountable sets cannot be picked out with finite information.
I think the finite information content comes from being an element of a countable set. Like every other real number, the digits of Chaitin’s constant themselves form a countable set (a sequence), while that set is a member of the uncountable R. Similarly, the busy beaver set is a subset of N, and drawn from the uncountable set 2^N.
Countable sets are useful (or rather, uncountable ones are inconvenient) because you can set up a normalized probability distribution over their contents. But… the set {Chaitin’s Constant} is countable (it has one element) but I still can’t get Omega’s digits. So there still seems to be a bit of mystery here.
I really wish I had the time to explicitly write out the reasons why I believe these examples are compelling reasons to use the usual model of the real numbers. I tried, but I’ve already spent too long and I doubt they would convince you anyway.
So? Omega could obliterate 99% of the particles in the known universe, and I wouldn’t be able to name a particular one. If it turns out in the future that these nameable numbers have nice theoretic properties, sure. The effort to rebuild the usual theory doesn’t seem to be worth the benefit of getting rid of uncountability. (Or more precisely, one source of uncountability.)
I think I’ve spent enough time procrastinating on this topic. I don’t see it going anywhere productive.
Suppose someone played idly manipulating small objects, like bottlecaps, in the real world. Suppose they formed (inductively) some hypotheses, picked some axioms and a formal system, derived the observed truths as a consequence of the axioms, and went on to derive some predictions regarding particular long or unusual manipulations of bottlecaps.
If the proofs are correct, the conclusions are true regardless of the outcome of experiments. If you believe that mathematics is self-hosting; interesting and relevant and valuable in itself, that may be sufficient for you. However, you might alternatively take the position that contradictions with experiment would render the previous axioms, theorems and proofs less interesting because they are less relevant.
Generic real numbers, because of their infinite information content, are not good models of physical things (positions, distances, velocities, energies) that a casual consumer of mathematics might think they’re natural models of. If you built the real numbers from first-order ZFC axioms, then they do have (via Lowenheim-Skolem) some finite-information-content correspondences—however, those objects look like abstract syntax trees, ramified with details that act as obstacles to finding an analogous structure in the real world.
Of course, whether a number is definable or not depends on the surrounding theory. Stick to first-order theory of the reals and only algebraic numbers will be definable! Definable in ZF? Or what?
EDIT Apr 30: Oops! Obviously definability depends only on the ambient language, not the actual ambient theory… no difference here between ZF and ZFC...
Löwenheim-Skolem only applies to first-order theories. While there are models of the theory of real closed fields that are countable, referring to those models as “the real numbers” is somewhat misleading, because there isn’t only one of them (up to model-theoretic isomorphism).
Also, if you’re going to measure information content, you really need to fix a formal language first, or else “the number of bits needed to express X” is ill-defined.
Basically, learn model theory before trying to wield it.
I don’t know model theory, but isn’t the crucial detail here whether or not the number of bits needed to express X is finite or infinite? If so, then it seems we can handwave the specific formal language we’re using to describe X, in the same way that we can handwave what encoding for Turing Machines generally when talking about Kolmogorov complexity, even though to actually get a concrete integer K(S) representing the Kolmogorovo complexity of a string S requires us to use a fixed encoding of Turing Machines. In practice, we never actually care what the number K(S) is.
Let’s say I have a specific model of the real numbers in mind, and lets pretend “number of bits needed to describe X” means “log2 the length of the shortest theory that proves the existence of X.”
Fix a language L1 whose constants are the rational numbers and which otherwise is the language of linear orders. Then it takes a countable number of propositions to prove the existence of any given irrational number (i.e., exists x[1] such that x[1] < u[1], …, exists y[1] such that y[1] > v[1], …, x[1] = y[1], … x[1] = x[2], …, where the sequences u[n] and v[n] are strict upper and lower bounding sequences on the real number in question).
Now fix a language L2 whose constants are the real numbers. It now requires one proposition to prove the existence of any given irrational number (i.e., exists x such that x = c).
The difference between this ill-defined measure of information and Kolomogrov complexity is that Turing Machines are inherently countable, and the languages and models of model theory need not be.
(Disclaimer: paper-machine2011 knew far more about mathematical logic than paper-machine2016 does.)
Whether a theory proves the existence of X may be an undecideable question.
How many bits it takes to describe X is an undecidable question when defined in other ways, too.
The definition “length of the shortest program which minimizes (program length + runtime)” isn’t undecideable, although you could argue that that’s not what we normally mean by number of bits.
Adding program length and runtime feels to me like a type error.
Sorry for what might be a silly question, but what do you mean by “generic real number”? In the sense of “one number picked at random from the set”, a “generic natural number” would also be a huge and impossible to encounter—almost all natural numbers would need more bits then are Plank volumes in the universe to represent—and it doesn’t seem that you’re trying to say that.
If you start selecting things at random, then you need a probability distribution. Many routinely used probability distributions over the natural numbers give you a nontrivial chance of being able to fit the number on your computer.
There are, of course, corresponding probability distributions over the reals (take a probability distribution over the natural numbers and give zero probability to anything else). However, the routinely used probability distributions on the reals give zero probability to the output being a natural number, a rational number, describable with a finite algebraic equation, or in fact, being able to fit the number on your computer.
One of the problems with real numbers is that if someone trying to do Bayesian analysis of a sensor that reads 2.000..., or 3.14159… using one of these real number distributions as their prior, cannot conclude that the quantity measured probably is 2 or pi, no matter how many digits of precision the sensor goes out to.
I get that the sensor thing was only an example, but still: it doesn’t seem like a real objection. I mean, you’re not going to have (or need) a sensor with infinity decimals of precision. (Or perhaps I’m not understanding you?)
In terms of “selecting things at random”, for any practical use I can think of you’ll be selecting things like intervals, not real numbers. I don’t quite see how that’s relevant to the formalism you use to reason about how and what you’re calculating.
I think there’s some big understanding gap here. Could you explain (or just point to some standard text), how does one reason about trivial things like areas of circles and volumes of spheres without using reals?
Perhaps you’ve confused the “pi has a decimal expansion goes on forever without seeming pattern” with “a generic real number has a decimal expansion that goes on forever without pattern”? Pi does have a finite representation, “Pi”. We use it all the time, and it’s just as precise as “2″ or “0.5”.
Specifically, you could start with the rationals, and complete it by including all solutions to differential equations. Pi would be included, and many other numbers, but you’d still only have a countable set—because every number would have one or more shortest definitions—finite information content.
If you had a probability distribution over such a set, it would naturally favor hypotheses with short definitions. If it started out including pi as a possibility, and you gathered sufficient sensor data consistent with pi (a finite amount), the probability distribution would give pi as the best hypothesis. This is reasonable behavior. You have to do non-obvious mucking around with your prior to get that sort of reasonableness with standard real-number probabilities.
As others have pointed out, any specific countable system of numbers (such as the “solutions to differential equations” that I mentioned) is susceptible to diagonalization—but I see no reason to “complete” the process, as if being susceptible to diagonalization were a terrible flaw above all others. All the entities that you’re actually manipulating (finite precision data and finite, stateable hypotheses like “exactly 2” and “exactly pi”) are finite-information-content, and “completing” the reals against diagonalization makes essentially all the reals infinite-information-content—a cure in my mind far worse than the disease.
(Note: I’m not arguing in this particular post, just asking clarifying questions, as you seem to have the issues much clearer in your mind than I do.)
1) It seems one can start with naturals, extend them to integers, then to rationals, then to whatever set results from including solutions to differential equations (does that have a standard name?). I imagine there are countably infinite many constructions like that, am I right? They seem to “divide” the numbers “finer” (I’d welcome a hint to more formal description of this), though they aren’t necessarily totally ordered in terms of how “fine” they are, and that the limit of this process after an infinity of extensions seems to be the reals. (Am I missing something important until here? In particular, we can reach the reals much faster, is there some important property in particular the countable extensions have in general, other than their result set being countable and their individual structure?)
2) Do you have other objections to real numbers that do not involve probabilities, probability distributions, and similar information theory concepts?
3) I don’t quite grok your π example. It seems to me that a finite amount of sensor data will always only be able to tell you it’s consistent with all values in the interval π±ε; if you’re using a sufficiently “dense” set, even just the rationals, you’ll have an infinity of values in that interval, while using the reals you’ll have an uncountable one. In the countable case you’ll have to have probabilities for the countable infinity of consistent values, which could result in π being the most probable one, and in the uncountable one you’ll need a probability distribution function, which could as well have π as the most probable. (In particular, I can’t see a reason why you couldn’t find a the probability distribution function that has exactly the same value as your probability function when applied to the values in your π-containing countably-infinite set and is “well-behaved” in some sense on the reals between them; but I’m likely to miss something here.)
I sort-of get that picking π in a countable set can be a finite-information operation and an infinite-information one in an uncountable set (though I’m not quite clear if or why that works on sets at least as “finely divided” as the rationals). But that seems to be a trick of picking the right countable set to contain the value you’re looking for:
If you started estimating π (let’s say, the ratio of diameter to circumference in an euclidean universe) with, say, just the rationals, you may or may not get a “most likely” hypothesis, but it wouldn’t be π; you’ll only estimate that one if you happened to start with a set that contained it. And if you use a set that contains π, there would always be some kind of other number that fits in a “finer”-but-countable set you aren’t using that you might need to estimate (assuming there’s a lot of such sets as I speculate in point 1 above).
Of course, using the reals doesn’t save you from that: you still need an infinite amount of information to find an arbitrary real. But by using probability distributions—even if you construct them by picking a probability function on a countable set and then extending it to the reals somehow—it forces you to think about the parts outside that countable set (i.e., other even “finer” countable sets). In a way, this feels like reminding you of things you didn’t think of.
OK, what am I missing?
1) Yes, there are countably many constructions of various kinds of numbers. The construction can presumably be written down, and strings are finite-information-content entities. Yes, they’re normally understood to form a set-theoretic lattice—the integers are a subset of the gaussian integers, and the integers are a subset of the rationals, and both the gaussian and rationals are a subset of the complex plane.
However, the reals are not in any well-defined sense “the” limit of that lattice—you could create a contrived argument that they are, but you could also create an argument that the natural limit is something else, either stopping sooner, or continuing further to include infinities and infinitiesimals or (salient to mathematicians) the complex plane.
Defenders of the reals as a natural concept will use the phrase “the complete ordered field”, but when you examine the definition of “complete” they are referencing, it uses a significant amount of set theory (and an ugly Dedekind cuts construction) to include everything that it wants to include, and exclude many other things that might seem to be included.
2) Yes. I think the reals are a history-laden concept; they were built in fear of set-theoretic and calculus paradoxes, and without awareness of the computational approach—information theory and Godel’s incompleteness. They are useful primarily in the way that C++ is useful to C++ programmers—as a thicket or swamp of complexity that defends the residents from attack. Any mathematician doing useful work in a statistical, calculus-related, or topological field who casually uses the reals will need someone else, a numerical analyst or computer scientist, to laboriously go through their work and take out the reals, replacing them with a computable (and countable) alternative notion—different notions for different results. Often, this effort is neglected, and people use IEEE floats where the mathematician said “real”, and get ridiculous results—or worse, dangerously reasonable results.
3) You’re right that the finite amount of sensor data will only say it is consistent with this interval. As you point out, if there are an uncountable set within that interval, then it’s entirely possible for there to be no single value that is a maximum of the probability distribution function. (That’s an excellent example of some of the ridiculous circumlocutions that come from using uncountable sets, when you actually want the system to come up with one or a few best hypotheses, each of which is stateable.)
Pi is always a finite-information entity. Everything nameable is. It doesn’t become an infinitely large in information content just because you consider it as an element of the reals.
Yes—if you use a probability distribution over the rationals as your prior, and the actual value is irrational, then you can get bad results. I think this is a serious problem, and we should think hard about what bayesian updating with misspecified models looks like (I know Cosma Shalizi has done some work on this), so that we have some idea what failure looks like. We should also think carefully about what we would consider to be a reasonable hypothesis, one that we might eventually come to rest on.
However, it’s a false fork to argue “We wouldn’t use the rationals therefore we should use the reals”. As I’ve been trying to say, the reals are a particular, large, complicated, and deeply historical construction, and we should not expect to encounter them “out in the world”.
Andrej Bauer has implemented actual real number arithmetic (not IEEE nonsense, or “computable reals” which are interesting, but not reals). Peano integers, in his (Ocaml-based) language, RZ, would probably be five or ten lines. (Commutative groups are 13 lines). In contrast, building from commutative groups up to defining reals as sequences of nested intervals takes five pages; see the appendix: http://math.andrej.com/wp-content/uploads/2007/04/rzreals.pdf
Regarding “reminding you of things you didn’t think of”, I think Cosma Shalizi and Andrew Gelman have convincingly argued that Bayesian philosophy/methodology is flawed—we don’t just pick a prior, collect data, do an update, and believe the results. If we were magical uncomputable beasties (Solomonoff induction), possibly that is what we would do. In the real world, there are other steps, including examining the data, including the residual errors, to see if it suggests hypotheses that weren’t included in the original prior. http://www.stat.columbia.edu/~gelman/research/unpublished/philosophy.pdf
Hi John! Thank you very much for taking the time to answer at such length. The links you included were also very interesting, thanks.
I think I got a bit of insight into the original issue (way up in the comments, when I interjected in your chat with Patrick).
With respect to the points closer in this thread, it’s become more like teaching than an actual discussion. I’m much too little educated in the subject, so I could contribute mostly with questions (many inevitably naïve) rather than insights. I’ll stop here then; though I am interested, I’m not interested enough right now to educate myself, so I won’t impose on your time any longer.
(That is, not unless you want to. I can continue if for some reason you’d take pleasure in educating me further.)
Thank you again for sharing your thoughts :-)