Artificial Addition
Suppose that human beings had absolutely no idea how they performed arithmetic. Imagine that human beings had evolved, rather than having learned, the ability to count sheep and add sheep. People using this built-in ability have no idea how it works, the way Aristotle had no idea how his visual cortex supported his ability to see things. Peano Arithmetic as we know it has not been invented. There are philosophers working to formalize numerical intuitions, but they employ notations such as
Plus-Of(Seven, Six) = Thirteen
to formalize the intuitively obvious fact that when you add “seven” plus “six”, of course you get “thirteen”.
In this world, pocket calculators work by storing a giant lookup table of arithmetical facts, entered manually by a team of expert Artificial Arithmeticians, for starting values that range between zero and one hundred. While these calculators may be helpful in a pragmatic sense, many philosophers argue that they’re only simulating addition, rather than really adding. No machine can really count—that’s why humans have to count thirteen sheep before typing “thirteen” into the calculator. Calculators can recite back stored facts, but they can never know what the statements mean—if you type in “two hundred plus two hundred” the calculator says “Error: Outrange”, when it’s intuitively obvious, if you know what the words mean, that the answer is “four hundred”.
Philosophers, of course, are not so naive as to be taken in by these intuitions. Numbers are really a purely formal system—the label “thirty-seven” is meaningful, not because of any inherent property of the words themselves, but because the label refers to thirty-seven sheep in the external world. A number is given this referential property by its semantic network of relations to other numbers. That’s why, in computer programs, the LISP token for “thirty-seven” doesn’t need any internal structure—it’s only meaningful because of reference and relation, not some computational property of “thirty-seven” itself.
No one has ever developed an Artificial General Arithmetician, though of course there are plenty of domain-specific, narrow Artificial Arithmeticians that work on numbers between “twenty” and “thirty”, and so on. And if you look at how slow progress has been on numbers in the range of “two hundred”, then it becomes clear that we’re not going to get Artificial General Arithmetic any time soon. The best experts in the field estimate it will be at least a hundred years before calculators can add as well as a human twelve-year-old.
But not everyone agrees with this estimate, or with merely conventional beliefs about Artificial Arithmetic. It’s common to hear statements such as the following:
“It’s a framing problem—what ‘twenty-one plus’ equals depends on whether it’s ‘plus three’ or ‘plus four’. If we can just get enough arithmetical facts stored to cover the common-sense truths that everyone knows, we’ll start to see real addition in the network.”
“But you’ll never be able to program in that many arithmetical facts by hiring experts to enter them manually. What we need is an Artificial Arithmetician that can learn the vast network of relations between numbers that humans acquire during their childhood by observing sets of apples.”
“No, what we really need is an Artificial Arithmetician that can understand natural language, so that instead of having to be explicitly told that twenty-one plus sixteen equals thirty-seven, it can get the knowledge by exploring the Web.”
“Frankly, it seems to me that you’re just trying to convince yourselves that you can solve the problem. None of you really know what arithmetic is, so you’re floundering around with these generic sorts of arguments. ‘We need an AA that can learn X’, ‘We need an AA that can extract X from the Internet’. I mean, it sounds good, it sounds like you’re making progress, and it’s even good for public relations, because everyone thinks they understand the proposed solution—but it doesn’t really get you any closer to general addition, as opposed to domain-specific addition. Probably we will never know the fundamental nature of arithmetic. The problem is just too hard for humans to solve.”
“That’s why we need to develop a general arithmetician the same way Nature did—evolution.”
“Top-down approaches have clearly failed to produce arithmetic. We need a bottom-up approach, some way to make arithmetic emerge. We have to acknowledge the basic unpredictability of complex systems.”
“You’re all wrong. Past efforts to create machine arithmetic were futile from the start, because they just didn’t have enough computing power. If you look at how many trillions of synapses there are in the human brain, it’s clear that calculators don’t have lookup tables anywhere near that large. We need calculators as powerful as a human brain. According to Moore’s Law, this will occur in the year 2031 on April 27 between 4:00 and 4:30 in the morning.”
“I believe that machine arithmetic will be developed when researchers scan each neuron of a complete human brain into a computer, so that we can simulate the biological circuitry that performs addition in humans.”
“I don’t think we have to wait to scan a whole brain. Neural networks are just like the human brain, and you can train them to do things without knowing how they do them. We’ll create programs that will do arithmetic without us, their creators, ever understanding how they do arithmetic.”
“But Gödel’s Theorem shows that no formal system can ever capture the basic properties of arithmetic. Classical physics is formalizable, so to add two and two, the brain must take advantage of quantum physics.”
“Hey, if human arithmetic were simple enough that we could reproduce it in a computer, we wouldn’t be able to count high enough to build computers.”
“Haven’t you heard of John Searle’s Chinese Calculator Experiment? Even if you did have a huge set of rules that would let you add ‘twenty-one’ and ‘sixteen’, just imagine translating all the words into Chinese, and you can see that there’s no genuine addition going on. There are no real numbers anywhere in the system, just labels that humans use for numbers...”
There is more than one moral to this parable, and I have told it with different morals in different contexts. It illustrates the idea of levels of organization, for example—a CPU can add two large numbers because the numbers aren’t black-box opaque objects, they’re ordered structures of 32 bits.
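To make the levels-of-organization point concrete, here is a minimal Python sketch of ripple-carry addition over 32-bit structures. The names `to_bits`, `from_bits`, `add_bits`, and `add32` are illustrative assumptions, and real hardware uses faster carry schemes, but the principle is the same: the sum falls out of the structure of the representation, with no table of stored facts.

```python
WIDTH = 32  # assumed word size, matching the "32 bits" in the text


def to_bits(n, width=WIDTH):
    """Return n as a little-endian list of bits (least significant first)."""
    return [(n >> i) & 1 for i in range(width)]


def from_bits(bits):
    """Rebuild the integer from a little-endian list of bits."""
    return sum(bit << i for i, bit in enumerate(bits))


def add_bits(a_bits, b_bits):
    """Ripple-carry addition: combine one bit position at a time."""
    carry = 0
    out = []
    for a, b in zip(a_bits, b_bits):
        out.append(a ^ b ^ carry)                # sum bit for this position
        carry = (a & b) | (carry & (a ^ b))      # carry into the next position
    return out  # the final carry is dropped, as in a fixed-width register


def add32(x, y):
    """Add two non-negative integers modulo 2**32 using only bit operations."""
    return from_bits(add_bits(to_bits(x), to_bits(y)))
```

Nothing in this sketch knows about “two hundred” as a special case; `add32(200, 200)` works for the same reason `add32(21, 16)` does.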
But for purposes of overcoming bias, let us draw two morals:
First, the danger of believing assertions you can’t regenerate from your own knowledge.
Second, the danger of trying to dance around basic confusions.
Lest anyone accuse me of generalizing from fictional evidence, both lessons may be drawn from the real history of Artificial Intelligence as well.
The first danger is the object-level problem that the AA devices ran into: they functioned as tape recorders playing back “knowledge” generated from outside the system, using a process they couldn’t capture internally. A human could tell the AA device that “twenty-one plus sixteen equals thirty-seven”, and the AA devices could record this sentence and play it back, or even pattern-match “twenty-one plus sixteen” to output “thirty-seven!”, but the AA devices couldn’t generate such knowledge for themselves.
Which is strongly reminiscent of believing a physicist who tells you “Light is waves”, recording the fascinating words and playing them back when someone asks “What is light made of?”, without being able to generate the knowledge for yourself. More on this theme tomorrow.
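The tape-recorder behavior of the AA devices can be sketched in a few lines of Python. This is a hypothetical illustration, assuming a table called `RECORDED_FACTS` and a function called `aa_device`; the only details taken from the parable are the stored sentences and the “Error: Outrange” message.

```python
# Facts entered by hand from outside the system. The device stores opaque
# labels and has no access to whatever process produced these sentences.
RECORDED_FACTS = {
    ("seven", "six"): "thirteen",
    ("twenty-one", "sixteen"): "thirty-seven",
}


def aa_device(left, right):
    """Play back a recorded fact, or fail if nobody entered one."""
    try:
        return RECORDED_FACTS[(left, right)]
    except KeyError:
        raise LookupError("Error: Outrange") from None
```

The device can recite `aa_device("twenty-one", "sixteen")`, but `aa_device("sixteen", "twenty-one")` fails, because nothing inside it can regenerate the fact that addition commutes.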
The second moral is the meta-level danger that consumed the Artificial Arithmetic researchers and opinionated bystanders—the danger of dancing around confusing gaps in your knowledge. The tendency to do just about anything except grit your teeth and buckle down and fill in the damn gap.
Whether you say, “It is emergent!”, or whether you say, “It is unknowable!”, in neither case are you acknowledging that there is a basic insight required which is possessable, but unpossessed by you.
How can you know when you’ll have a new basic insight? And there’s no way to get one except by banging your head against the problem, learning everything you can about it, studying it from as many angles as possible, perhaps for years. It’s not a pursuit that academia is set up to permit, when you need to publish at least one paper per month. It’s certainly not something that venture capitalists will fund. You want to either go ahead and build the system now, or give up and do something else instead.
Look at the comments above: none are aimed at setting out on a quest for the missing insight which would make numbers no longer mysterious, make “twenty-seven” more than a black box. None of the commenters realized that their difficulties arose from ignorance or confusion in their own minds, rather than an inherent property of arithmetic. They were not trying to achieve a state where the confusing thing ceased to be confusing.
If you read Judea Pearl’s “Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference” then you will see that the basic insight behind graphical models is indispensable to problems that require it. (It’s not something that fits on a T-Shirt, I’m afraid, so you’ll have to go and read the book yourself. I haven’t seen any online popularizations of Bayesian networks that adequately convey the reasons behind the principles, or the importance of the math being exactly the way it is, but Pearl’s book is wonderful.) There were once dozens of “non-monotonic logics” awkwardly trying to capture intuitions such as “If my burglar alarm goes off, there was probably a burglar, but if I then learn that there was a small earthquake near my home, there was probably not a burglar.” With the graphical-model insight in hand, you can give a mathematical explanation of exactly why first-order logic has the wrong properties for the job, and express the correct solution in a compact way that captures all the common-sense details in one elegant swoop. Until you have that insight, you’ll go on patching the logic here, patching it there, adding more and more hacks to force it into correspondence with everything that seems “obviously true”.
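The burglar-alarm intuition can be checked numerically with a three-node network in which burglary and earthquake are independent causes of the alarm. The sketch below is a brute-force enumeration in Python, not Pearl's algorithms, and every probability in it is a made-up illustrative assumption rather than a figure from the book.

```python
from itertools import product

# Illustrative priors and alarm probabilities (assumed, not from Pearl).
P_BURGLAR = 0.01
P_QUAKE = 0.001
P_ALARM = {  # P(alarm | burglar, quake)
    (True, True): 0.95,
    (True, False): 0.90,
    (False, True): 0.30,
    (False, False): 0.001,
}


def joint(burglar, quake, alarm):
    """Joint probability, factored along the graph: B -> A <- E."""
    p = P_BURGLAR if burglar else 1 - P_BURGLAR
    p *= P_QUAKE if quake else 1 - P_QUAKE
    p_alarm = P_ALARM[(burglar, quake)]
    return p * (p_alarm if alarm else 1 - p_alarm)


def prob_burglar(**evidence):
    """P(burglar | evidence), summing the joint over all consistent worlds."""
    numerator = denominator = 0.0
    for b, q, a in product([True, False], repeat=3):
        world = {"burglar": b, "quake": q, "alarm": a}
        if any(world[name] != value for name, value in evidence.items()):
            continue
        p = joint(b, q, a)
        denominator += p
        if b:
            numerator += p
    return numerator / denominator
```

With these assumed numbers, `prob_burglar(alarm=True)` comes out at about 0.87, while `prob_burglar(alarm=True, quake=True)` drops to about 0.03. The earthquake “explains away” the alarm, and no special-case rule was needed to get that retraction.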
You won’t know the Artificial Arithmetic problem is unsolvable without its key. If you don’t know the rules, you don’t know the rule that says you need to know the rules to do anything. And so there will be all sorts of clever ideas that seem like they might work, like building an Artificial Arithmetician that can read natural language and download millions of arithmetical assertions from the Internet.
And yet somehow the clever ideas never work. Somehow it always turns out that you “couldn’t see any reason it wouldn’t work” because you were ignorant of the obstacles, not because no obstacles existed. Like shooting blindfolded at a distant target—you can fire blind shot after blind shot, crying, “You can’t prove to me that I won’t hit the center!” But until you take off the blindfold, you’re not even in the aiming game. When “no one can prove to you” that your precious idea isn’t right, it means you don’t have enough information to strike a small target in a vast answer space. Until you know your idea will work, it won’t.
From the history of previous key insights in Artificial Intelligence, and the grand messes which were proposed prior to those insights, I derive an important real-life lesson: When the basic problem is your ignorance, clever strategies for bypassing your ignorance lead to shooting yourself in the foot.
Well, shooting randomly at a distant target is more likely to produce a bulls-eye than not shooting at all, even though you’re almost certainly going to miss (and probably shoot yourself in the foot while you’re at it). It’s probably better to try to find a way to take off that blindfold. As you suggest, we don’t yet understand intelligence, so there’s no way we’re going to make an intelligent machine without either significantly improving our understanding or winning the proverbial lottery.
“Programming is the art of figuring out what you want so precisely that even a machine can do it.”—Some guy who isn’t famous
Well, shooting randomly is perhaps a bad idea, but I think the best we can do is shoot systematically, which is hardly better (takes exponentially many bullets). So you either have to be lucky, or hope the target isn’t very far, so you don’t need a wide cone to take pot shots at, or hope P=NP.
quadratically many, actually.
EDIT: well, in the case of actual shooting at least.
@Doug & Gray: AGI is a William Tell target. A near miss could be very unfortunate. We can’t responsibly take a proper shot till we have an appropriate level of understanding and confidence of accuracy.
This keeps coming up; is there somewhere this is explained in detail? Also, have possible solutions been looked at, such as constructing the AI in a controlled environment? If so, why wouldn’t any of them work?
Thanks to whoever responds.
Try “The Two Faces of Tomorrow”, by James P. Hogan. Fictional evidence, to be sure, but well thought out fiction that demonstrates the problem well.
Eliezer,
Did you include your own answer to the question of why AI hasn’t arrived yet in the list? :-)
This is a nice post. Another way of stating the moral might be: “If you want to understand something, you have to stare your confusion right in the face; don’t look away for a second.”
So, what is confusing about intelligence? That question is problematic: a better one might be “what isn’t confusing about intelligence?”
Here’s one thing I’ve pondered at some length. The VC theory states that in order to generalize well a learning machine must implement some form of capacity control or regularization, which roughly means that the model class it uses must have limited complexity (VC dimension). This is just Occam’s razor.
But the brain has on the order of 10^12 synapses, and so it must be enormously complex. How can the brain generalize, if it has so many parameters? Are the vast majority of synaptic weights actually not learned, but rather preset somehow? Or, is regularization implemented in some other way, perhaps by applying random changes to the value of the weights (this would seem biochemically plausible)?
Also, the brain has a very high metabolic cost, so all those neurons must be doing something valuable.
This is what some philosophers have proposed; others have thought we start as a blank slate. The research into the subject has shown that babies do start with some sort of working model of things. That is, we begin life with a set of preset preferences, the ability to distinguish those preferences, and a basic understanding of geometric shapes.
It would be shocking if we didn’t have preset functions. Calves, for example, can walk almost straight away and swim not much later. We aren’t going to entirely eliminate the mammalian ability to start with a set of preset features; there just isn’t enough pressure against keeping a few of them.
If you put a newborn whose mother had an unmedicated labor on the mother’s stomach, the baby will move up to a breast and start to feed.
Conversely, studies with newborn mammals have shown that if you deprive them of something as simple as horizontal lines, they will grow up unable to distinguish lines that approach ‘horizontalness’. So even separating the most basic evolved behavior from the most basic learned behavior is not intuitive.
The deprivation you’re talking about takes place over the course of days and weeks—it reflects the effects of (lack of) reinforcement learning, so it’s not really germane to a discussion of preset functions that manifest in the first few minutes after birth.
It’s relevant insofar as we shouldn’t make assumptions on what is and is not preset simply based on observations that take place in a “typical” environment.
Ah, a negative example. Fair point. Guess I wasn’t paying enough attention and missed the signal you meant to send by using “conversely” as the first word of your comment.
That was lazy of me, in retrospect. I find that often I’m poorer at communicating my intent than I assume I am.
Illusion of transparency strikes again!
Good point. Drink (food), breathe, scream and a couple of cute reactions to keep caretakers interested. All you need to bootstrap a human growth process. There seems to be something built in about eye contact management too—because a lack there is an early indicator that something is wrong.
Not terribly relevant to your point, but it’s likely human sense of cuteness is based on what babies do rather than the other way around.
I’d replace “human” with “mammalian”—most young mammals share a similar set of traits, even those that aren’t constrained as we are by big brains and a pelvic girdle adapted to walking upright. That seems to suggest a more basal cuteness response; I believe the biology term is “baby schema”.
Other than that, yeah.
Artificial Neural Networks have been trained with millions of parameters. There are a lot of different methods of regularization like dropconnect or sparsity constraints. But the brain does online learning. Overfitting isn’t as big of a concern because it doesn’t see the data more than once.
On the other hand, architecture matters. The most successful neural network for a given task has connections designed for the structure of that task, so that it will learn much more quickly than a fully-connected or arbitrarily connected network.
The human brain appears to have a great deal of information and structure in its architecture right off the bat.
I’m not saying that you’re wrong, but the state of the art in computer vision is weight sharing, which biological NNs probably can’t do. Hyperparameters like the number of layers and how local the connections should be are important, but they don’t give that much prior information about the task.
I may be completely wrong, but I do suspect that biological NNs are far more general purpose and less “pre-programmed” than is usually thought. The learning rules for a neural network are far simpler than the functions they learn. Training neural networks with genetic algorithms is extremely slow.
Architecture of the V1 and V2 areas of the brain, which Convolutional Neural Networks and other ANNs for vision borrow heavily from, is highly geared towards vision, and includes basic filters that detect stripes, dots, corners, etc. that appear in all sorts of computer vision work. Yes, no backpropagation or weight-sharing is directly responsible for this, but the presence of local filters is still what I would call very specific architecture (I’ve studied computer vision and inspiration it draws from early vision specifically, so I can say more about this).
The way genetic algorithms tune weights in an ANN (and yes, this is an awful way to train an ANN) is very different from the way they work in actually evolving a brain; working on the genetic code that develops the brain. I’d say they are so wildly different that no conclusions from the first can be applied to the second.
During a single individual’s life, Hebbian and other learning mechanisms in the brain are distinct from gradient learning, but can achieve somewhat similar things.
The human brain appears to engage in hierarchical learning, which is what allows it to leverage huge amounts of “general case” abstract knowledge in attacking novel specific problems put before it.
That’s not how William Tell managed it. He had to practice aiming at less-dangerous targets until he became an expert, and only then did he attempt to shoot the apple.
It is not clear to me that it is desirable to prejudge what an artificial intelligence should desire or conclude, or even possible to purposefully put real constraints on it in the first place. We should simply create the god, then acknowledge the truth: that we aren’t capable of evaluating the thinking of gods.
But it shouldn’t conclude that throwing large asteroids at Yellowstone is a good idea, nor desire to do it. If you follow this strategy, you’ll doom us. Simple as that.
Adding to DanBurFoot, is there a link you want to point to that shows your real, tangible results for AI, based on your superior methodology?
I think that one of the difficulties inherent in monotonous logics comes from the fact that real numbers are not very good at representing things continuous. In order to define a single point, an infinite number of digits are needed and thus an infinite amount of information. Often mathematicians ignore this. To them, using the symbol 2 to represent a continuous quantity is the same as the symbol 2.000…, which seems to make for all kinds of weird paradoxes caused by the use of, often implied, infinite digits. For example, logicians seem to be unable to make a distinction between 1.999… and 2 (where they take two as meaning 2.000...), thus two different definable real numbers represent the same point.
When using real numbers that represent continuous values, I often wonder if we shouldn’t always be using the number of digits to represent some kind of uncertainty. Using significant digits is one of the first things students learn in university; they are crucial for experiments in the real world, because they allow us to quantify the uncertainty in the digits we write down. Yet mathematicians and logicians seem to ignore them in favor of paradoxical infinities. I wonder if by using uncertainty in this way, we might not do away with Gödel’s theorem and define arithmetic within a certain amount of relative uncertainty inherent to our measuring instruments and reasoning machinery.
For what it’s worth, Benoit Essiambre, the things you have just said are nonsense. The reason logicians seem to be unable to make a distinction between 1.999… and 2 is that there is no distinction. They are not two different definable real numbers, they are the same definable real number.
Except that 1.9999… < 2
Edit: here’s the proof that I’m wrong mathematically (from the provided Wikipedia link): “Multiplication of 9 times 1 produces 9 in each digit, so 9 × 0.111… equals 0.999… and 9 × 1⁄9 equals 1, so 0.999… = 1”
No, these are two different ways of writing the same number.
The easiest example I’ve come across is:
If (1 ÷ 3 = 0.333...) and (0.999… ÷ 3 = 0.333...) then (1 = 0.999...).
Ok. Interesting.
I can see and agree that 1.999… can in the limit equal two, whereas any finite representation would still be less than 2.
I don’t consider them to be “the same number” in that sense… even though they algebraically equate (once the limit is reached) in a theoretical framework that can encompass infinities.
ie, in maths, I’d equate them but in the “real world”—I’d treat them separately.
Edit: and reading further… it seems I’m wrong again. Of course, the whole point of putting ”...” is to represent the fact that this is the limit of the decimal expansion of 0.999… to infinity.
therefore yep, 1.999… = 2
Where my understanding failed me is that 1.999… does not in fact represent the summation of the infinite set of 1 + 0.9 + 0.09 + … which summation could, in fact, simply not be taken to its full limit. The representation “1.999...” can only represent either the set or the limit of the set, and mathematical convention has it as the latter, not the former.
Note also that it has to denote the limit, because we want it to denote a number, and the other object you describe (a sequence rather than a set, strictly speaking) isn’t a number, just, well, a sequence of numbers.
This is the part I take issue with.
It does not have to denote a number, but we choose to let it denote a number (rather than a sequence) because that is how mathematicians find it most convenient to use that particular representation.
That sequence is also quite useful mathematically—just not as useful as the number-that-represents-the-limit. Many sequences are considered to be useful… though generally not in algebra—it’s more common in Calculus, where such sequences are extremely useful. In fact I’d say that in calculus “just a sequence” is perhaps even more useful than “just a number”.
My first impression (and thus what I originally got wrong) was that 1.999… represented the sequence and not the limit because, really, if you meant 2, why not just say 2? :)
If we wanted to talk about the sequence we would never denote it 1.999… We would write {1, 1.9, 1.99, 1.999, …} and perhaps give the formula for the Nth term, which is 2 − 10^-N.
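The Nth-term formula can be checked with exact rational arithmetic. This is a small Python sketch under one assumption the comment leaves implicit: N is counted from zero, so that the first term is 1. The names `term` and `partial_sum` are illustrative.

```python
from fractions import Fraction


def term(n):
    """Closed form for the Nth term of 1, 1.9, 1.99, ... (N counted from 0)."""
    return 2 - Fraction(1, 10**n)


def partial_sum(n):
    """The same term built digit by digit: 1 + 0.9 + 0.09 + ... (n nines)."""
    return 1 + sum(Fraction(9, 10**k) for k in range(1, n + 1))


def gap(n):
    """Distance from the Nth term to 2; exactly 10**-n."""
    return 2 - term(n)
```

Every individual term is strictly less than 2, yet `gap(n)` can be made smaller than any positive number by taking `n` large enough. That is what it means for the limit of the sequence, which is the number that 1.999… denotes by convention, to equal 2.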
Hi Misha, I might also turn that argument back on you and repeat what I said before: “if you meant 2, why not just say 2?” It’s as valid as “if you meant the sequence, why not just write {1, 1.9, 1.99, 1.999, …}”?
Clearly there are other reasons for using something that is not the usual convention. There are definitely good reasons for representing infinite series or sequences… as you have pointed out. However—there is no particular reason why mathematics has chosen to use 1.999… to mean the limit, as opposed to the actual infinite series. Either one could be equally validly used in this situation.
It is only by common convention that mathematics uses it to represent the actual limit (as n tends to infinity) instead of the other possibility, which would be “the actual limit as n tends to infinity… if we actually take it to infinity, or an infinitesimal less than the limit if we don’t”, which is how I assumed (incorrectly) that it was to be used.
However, the other thing you say, that “we never denote it 1.999...”, brings up an interesting thought, and if I grasp what you’re saying correctly, then I disagree with you.
As I’ve mentioned in another comment now—mathematical symbolic conventions are the same as “words”—they are map, not territory. We define them to mean what we want them to mean. We choose what they mean by common consensus (motivated by convenience). It is a very good idea to follow that convention—which is why I decided I was wrong to use it the way I originally assumed it was being used… and from now on, I will use the usual convention...
However, you seem to be saying that you think the current way is “the one true way” and that the other way is not valid at all… ie that “we would never denote it 1.9999...” as being some sort of basis of fact out there in reality, when really it’s just a convention that we’ve chosen, and is therefore non-obvious from looking at the symbol without the prior knowledge of the convention (as I did).
I am trying to explain that this is not the case: without knowing the convention, either meaning is valid… it’s only having now been shown the convention that I know what is generally “by definition” meant by the symbol, and it happened to be different from what I automatically picked without prior knowledge.
So yes, I think we would never denote the sequence as 1.999…, but not because the sequence is not representable by 1.999…; simply because it is conventional not to.
You have a point. I tend to dislike arguments about mathematics that start with “well, this definition is just a choice” because they don’t capture any substance about any actual math. As a result, I tried to head that off by (perhaps poorly) making a case for why this definition is a reasonable choice.
In any case, I misunderstood the nature of what you were saying about the convention, so I don’t think we’re in any actual disagreement.
If I meant 2, I would say 2. However, our system of writing repeating decimals also allows us to (redundantly) write the repeating decimal 1.999… which is equivalent to 2. It’s not a very useful repeating decimal, but it sometimes comes out as a result of an algorithm: e.g. when you multiply 2⁄9 = 0.222… by 9, you will get 1.999… as you calculate it, instead of getting 2 straight off the bat.
Me too! Especially as I’ve just been reading that sequence here about “proving by definition” and “I can define it any way I like”… that’s why I tried to make it very clear I wasn’t saying that… I also needed to head off the heading off ;)
Anyway—I believe we are just in violent agreement here, so no problems ;)
OK, let me put it this way: If we are considering the question “Is 1.999...=2?”, the context makes it clear that we must be considering the left hand side as a number, because the RHS is a number. (Would you interpret 2 in that context as the constant 2 sequence? Well then of course they’re not equal, but this is obvious and unenlightening.) Why would you compare a number for equality against a sequence? They’re entirely different sorts of objects.
“Is x-squared = 2?” is a perfectly valid question to ask in mathematics, even though the LHS is not obviously a number.
In this case, it is a formula that can equate to a number… just as the sequence is a (very limited) formula that can equate to 2 - if we take the sequence to its limit; or that falls just shy of 2 - if we try and represent it in any finite/limited way.
In stating that 1.9999… is a number, you are assuming the usage of the limit/number, rather than the other potential usage ie, you are falling into the same assumption-trap that I fell into… It’s just that your assumption happens to be the one that matches with common usage, whereas mine wasn’t ;)
Using 1.9999… to represent the limit of the sequence (ie the number) is certainly true by convention (ie “by definition”), but is by no means the only way to interpret the symbols. It could just as easily represent the sequence itself… we just don’t happen to do that; we define what mathematical symbols refer to… they’re just the word/pointers to what we’re talking about, yes?
Er… yes it is? In that context, x^2 is a number. We just don’t know what number it might be. By contrast, the sequence (1, 1.9, 1.99, …) is not a number at all.
Furthermore, even if we insist on regarding x^2 as a formula with a free variable, your analogy doesn’t hold. The sequence (1, 1.9, 1.99, …) has no free variables; it’s one specific sequence.
You are correct that the convention could have been that 1.999… represents the sequence… but as I stated before, in that case, the question of whether it equals 2 would not be very meaningful. Given the context you can deduce that we are using the convention that it designates a number.
yes I agree, a sequence is not a number, it’s sequence… though I wonder if we’re getting confused, because we’re talking about the sequence, instead of the infinite series (1 + 0.9 + 0.09 +...) which is actually what I had in my head when I was first thinking about 1.999...
Along the way, somebody said “sequence” and that’s the word I started using… when really I’ve been thinking about the infinite series.… anyway
The infinite series has far less freedom than x^2, but that doesn’t mean that it’s a different thing entirely from x^2.
Let’s consider “x − 1”
“x −1 ” is not a number, until we equate it to something that lets us determine what x is…
If we use: “x −1 =4 ” however. We can solve-for-x and there are no degrees of freedom.
If we use “1.9 < x −1 < 2” we have some minor degree of freedom… and only just a few more than the infinite series in question.
Admittedly, the only degree of freedom left to 1.9999… (the series) is to either be 2 or an infinitesimal away from 2. But I don’t think that makes it different in kind to x −1 = 4
anyway—I think we’re probably just in “violent agreement” (as a friend of mine once used to say) ;)
All the bits that I was trying to really say we agree over… now we’re just discussing the related maths ;)
Ok, let’s move into hypothetical land and pretend that 1.9999… represents what I originally thought it represents.
The comparison with the number 2 provides the meaning that what you want to do is to evaluate the series at its limit.
It’s totally supportable for you to equate 1.9999… = 2 and determine that this is a statement that is: 1) true when the infinite series has been evaluated to the limit 2) false when it is represented in any finite/limited way
Edit: ah… that’s why you can’t use stars for to-the-power-of ;)
Er, no… there still seems to be quite a bit of confusion here...
Well, if you really think that’s not significant… :P
It’s not clear to me what distinction you’re drawing here. A series is a sequence, just written differently.
It’s not at all clear to me what notion of “degrees of freedom” you’re using here. The sequence is an entirely different sort of thing than x^2, in that one is a sequence, a complete mathematical object, while the other is an expression with a free variable. If by “degrees of freedom” you mean something like “free variables”, then the sequence has none. Now it’s true that, being a sequence of real numbers, it is a function from N to R, but there’s quite a difference between the expression 2-10^(-n), and the function (i.e. sequence) n |-> 2-10^(-n) ; yes, normally we simply write the latter as the former when the meaning is understood, but under the hood they’re quite different. In a sense, functions are mathematical, expressions are metamathematical.
When I say “x^2 is a number”, what I mean is essentially, if we’re working under a type system, then it has the type “real number”. It’s an expression with one free variable, but it has type “real number”. By contrast, the function x |-> x^2 has type “function from reals to reals”, the sequence (1, 1.9, 1.99, …) has type “sequence of reals”… (I realize that in standard mathematics we don’t actually technically work under a type system, but for practical purposes it’s a good way to think, and I’m pretty sure it’s possible to sensibly formulate things this way.) To equate a sequence to a number may technically in a sense return “false”, but it’s better to think of it as returning “type error”. By contrast, equating x^2 to 2 - not equating the function x|->x^2 to 2, which is a type error! - allows us to infer that x^2 is also a number.
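To make the type-system picture concrete, here is a minimal sketch in Python. It assumes exact fractions as a stand-in for “real number”, and the names nines and strict_equals are made up for illustration; it is one way to model the distinction, not a claim about how mathematics is formally typed.

```python
from fractions import Fraction
from typing import Callable

Real = Fraction                      # stand-in for "real number" in this sketch
RealSequence = Callable[[int], Real]  # a sequence of reals: a function from N to R


def nines(n: int) -> Real:
    """n-th term of (1, 1.9, 1.99, ...), i.e. 2 - 10^(-n)."""
    return 2 - Fraction(1, 10 ** n)


def strict_equals(left: object, right: Real) -> bool:
    """Compare with a number, refusing anything that is not itself a number."""
    if not isinstance(left, Fraction):
        raise TypeError(f"cannot compare {type(left).__name__} with a number")
    return left == right


x = Fraction(3, 2)
print(strict_equals(x ** 2, Fraction(2)))  # x^2 has type "number", so this is a fair question

try:
    strict_equals(nines, Fraction(2))      # the sequence itself is not a number
except TypeError as error:
    print("type error:", error)
```

Here nines(3) is a number and can be compared with 2, while nines (the whole function) cannot; that is the difference between the expression 2-10^(-n) and the sequence n |-> 2-10^(-n).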
Note, BTW, that the real numbers don’t have any infinitesimals (save for 0, if you count it).
Sorry, what does it even mean for it to be “represented in a finite/limited way”? The alternative to it being a number is it being an infinite sequence, which is, well, infinite.
I am really getting the idea you should go read the standard stuff on this and clear up any remaining confusion that way, rather than try to argue this here...
ah—then I apologise. I need to clarify. I see that there are several points where you’ve pointed out that I am using mathematical language in a sloppy fashion. How about I get those out of the way first.
I should not have used the word “infinitesimal”—as I really meant “a very small number” and was being lazy. I am aware that “the theory of infinitesimals” has an actual mathematical meaning… but this is not the way in which I was using the word. I’ll explain what I meant in a bit..
If I write a program that starts by adding 1 to 0.9 then I put it into a loop where it then adds “one tenth of the previous number you just added”...
If at any point I tell the program “stop now and print out what you’ve got so far”… then what it will print out is something that is “a very small number” less than 2.
If I left the program running for literally an infinite amount of time, it would eventually reach two. If I stop at any point at all (ie the program is finite), then it will return a number that is a very small amount less than two.
In this way, the program has generated a finite approximation of 1.999… that is != 2
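The program described above could be sketched like this. It uses exact fractions instead of floats so that rounding does not hide the gap; the function name partial_sum is made up for illustration.

```python
from fractions import Fraction


def partial_sum(steps):
    """Start at 1, add 0.9, then keep adding one tenth of the previous addend."""
    total = Fraction(1)
    addend = Fraction(9, 10)
    for _ in range(steps):
        total += addend
        addend /= 10
    return total


# "Stop now and print out what you've got so far" at a few different points.
for steps in (1, 5, 20):
    gap = 2 - partial_sum(steps)
    print(steps, "additions, still short of 2 by", gap)
```

After n additions the total is short of 2 by exactly 1/10^n, which is positive for every finite n.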
As humans, we can think about the problem in a way that a stupid computer algorithm cannot, and can prove to ourselves that 1+(0.111.. * 9) actually == 2 exactly. but that is knowledge outside of the proposed “finite” solution/system as described above.
Thus the two are different “representations” of 1.999...
I am reminded of the old engineering adage that “3 is a good approximation of Pi for all practical purposes”—which tends to make some mathematicians squirm.
x^2 has one degree of freedom. x can be any real number
1 < x < 1.1 has less freedom than that. It can be any real number between 1 and 1.1
With the previous description I’ve given of the difference between the results of a “finite” and “infinite” calculation of the limit of 1.999… (the series), “x = 1.999...” can be either 2 (if we can go to the limit or can think about it in a way outside of the summing-the-finite-series method) or a very small number less than two (if we begin calculating but have to stop calculating for some weird reason, such as running out of time before the heat-death of the universe).
The “freedom” involved here is even more limited than the freedom of 1 < x < 1.1 and would not constitute a full “degree” of freedom in the mathematical sense. But in the way that I have already mentioned above (quite understanding that this may not be the full mathematically approved way of reasoning about it)… it can have more than one value (given the previously-stated contexts) and thus may be considered to have some “freedom”. …even if it’s only between “2″ and “a very, very small distance from 2”
I’d like to think of it as a fractional degree of freedom :)
Firstly—there is no surprise that you are unfamiliar with my background… as I haven’t specifically shared it with you. But I happen to have actually started in a maths degree. I had a distinction average, but didn’t enjoy it enough… so I switched to computing. I’m certainly not a total maths expert (unlike my Dad and my maths-PhD cousin) but I would say that I’m fairly familiar with “the standard stuff”. Of course… as should be obvious—this does not mean that errors do not still slip through (as I’ve recently just clearly learned).
Secondly—with respect, I think that some of the confusion here is that you are confused as to what I’m talking about… that is totally my fault for not being clear—but it will not be cleared up by me going away and researching anything… because I think it’s more of a communication issue than a knowledge-based one.
So… back to the point at hand.
I think I get what you’re trying to say with the type-error example. But I don’t know that you quite get what I’m saying. That is probably because I’ve been saying it poorly…
I don’t know if you’ve programmed in typeless programming languages, but my original understanding is more along the lines of:
Let’s say I have this object, and on the outside it’s called “1.999...”
When I ask it “how do I calculate your value?” it can reply “well, you add 1 to 0.9 and then 0.09 and then 0.009...” and it keeps going on and on… and if I write it down as it comes out… it looks just like the Infinite Series.
So then I ask it “what number do you equate to if I get to the end of all that addition?” and it says “2″ - and that looks like the Limit
I could even ask it “do you equal two?” and it could realise that I’m asking it to calculate its limit and say “yes”
But then I actually try the addition in the Series myself… and I go on and on and on… and each next value looks like the next number in the Sequence
but eventually I get bored and stop… and the number I have is not quite 2… almost, but not quite… which is the Finite Representation that I keep talking about.
Then you can see that this object matches all the properties that I have mentioned in my previous discussion… no type-errors required, and each “value” comes naturally from the given context.
That “object” is what I have in my head when I’m talking about something that can be both the number and the sequence, and in which it can reveal the properties of itself depending on how you ask it.
...it’s also a reasonably good example of duck-typing ;)
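A rough sketch of the object described in this comment, in Python. The class and method names are hypothetical, and the limit is computed from the closed form of a geometric series rather than by “finishing” the addition.

```python
from fractions import Fraction
from itertools import islice


class RepeatingNines:
    """Hypothetical object for '1.999...' that answers different questions."""

    def terms(self):
        """The Series, term by term: 1, 0.9, 0.09, ..."""
        yield Fraction(1)
        addend = Fraction(9, 10)
        while True:
            yield addend
            addend /= 10

    def partial_sums(self):
        """The Sequence of running totals: 1, 1.9, 1.99, ..."""
        total = Fraction(0)
        for term in self.terms():
            total += term
            yield total

    def limit(self):
        """The Limit, via the geometric series formula: 1 + (9/10) / (1 - 1/10)."""
        return 1 + Fraction(9, 10) / (1 - Fraction(1, 10))

    def __eq__(self, other):
        """'Do you equal two?' is read as a question about the limit."""
        return self.limit() == other

    def stop_after(self, count):
        """The Finite Representation: get bored after `count` terms."""
        return sum(islice(self.terms(), count))


nines = RepeatingNines()
print(nines == 2)            # asks about the limit
print(nines.stop_after(5))   # a finite stopping point, short of 2
```

Which answer you get depends entirely on which method you call; the object never has to decide whether it “is” the number or the sequence.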
We want it to denote a number for simple consistency. .11111… is a number. It is a limit. 3.14159… should denote a number. Why should 1.99999… be any different? If we are going to be at all consistent in our notation they should all represent the same sort of series. Otherwise this is extremely irregular notation to no end.
Yes, I totally agree with you: consistency and convenience are why we have chosen to use 1.9999… notation to represent the limit, rather than the sequence.
Consistency and convenience tend to drive most mathematical notational choices (with occasional other influences), for reasons that should be extremely obvious.
It just so happened that, on this occasion, I was not aware enough of either the actual convention, or of other “things that this notation would be consistent with” before I guessed at the meaning of this particular item of notation.
And so my guessed meaning was one of the two things that I thought would be “likely meanings” for the notation.
In this case, my guess was for the wrong one of the two.
I seem to be getting a lot of comments that are implying that I should have somehow naturally realised which of the two meanings was “correct”… and have tried very hard to explain why it is not obvious, and not somehow inevitable.
Both of my possible interpretations were potentially valid, and I’d like to insist that the sequence-one is wrong only by convention (ie maths has to pick one or the other meaning… it happens to be the most convenient for mathematicians, which happens in this case to be the limit-interpretation)… but as is clearly evidenced by the fact that there is so much confusion around the subject (ref the wikipedia page) - it is not obvious intuitively that one is “correct” and one is “not correct”.
I maintain that without knowledge of the convention, you cannot know which is the “correct” interpretation. Any assumption otherwise is simply hindsight bias.
There is no inherent meaning to a set of symbols scrawled on paper. There is no “correct” and “incorrect” way of interpreting it; only convention (unless your goal is to communicate with others). There is no Platonic Ideal of Mathematical Notation, so obviously there is no objective way to pluck the “correct” interpretation of some symbols out of the interstellar void. You are right in as far as you say that.
However, you are expected to know the meaning of the notation you use in exactly the same way that you are expected to know the meaning of the words you use. Not knowing is understandable, but observing that it is possible to not-know a convention is not a particular philosophical insight.
People guess the meanings of words and notations from context all the time. Especially when they aren’t specialists in the field in question. Lots of interested amateurs exist and read things without the benefit of years of training before hand.
Some things just lend themselves more easily to guessing the accepted-meaning than others. It is often a good idea to make things easier to guess the accepted-meaning, rather than to fail to do so, if at all possible. Make it hard to fail.
Another argument that may be more convincing on a gut level:
9x(1/9) is exactly equal to 1, correct?
Find the decimal representation of 1⁄9 using long division: 1/9=0.11111111… (note there is no different or superior way to represent this number as a decimal)
9x(1/9) = 9x(0.11111111...)=0.9999999… which we already agreed was exactly equal to 1.
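The long-division step can be checked mechanically. This is a small sketch with a made-up helper name; it produces only finitely many digits, so it illustrates the pattern rather than proving anything about the infinite expansion.

```python
from fractions import Fraction


def long_division_digits(numerator, denominator, count):
    """First `count` decimal digits of numerator/denominator, for 0 <= numerator < denominator."""
    digits = []
    remainder = numerator
    for _ in range(count):
        remainder *= 10
        digits.append(remainder // denominator)
        remainder %= denominator
    return digits


ninth = long_division_digits(1, 9, 8)
times_nine = [9 * digit for digit in ninth]  # no carries, since each product is a single digit

print("1/9     = 0." + "".join(map(str, ninth)) + "...")
print("9 x 1/9 = 0." + "".join(map(str, times_nine)) + "...")
print("exactly:", 9 * Fraction(1, 9))
```

The remainder is 1 at every step, so the digit 1 repeats forever, and multiplying digit by digit gives a 9 in every place.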
Yes :)
See my previous (edited) comment above.
Oh sorry, my bad. I should have read the thread. Or the link.
No problem. It is a great proof (there aren’t many so simple and succinct). Just bad luck on timing ;)
For what it’s worth (and why do I have to pay karma to reply to this comment, I don’t get it) there is an infinitesimal difference between the two. An infinitesimal is just like infinity in that it’s not a real number. For all practical purposes it is equal to zero, but just like infinity, it has useful mathematical purposes in that it isn’t exactly equal to zero. You could plug an infinitesimal into an equation to show how close you can get to zero without actually getting there. If you just replaced it with zero the equation could come out undefined or something.
Likewise using 1.999… because of the property that it isn’t exactly equal to 2 but is practically equal to 2, could be useful.
er… I’m not sure if this is the right way to look at it.
1.999999… is 2. Exactly 2. The thing is, there is an infinitesimal difference between ‘2’ and ‘2’. 1.999999.… isn’t “Two minus epsilon”, it’s “The limit of two minus epsilon as epsilon approaches zero”, which is two.
EDIT: And to explain the following objection:
Yes, absolutely. That’s part of the point of infinity. One way of looking at certain kinds of infinity (note that there are several kinds of infinity) is that infinity is one of our placeholders for where rules break down.
This is one of those things that isn’t worth arguing over at all, but I will anyways because I’m interested. I’m probably wrong because people much smarter than me have thought about this before, but this still doesn’t make any sense to me at all.
1.9 is just 2 minus 0.1, right? And 1.99 is just 2 minus 0.01. Each time you add another 9, you are dividing the number you are subtracting by 10. No matter how many times you divide 0.1 by ten, you will never exactly reach zero. And if it’s not exactly zero, then two minus the number isn’t exactly two.
Even if you do it 3^^^3 times, it will still be more than zero. Weird things happen when you apply infinity, but can it really change a rule that is true for all finite numbers? You can say it approaches 2 but that’s not the same as it ever actually reaching it. Does this make any sense?
Interesting… three down-votes but only one soul kind enough to point out why what I said was wrong (thank you ciphergoth).
I find that quite disappointing—especially as I’ve seen some deliberate troll-baiting receive fewer down-votes.
Yes, by “take a proper shot” I meant shooting at the proper target with proper shots. And yes, practice on less-dangerous targets is necessary, but it’s not sufficient.
I agree we can’t accurately evaluate superintelligent thoughts, but that doesn’t mean we can’t or shouldn’t try to affect what it thinks or what its goals are.
I couldn’t do this argument justice. I encourage interested readers to read Eliezer’s paper on coherent extrapolated volition.
Nominull, I kind of agree that they are the same at the limit of infinite digits (assuming by 2 you mean 2.000...). It just seems to me that working with numbers that are subject to this kind of limit is the wrong approach to mathematics if we want maths to be tied to something real in this universe, especially when the limit is implicit and hidden in the notation.
No, by 2 I mean 1.999...
A_A
Benoit,
1,9999… can only be the same (or equal) to 2 in some kind of imaginary world. The number 1,999… where there is an infinity of 9′s does not “exist” in so far as it cannot be “represented” in a finite amount of space or time. The only way out is to “represent” infinity by (...). So you represent something infinite by something finite, thus avoiding a serious problem. But then stating that 1,999… is equal to 2 becomes a tautology.
Of course mathematicians now are used to dealing with infinities. They can manipulate them any which way they want. But in the end, infinity has no equivalent in the “real” world. It is a useful abstraction.
So back to arithmetic. We can only “count” because our physical world is a quantum world. We have units because the basic elements are units, like elementary particles. If the real world were a continuum, there would be no arithmetic. Furthermore, arithmetic is a feature of the macroscopic world. When you look closer, it breaks down. In quantum physics, 1+1 is not always equal to two. You can have many particles in the same quantum state that are indistinguishable. How do you count sheep when you can’t distinguish them?
I don’t see anything “obvious” in stating that 1+1=2. It’s only a convention. “1″ is a symbol. “2” is another symbol. Trace it back to the “real” world, and you find that to have one object plus another of the same object (but distinct) requires subtle physical conditions.
On another note, arithmetic is a recent invention for humanity. Early people couldn’t count to more than about 5, if not 3. Our brain is not that good at counting. That’s why we learn arithmetic tables by heart, and count with our fingers. We have not “evolved” as arithmeticians.
If we were on wikipedia, I could add [Citation needed] to this statement :)
Also—can you specify what you mean by “recent”: 10,000 years? 4,000 years? 800 years? Last week ?
“Trace it back to the “real” world, and you find that to have one object plus another of the same object (but distinct) requires subtle physical conditions.”
Are there objects and this notion of “same but distinct” in the “real” world? I think if you stop at objects, you haven’t traced back far enough. (By the way has there been much/any discussion of objects on LW that I’ve missed?)
I agree that infinity is an abstraction. What I’m trying to say is that this concept is often abused when it is taken as implicit in real numbers.
“We can only “count” because our physical world is a quantum world. We have units because the basic elements are units, like elementary particles. If the real world were a continuum, there would be no arithmetic.”
I don’t see it that way. In Euclid’s book, variables are assigned to segment lengths and other geometries that tie algebra to geometric interpretations. IMO, when mathematics strays away from something that can be interpreted physically it leads to confusion and errors.
What I’d like to see is a definition of real numbers that is closer to reality and that allows us to encode our knowledge of reality more efficiently. A definition that does not allow abstract limits and infinite precision. Using the “significant digits” interpretation seems to be a step in the right direction to me as all of our measurement and knowledge is subject to some kind of error bar.
We could, for example, define a set of real numbers such that we always use as many digits as needed so that the quantization error from the limited number of digits is at least a hundred times smaller than the error in the value we are measuring. This way, the error caused by the use of this real number system would always explain less than 1% of the variance of our measurements based on it.
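As a rough sketch of the rule proposed here: assume rounding to d decimal places has a worst-case error of 0.5 x 10^-d, and pick the smallest d that keeps this below one hundredth of the measurement error. The function name and the choice to pass the error as a decimal string are assumptions made for illustration.

```python
from decimal import Decimal


def decimal_places_needed(measurement_error, ratio=100):
    """Fewest decimal places whose worst-case rounding error is below measurement_error / ratio.

    measurement_error is passed as a string (e.g. "0.01") so it is read exactly.
    """
    target = Decimal(measurement_error) / ratio
    places = 0
    # Rounding to `places` decimals can be off by at most 5 / 10^(places + 1).
    while Decimal(5) / Decimal(10) ** (places + 1) >= target:
        places += 1
    return places


print(decimal_places_needed("0.01"))  # measurement known to about +/- 0.01
print(decimal_places_needed("0.5"))   # measurement known to about +/- 0.5
```

Note that this only bounds the error of storing a single value; it says nothing about what happens once many such values are combined.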
This also seems to require that we distinguish mathematics on natural numbers, which represent countable whole items, from mathematics that represents continuous scales, which would be best represented by the real number system with limited significant digits.
Now this is just an idea, I’m just an amateur mathematician but I think it could resolve a lot of issues and paradoxes in mathematics.
1.9999… = 2 is not an “issue” or a “paradox” in mathematics.
If you use a limited number of digits in your calculations, then your quantization errors can accumulate. (And suppose the quantity you are measuring is the difference of two much larger numbers.)
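A small illustration of the parenthetical, using Python’s decimal module to limit the number of significant digits. The two input values are invented for the example.

```python
from decimal import Decimal, localcontext


def subtract_with_digits(a, b, digits):
    """Round a and b to `digits` significant digits, then subtract."""
    with localcontext() as ctx:
        ctx.prec = digits
        # Unary plus applies the context's rounding to each operand.
        return (+Decimal(a)) - (+Decimal(b))


a, b = "1234567.891", "1234567.123"   # hypothetical measurements
print("10 digits:", subtract_with_digits(a, b, 10))
print(" 7 digits:", subtract_with_digits(a, b, 7))
```

With ten digits the difference comes out as 0.768; with seven digits both inputs are rounded to whole numbers first and the difference comes out as 1, even though each input was individually accurate to better than one part in a million.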
Of course it’s possible that there’s nothing in the real world that corresponds exactly to our so-called “real numbers”. But until we actually know what smaller-scale structure it is that we’re approximating, it would be crazy to pick some arbitrary “lower-resolution” system and hope it matches the world better. That’s doing for “finiteness” what Eliezer has somewhere or other complained about people doing for “complexity”.
″...mathematics that represent continuous scales which would be best represented by the real numbers system with the limited significant digits.”
If you limit the number of significant digits, your mathematics are discrete, not continuous. I’m guessing the concept you’re really after is the idea of computable numbers. The set of computable numbers is a dense countable subset of the reals.
“Pocket calculators work by storing a giant lookup table of arithmetical facts”.
you can’t create a lookup table without proper math.
With the graphical-network insight in hand, you can give a mathematical explanation of exactly why first-order logic has the wrong properties for the job, and express the correct solution in a compact way that captures all the common-sense details in one elegant swoop.
Consider the following example, from Menzies’s “Causal Models, Token Causation, and Processes”[*]:
An assassin puts poison in the king’s coffee. The bodyguard responds by pouring an antidote in the king’s coffee. If the bodyguard had not put the antidote in the coffee, the king would have died. On the other hand, the antidote is fatal when taken by itself and if the poison had not been poured in first, it would have killed the king. The poison and the antidote are both lethal when taken singly but neutralize each other when taken together. In fact, the king drinks the coffee and survives.
We can model this situation with the following structural equation system:
A = true
G = A
S = (A and G) or (not-A and not-G)
where A is a boolean variable denoting whether the Assassin put poison in the coffee or not, G is a boolean variable denoting whether the Guard put the antidote in the coffee or not, and S is a boolean variable denoting whether the king Survives or not.
According to Pearl and Halpern’s definition of actual causation, the assassin putting poison in the coffee causes the king to survive, since changing the assassin’s action changes the king’s survival when we hold the guard’s action fixed. This is clearly an incorrect account of causation.
IMO, graphical models and related techniques represent the biggest advance in thinking about causality since Lewis’s work on counterfactuals (though James Heckman disagrees, which should make us a bit more circumspect). But they aren’t the end of the line, even if we restrict our attention to manipulationist accounts of causality.
[*] The paper is found here. As an aside, I do not agree with Menzies’s proposed resolution.
Um, this sounds not correct. The assassin causes the bodyguard to add the antidote; if the bodyguard hadn’t seen the assassin do it, he wouldn’t have so added. So if you compute the counterfactual the Pearlian way, manipulating the assassin changes the bodyguard’s action as well, since the bodyguard causally descends from the assassin.
Right—and according to Pearl’s causal beam method, you would first note that the guard sustains the coffee’s (non)deadliness-state against the assassin’s action, which ultimately makes you deem the guard the cause of the king’s survival.
Furthermore, if you draw the graph the way Neel seems to suggest, then the bodyguard is adding the antidote without dependence on the actions of the assassin, and so there is no longer any reason to call one “assassin” and the other “bodyguard”, or one “poison” and the other “antidote”. The bodyguard in that model is trying to kill the king as much as the assassin is, and the assassin’s timely intervention saved the king as much as the bodyguard’s.
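The two readings in this sub-thread can be compared directly by evaluating the structural equations under an intervention on A. This is only a sketch of the three equations given above, with made-up function names; it is not an implementation of the full Halpern-Pearl definition or of Pearl’s causal beam method.

```python
def survives(assassin, guard):
    """S = (A and G) or (not-A and not-G)."""
    return (assassin and guard) or (not assassin and not guard)


def king_survives(assassin, guard=None):
    """Evaluate the model under do(A = assassin).

    With guard=None the equation G = A is left intact, so the guard reacts
    to the assassin. Passing a value for guard holds G fixed at that value.
    """
    if guard is None:
        guard = assassin  # G = A
    return survives(assassin, guard)


print("actual world:                ", king_survives(True))
print("do(A=false), guard reacts:   ", king_survives(False))
print("do(A=false), guard held true:", king_survives(False, guard=True))
```

When the guard is allowed to respond, removing the poison leaves the king alive, so survival does not depend on the assassin. Only when G is held at its actual value does changing A flip the outcome, which is the contingency the original objection relies on.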
“But until we actually know what smaller-scale structure”.
From http://en.wikipedia.org/wiki/Planck_Length: “Combined, these two theories imply that it is impossible to measure position to a precision greater than the Planck length, or duration to a precision greater than the time a photon traveling at c would take to travel a Planck length”
Therefore, one could in fact say that all time- and distance- derived measurements can in fact be truncated to a fixed number of decimal places without losing any real precision, by using precisions based on the Planck Length. There’s no point in having precision smaller than the limits in the quote above, as anything smaller is unobservable in our current understanding of physics.
That length is approximately 1.6 x 10^-35 meters, and the corresponding time duration is approximately 5.4 x 10^-44 seconds.
“When the basic problem is your ignorance, clever strategies for bypassing your ignorance lead to shooting yourself in the foot.”
I like this lesson. It rings true to me, but the problem of ego is not one to be overlooked. People like feeling smart and having the status of being a “learned” individual. It takes a lot of courage to profess ignorance in today’s academic climate. We are taught that we have such sophisticated techniques to solve really hard problems. There are armies of scientists and engineers working to advance our society every minute. But who stops and asks “if these guys (and gals) are so smart, why is it that such fundamental ignorance still exists in so many fields”? Yes, there are our current theories, but how many of them are truly impressive? How many logically follow from the context vs. how many took a truly creative breakthrough? The myth of reductionism promises steady progress, but it is the individual who gets inspired. It boils down to humility. Man is too arrogant to admit that he is still clueless on many fundamental problems. How could that possibly be true if we are all so smart in our modern age? Who amongst you will admit when something that seems very sophisticated actually makes no sense? You’ll probably just feel stupid for not understanding, but the problem is not necessarily with you. Dogma creeps into any organization of people, and science is no different. We assume our level of understanding in certain subjects applies equally to all. Until people have the courage to question very fundamental assumptions on how we approach new problems, we will not progress, or worse, we will find much work has been done on a faulty foundation. Figuring out the right question to ask is the most important hurdle of all. But who has time when we are judged not by the quality of our thought but by the quantity? Some very important minds only produced a handful of papers, but they were worth reading...
anonymous—I’d like to second that motion
I read a book on the philosophy of set theory, and I get lost right at the point where classical infinite thought was replaced by modern infinite thought. IIRC the problem was paradoxes based on infinite recursion (Zeno et al.) and finding mathematical foundations to satisfy calculus limits. Then something about Cantor, cardinality and some hand wavy ‘infinite sets are real!’.
1.999… is just an infinite set summation of finite numbers 1 + 0.9 + 0.09 + …
Now, how an infinite process on an infinite set can equal an integer is a problem I still grapple with. Classical theory said that this was nonsense since one would never finish the summation (if one were to begin). I tend to agree and I suppose one could say I see infinity as a verb and not a noun.
I suggest anyone who believes 1.999… === 2 really looks into what that means. The root of the argument isn’t “What is the number between 1.999… and 2?” but rather “Can we say that 1.999… is a sensible theoretical concept?”
It was nonsense in classical theory. Infinite sum has its own separate definition.
There are times in modern mathematics that infinite numbers are used. This is not one of them.
I doubt I’m the best at explaining what limits are, so I won’t bother. I may be able to tell you what they aren’t. They give results similar to the intuitive idea of infinite numbers, but they don’t do it in the most intuitively obvious way. They don’t use infinite numbers. They use a certain property that at most one number will have in relation to a sequence. In the case of 1, 1.9, 1.99, …, this number is two. In the case of 1, 0, 1, 0, …, there is no such number, so the sequence is said not to converge.
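The “certain property” is the usual one: for any tolerance you name, however small, the terms eventually get and stay within that tolerance of the number. A sketch for the sequence 1, 1.9, 1.99, … with hypothetical helper names:

```python
from fractions import Fraction


def term(n):
    """n-th term of 1, 1.9, 1.99, ..., i.e. 2 - 10^(-n)."""
    return 2 - Fraction(1, 10 ** n)


def index_within(epsilon):
    """Smallest n with |term(n) - 2| < epsilon; every later term is closer still."""
    n = 0
    while abs(term(n) - 2) >= epsilon:
        n += 1
    return n


for epsilon in (Fraction(1, 10), Fraction(1, 1000), Fraction(1, 10 ** 12)):
    print("within", epsilon, "of 2 from term", index_within(epsilon))
```

The loop always terminates for a positive tolerance, and no infinite number appears anywhere: the definition only ever talks about finite terms and finite tolerances. For 1, 0, 1, 0, … no candidate passes the same test, since any number is at least 1/2 away from either 0 or 1.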
No. The question is “Can we make a sensible theoretical way to interpret the numeral 1.999..., that approximately matches our intuitions?” It wasn’t easy, but we managed it.
1.999… does not equal 2 - it just tends towards 2
For all practical purposes, you could substitute one for the other.
But in theory, you know that 1.9999… is always just below 2, even though it creeps ever closer.
If we ever found a way to magickally “reach infinity” they would finally meet… and be “equal”.
Edit: The numbers are always going to be slightly different in a finite-space, but equate to the same thing when you allow infinities. ie mathematically, in the limit, they equate to the same value, but in any finite representation, they are different.
Further Edit: According to mathematical convention, the notation “1.999...” does refer to the limit. therefore, “1.999...” strictly refers to 2 (not to any finite case that is slightly less than two).
The issue with AI has nothing to do with ignorance or arrogance. The basic problem is that intelligence can’t be meaningfully defined or meaningfully quantified. Documented fact: Richard Feynman had a measured I.Q. of 120. Documented fact: Marilyn Vos Savant had a measured I.Q. of 180 or 200, depending on which test you place more faith in. Documented fact: Feynman made a huge breakthrough in physics, Vos Savant has accomplished nothing worth mentioning in her life. I.Q. measurements fail to measure intelligence in any meaningful way.
Here’s another fact for you. Louis Terman collected a group of so-called “geniuses” sieved by their high I.Q. scores. Two future nobel prize winners, Shockley and Alvarez, got tested but discarded by Terman’s I.Q. tests and weren’t part of the group.
Question: What does this tell you about current methods for measuring intelligence?
There is no evidence that people can meaningfully define or objectively measure intelligence. Rule of thumb: if you can’t define it and you can’t measure it objectively, you can’t do science about it.
[Remainder of gigantic comment truncated by editor.]
I’m surprised nobody brought this up at the time, but it’s telling that you’ve only picked out examples of humans when discussing intelligence, not bacteria or rocks or the color blue. I submit that the property is not as unknowable as you would suggest.
Nobel prizes aren’t based only on intelligence, but also on drive, persistence and also a little luck (mainly the luck to find something interesting to work on that nobody else has yet solved).
After all “1% inspiration and 99% perspiration” yes?
Drive and persistence are part of intelligence, at least in the sense that any useful AI would have to have them. Saying it measures luck is just saying that it’s imprecise.
That said, it’s not going to measure all the different components of intelligence in the way we want.
Nobel prizes are measuring something (or, more likely, a bunch of things), but is it a good match for what we mean by intelligence?
The problem isn’t that it can’t be meaningfully defined or quantified. The problem is that it hasn’t been. I have no idea how hard it is to do that. It may very well be beyond anything any human can do, but it’s theoretically possible.
In the hypothetical universe, addition certainly could be defined; it’s just that nobody in that universe knew how.
Intelligence is a multidimensional concept that is not amenable to any single definition or quantification. Take for instance the idea of “the size of a tree.” Size could mean height, drip radius, mass, volume of smallest convex polyhedron that contains the whole organism, volume of water displaced if the tree was immersed in a tank, trunk girth at 6 feet, etc. The tallest redwood is taller than the tallest sequoia, but isn’t the sequoia bigger? Why is it bigger? Because it has greater mass? But what of the biggest banyan? It has a greater mass than both the redwood and the sequoia.
The problem with intelligence is not that it’s not quantifiable, but that different researchers use different mapping functions all the while pretending they’re measuring the exact same thing, heaping up the confusion. If you pick one specific mental activity (arithmetic, visual memory, music-compositional ability, language processing), it is rarely very difficult to measure and rank people by their adeptness. If, on the other hand, you try to come up with a “good” way to map many different intelligences together onto some scale, you’re going to be terrible at using this scale to predict individual performance at specific tasks. Further, individuals with low IQ (or some other attempted measure of general intelligence) may be brilliant at specific tasks because of their low IQ: so much of their brain is dedicated to that task that they have little left over for anything else. This is especially true of many autistic individuals.
In the end, intelligence is rather easy to define if you recognize it as the multifaceted phenomenon that it is.
Better question: why do you insist that those examples are of failures to acknowledge intelligence when you also insist that we are unable to meaningfully define intelligence?
mclaren, your comment is way too long. I have truncated it and emailed you the full version. Feel free to post the comment to your blog, then post a link to the blog here.
Anonymous (re Planck scales etc.), sure you can truncate your representations of lengths at the Planck length, and likewise for your representations of times, but this doesn’t simplify your number system unless you have acceptable ways of truncating all the other numbers you need to use. And, at present, we don’t. Sure, maybe really the universe is best considered as some sort of discrete network with some funky structure on it, but that doesn’t give us any way of simplifying (or making more appropriate) our mathematics until we know just what sort of discrete network with what funky structure. (And I think every sketch-of-a-theory we currently have along those lines still uses continuously varying quantities as quantum “amplitudes”, too.)
James (re mathematics and infinite sets and suchlike), it seems unfair to criticize something as being handwavy when you demonstrably don’t remember it clearly; how do you know that the vagueness is in the thing itself rather than your recollection? There is a perfectly clear and simple definition of what a sum like 1 + 9⁄10 + 9⁄100 + … means (which, btw, is surely enough to call it “a sensible theoretical concept”), and what that particular one means is 2. If you have a different definition, or a different way of doing mathematics, that you like better, then feel free to adopt it and do mathematics that way; if you end up with a theory at least as coherent, useful and elegant as the usual one then perhaps it’ll catch on.
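For reference, the usual definition alluded to here can be written out as follows. This is the standard textbook computation rather than anything specific to this thread; the symbol s_n for the n-th partial sum is introduced only for this sketch.

```latex
s_n = 1 + \sum_{k=1}^{n} \frac{9}{10^{k}}
    = 1 + \left(1 - 10^{-n}\right)
    = 2 - 10^{-n},
\qquad
\lim_{n \to \infty} s_n = 2 .
```

The infinite sum is defined to be that limit, so under the usual definition it is exactly 2, not something that merely approaches 2.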
Anonymous (re humility, reductionism, etc.): I think your comment consisted mostly of applause lights. Science is demonstrably pretty good at questioning fundamental assumptions (consider, say, heliocentricity, relativity, quantum mechanics, continental drift); what evidence have you that more effort should go into questioning them than currently does? (Clearly some should, and does. Clearly much effort spent that way is wasted, and produces pseudoscience or merely frustration. The question is how to apportion the effort.)
Thanks g for the tip about computable numbers, that’s pretty much what I had in mind. I didn’t quite get from the Wikipedia article if these numbers could or could not replace the reals for all of useful mathematics but it’s interesting indeed.
James, I share your feelings of uneasiness about infinite digits, as you said, the problem is not that these numbers will not represent the same points at the limit but that they shouldn’t be taken to the limit so readily as this doesn’t seem to add anything to mathematics but confusion.
@James:
If I recall my Newton correctly, the only way to take this “sum of an infinite series” business consistently is to interpret it as shorthand for the limit of an infinite series. (Cf. Newton’s Principia Mathematica, Lemma 2. The infinitesimally wide parallelograms are dubitably real, but the area under the curve between the sets of parallelograms is clearly a real, definite area.)
@Benoit:
Why shouldn’t we take 1.9999… as just another, needlessly complicated (if there’s no justifying context) way of writing “2”? Just as I could conceivably count “1, 2, 3, 4, d(5x)/dx, 6, 7” if I were a crazy person.
Benquo, I see two possible reasons:
1) ‘2’ leads to confusion as to whether we are representing a real or a natural number. That is, whether we are counting discrete items or we are representing a value on a continuum. If we are counting items then ‘2’ is correct.
2) If it is clear that we are representing numbers on a continuum, I could see the number of significant digits used as an indication of the amount of uncertainty in the value. For any real problem there is always uncertainty caused by A) the measuring instrument and B) the representation system itself such as the computable numbers which are limited by a finite amount of digits (although we get to choose the uncertainty here as we choose the number of digits). This is one of the reasons the infinite limits don’t seem useful to me. They don’t correspond to reality. The implicit limits seem to lead to sloppiness in dealing with uncertainty in number representation.
For example I find ambiguity in writing 1⁄3 = 0.333… However, 1.000/3.000 = 0.333 or even 1.000.../3.000...=0.333… make more sense to me as it is clear where there is uncertainty or where we are taking infinite limits.
Benoit Essiambre,
Right now Wikipedia’s article is claiming that calculus cannot be done with computable numbers, but a Google search turned up a paper from 1968 which claims that differentiation and integration can be performed on functions in the field of computable numbers. I’ll go and fix Wikipedia, I suppose.
eh? maths is well defined and well structured etc. intuitive thinking isn’t and so can’t be encoded into a computer program very easily, that was the whole point of minsky’s paper! are you a bit thick or something??
Benoit Essiambre,
You say:
“1) ‘2’ leads to confusion as to whether we are representing a real or a natural number. That is, whether we are counting discrete items or we are representing a value on a continuum.”
If I recall correctly, this “confusion” is what allowed modern, atomic chemistry. Chemical substances—measured as continuous quantities—seem to combine in simple natural-number ratios. This was the primary evidence for the existence of atoms.
What is the practical negative consequence of the confusion you’re trying to avoid?
You also say:
“2) If it is clear that we are representing numbers on a continuum, I could see the number of significant digits used as an indication of the amount of uncertainty in the value. For any real problem there is always uncertainty caused by A) the measuring instrument and B) the representation system itself such as the computable numbers which are limited by a finite amount of digits (although we get to choose the uncertainty here as we choose the number of digits). This is one of the reasons the infinite limits don’t seem useful to me. They don’t correspond to reality. The implicit limits seem to lead to sloppiness in dealing with uncertainty in number representation.”
But wouldn’t good sig-fig practice round 1.999… up to something like 2.00 anyway?
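To make the rounding point concrete, here is a small sketch using Python's `decimal` module. The helper name `round_to_places` and the choice of round-half-up are assumptions made for illustration; the comment above does not specify a rounding rule.

```python
from decimal import Decimal, ROUND_HALF_UP

def round_to_places(text, places):
    """Round a decimal string to a fixed number of places, keeping trailing zeros."""
    quantum = Decimal(10) ** -places  # e.g. places=2 -> Decimal('0.01')
    return Decimal(text).quantize(quantum, rounding=ROUND_HALF_UP)

# A measured 1.9999 reported to two decimal places (three significant figures).
print(round_to_places("1.9999", 2))
```

The trailing zeros carry the precision information, which is the role the significant-figure convention is meant to play.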
Benoit, it was “Cyan” and not me who mentioned computable numbers.
Benoit, you assert that our use of real numbers leads to confusion and paradox. Please point to that confusion and paradox.
Also, how would your proposed number system represent pi and e? Or do you think we don’t need pi and e?
Well, for example, the fact that two different reals represent the same point (2.00… and 1.99…), and the fact that they are not computable in a finite amount of time. Pi and e are quite representable within a computable number system; otherwise we couldn’t reliably use pi and e on computers!
Benoit, those are two different ways of writing the same real, just like 0.333… and 1⁄3 (or 1.0/3.0, if you insist) are the same number. That’s not a paradox. 2 is a computable number, and thus so are 2.000… and 1.999..., even though you can’t write down those ways of expressing them in a finite amount of time. See the definition of a computable number if you’re confused.
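One common way to phrase the definition, sketched loosely: a real number is computable if some program, given n, returns a rational within 10^-n of it. The function names below are hypothetical, and the stopping rule for e relies on the standard tail bound that the terms after 1/m! sum to less than 2/(m+1)!.

```python
import math
from fractions import Fraction

def two_plain(n):
    """Approximate 2 the boring way: the answer is exact for every n."""
    return Fraction(2)

def two_via_nines(n):
    """1.99...9 with n nines, i.e. 2 - 10**-n, which is within 10**-n of 2."""
    return 2 - Fraction(1, 10**n)

def e_approx(n):
    """Partial sum of sum(1/k!), extended until the tail bound is <= 10**-n."""
    total, m = Fraction(1), 0  # start with the k = 0 term
    while Fraction(2, math.factorial(m + 1)) > Fraction(1, 10**n):
        m += 1
        total += Fraction(1, math.factorial(m))
    return total

print(float(e_approx(10)))
```

Both `two_plain` and `two_via_nines` meet the same accuracy contract for the same number, so on this definition they are two programs for one computable real, not two reals.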
1.999… = 2.000… = 2. Period.
Benoit,
In the decimal numeral system, every number with a terminating decimal representation also has a non-terminating one that ends with recurring nines. Hence, 1.999… = 2, 0.74999… = 0.75, 0.986232999… = 0.986233, etc. This isn’t a paradox, and it has nothing to do with the precision with which we measure actual real things. This sort of recurring representation happens in any positional numeral system.
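A quick way to check these identities exactly is to sum the recurring tail as a geometric series. The sketch below does that with exact fractions; `value_with_recurring_tail` is a made-up name, and the `base` parameter is there only to illustrate the remark about other positional systems.

```python
from fractions import Fraction

def value_with_recurring_tail(prefix_digits, base=10):
    """Exact value of 0.d1 d2 ... dk followed by the digit (base - 1) forever."""
    k = len(prefix_digits)
    prefix = sum(Fraction(d, base**(i + 1)) for i, d in enumerate(prefix_digits))
    # Geometric series: sum over j > k of (base-1)/base**j = first_term / (1 - ratio)
    first_term = Fraction(base - 1, base**(k + 1))
    tail = first_term / (1 - Fraction(1, base))
    return prefix + tail

print(value_with_recurring_tail([7, 4]))       # 0.74999... in base 10
print(value_with_recurring_tail([0], base=2))  # 0.0111... in base 2
```

The tail always works out to one unit in the last prefix position, which is why the recurring form and the terminating form name the same number.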
You seem very confused as to the distinction between what numbers are and how we can represent them. All I can say is, these matters have been well thought out, and you’d profit by reading as much as you can on the subject and by trying to avoid getting too caught up in your preconceptions.
I could almost convince myself that you know something I don’t about the way calculators work, but after the 12-year-old comment by “best experts” was never backed up by anything, I had to jump ship. Where are you pulling this stuff?
I completely don’t understand this article, and I’ve been a (rather good) software developer for 10 years. Calculators can’t add 200 + 200? What? Huh? I don’t get it.
Their processors are also not using lookup tables. Long ago, in the ’70s, there was a processor that did that, but it had too many limitations.
I have no idea what the hell you’re talking about here.
Also why the fuck is my email address required? Why do weblogs do that...
To reduce the number of fake accounts (and therefore trolling and spam).
Anonymous posting (i.e., without a verified email address) is allowed, but can be moderated separately, and more stringently than non-anonymous accounts.
Also, if you only have one email address and set up an account and then proceed to flame/troll-bait the blog, your account (and therefore email address) can be blocked… and you will no longer be welcome to post except by going through the anonymous channel (which, as I mentioned before, will be more stringently checked).
In a perfect world, i.e., one that did not contain trolls or flamers… such as once existed on ye ancient olde usenet (honest, there was a time when it wasn’t full of trolls!), such tactics were not required…
Accounts were not even required on OvercomingBias. From original post:
This is one of the threads that were imported to lesswrong. (Perhaps we do not need to respond to trollish questions that were posted over 2 years ago. :P)
Maybe so, but the reason would have been similar… and 2008 isn’t so olde-dayes ago that accounts were unheard of on blogs. It’s only 2 years. My blog required accounts-for-posting back then ;)
You didn’t need an account. You didn’t need to verify anything. arglebargle@floodlebock.com would have worked.
This old post led me to an interesting question: will AI find itself in the position of our fictional philosophers of addition? The basic four functions of arithmetic are so fundamental to the operation of the digital computer that an intelligence built on digital circuitry might well have no idea of how it adds numbers together (unless told by a computer scientist, of course).
Bog: You are correct. That is, you do not understand this article at all. Pay attention to the first word, “Suppose...”
We are not talking about how calculators are designed in reality. We are discussing how they are designed in a hypothetical world where the mechanism of arithmetic is not well-understood.
Did anyone else get so profoundly confused that they googled “Artificial Addition”? It was only when I was halfway through the bullet point list that it clicked that the whole post is a metaphor for common beliefs about AI. And that was on the second time reading; the first time I gave up before that point.
“Like shooting blindfolded at a distant target”
So long as you know where the target is within five feet, it doesn’t matter how small it is, how far away it is, whether or not you’re blindfolded, or whether or not you even know how to use a bow. You’ll hit it on a natural twenty. http://www.d20srd.org/srd/combat/combatStatistics.htm#attackRoll
Logical fallacy of generalization from fictional evidence.
Damn right. And the same goes for the oft-quoted “million-to-one chances crop up nine times out of ten”.
Thread necromancy:
It occurred to me that a real life example of this kind of thing is grammar. I don’t know what the grammatical rules are for which of the words “I” or “me” should be used when I refer to myself, but I can still use those words with perfect grammar in everyday life*. This may be a better example to use since it’s one that everyone can relate to.
*I do use a rule for working out whether I should say “Sarah and I” or “Sarah and me”, but that rule is just “use whichever one you would use if you were just talking about yourself”. Thinking about it now I can guess at the “I/me” rule, but there’s plenty of other grammar I have no idea about.
Can we get a link to the original thread?
It’s this thread itself. He’s commenting on the top paragraph of the original post. (It seems like thread necromancy at LW is actually very common. It may not be a good term given the negative connotations of necromancy for many people. Maybe thread cryonic revival?)
I’d expect here we’d give necromancy positive connotations. Most of the people here seem to be against death.
I thought it’s only thread necromancy if it moves it to the front page. This website doesn’t seem to work like that.
I hope it doesn’t work like that, because I posted most of my comments on old threads.
Just because we have a specific attitude about things doesn’t mean we need to go and use terminology that has pre-existing connotations. I don’t think for example that calling cryonics “technological necromancy” or “supercold lichdom” would be helpful to getting people to listen although both would be awesome names. However, Eliezer seems to disagree at least in regards to cryonics in certain narrow contexts. See his standard line when people ask about his cryonic medallion that it is a mark of his membership in the “Cult of the Severed Head.”
There’s actually a general trend in modern fantasy literature to see necromancy as less intrinsically evil. The most prominent example would be Garth Nix’s “Abhorsen” trilogy and the next most prominent would be Gail Martin’s “Chronicles of the Necromancer” series. Both have necromancers as the main protagonists. However, in this context, most of the cached thoughts about death still seem to be present. In both series, the good necromancers use their powers primarily to stop evil undead and help usher people in to accepting death and the afterlife. Someone should at some point write a fantasy novel in which there’s a good necromancer who brings people back as undead.
Posts only get put to the main page if Eliezer decides to do so (which he generally does for most high ranking posts).
I dunno—I reckon you might get increased interest from the SF/F crowd. :)
Funny. I was working on something an awful lot like that back in 2000. I wasn’t terribly good at writing back then, unfortunately.
...or would they...nahh.
There should be one on whatever page you’re viewing my comment in (unless you’re doing something unusual like reading this in an rss reader)
Still, here you go: link
McDermott’s old article, “Artificial Intelligence and Natural Stupidity” is a good reference for suggestively-named tokens and algorithms.
Someone needs to teach them how to count: {}, {{}}, {{},{{}}}, {{},{{}},{{},{{}}}}...
even less esoteric: |, ||, |||, ||||, |||||, ….
Then “X” + “Y” = “XY”. For example |||| + ||| = |||||||.
It turns out the difficulty in addition is the insight that ordinals are just an unfriendly representation. One needs a map between representations in order that the addition problem becomes trivial.
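The two notations above, and a map between them, can be sketched in a few lines. Everything here is illustrative: the function names are invented, and `frozenset` stands in for the nested-brace sets because Python sets can only contain hashable elements.

```python
def successor(n):
    """Von Neumann successor: n together with n itself as a new element."""
    return n | frozenset([n])

def ordinal(k):
    """Build the k-th set in the sequence {}, {{}}, {{},{{}}}, ..."""
    n = frozenset()
    for _ in range(k):
        n = successor(n)
    return n

def to_tally(n):
    """Map an ordinal to tally marks: it has as many elements as it counts."""
    return "|" * len(n)

def add_tally(x, y):
    """In tally notation, addition is just concatenation."""
    return x + y

four, three = ordinal(4), ordinal(3)
print(add_tally(to_tally(four), to_tally(three)))
```

Adding the nested sets directly would take a recursive definition, whereas after the translation the sum is a single string concatenation, which seems to be the point about unfriendly representations.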
Gah! Any field with a publishing requirement like that… I shudder.
And… is it me, or is this one of the stupidest discussion threads on this site?
“I don’t think we have to wait to scan a whole brain. Neural networks are just like the human brain, and you can train them to do things without knowing how they do them. We’ll create programs that will do arithmetic without we, our creators, ever understanding how they do arithmetic.”
This sort of anti-predicts the deep learning boom, but only sort of.
Fully connected networks didn’t scale effectively; researchers had to find (mostly principled, but some ad-hoc) network structures that were capable of more efficiently learning complex patterns.
Also, we’ve genuinely learned more about vision by realizing the effectiveness of convolutional neural nets.
And yet, the state of the art is to take a generalizable architecture and to scale it massively, not needing to know anything new about the domain, nor learning much new about it. So I do think Eliezer loses some Bayes points for his analogy here, as it applies to games and to language.
I’d be interested to hear whether it is the case that people were saying things like this about AGI two, three, four, five decades ago:
I’m guessing the answer is yes?
If so, who were these people? It’s looking like they are about to be vindicated. Maybe we can go give them a medal, and then learn from their thought processes.
I’ve pointed out the cases of Moravec (1997) and Shane Legg pre-DM (~2009) as saying pretty much exactly that and in the case of Legg, influencing his DM founding timeline. I am pretty sure that if you were able to go back and do a thorough survey of the connectionist literature and influenced people, you’d find more instances.
For example, yesterday I was collating my links on AI Dungeon and I ran into a 1989 text adventure talk by Doug Sharp mostly about his King of Chicago & simple world/narrative simulation approach to IF, where before discussing King, to my shock, he casually drops in Moravec’s 1988 Mind Children’s forecast for human-level compute in 2030 and compute as a prerequisite for “having this AI problem licked”, and notes
Well, I can’t disagree with that! It’s only 2021, and AI Dungeon and its imitators owe essentially nothing to the last 46 years of IF, and have to invent methodologies for neural text games from scratch. But it’s not very fun to just give up in 1986 and say you’ll sit around twiddling your thumbs for the next 33 years or so, waiting for Moore’s law to give you the most rudimentary NNs you can start experimenting with...
At first I was perplexed, thinking that Yudkowsky for some reason wants to use programs for AI, and not neural networks. This article showed me very clearly why you need to understand the general principle first, and not try to do anything now. Even if you can randomly find answers to a specific quadratic equation, it won’t solve even other quadratic equations, let alone cubic or any other problem in mathematics.