I don’t think AGI in a few decades is very farfetched at all. There’s a heckuvalot of neuroscience being done right now (the Society for Neuroscience has 40,000 members), and while it’s probably true that much of that research is concerned most directly with mere biological “implementation details” and not with “underlying algorithms” of intelligence, it is difficult for me to imagine that there will still be no significant insights into the AGI problem after 3 or 4 more decades of this amount of neuroscience research.
Of course there will be significant insights into the AGI problem over the coming decades—probably many of them. My point was that I don’t see AGI as hard because of a lack of insights; I see it as hard because it will require vast amounts of “ordinary” intellectual labor.
I’m having trouble understanding how exactly you think the AGI problem is different from any really hard math problem. Take P != NP, for instance the attempted proof that’s been making the rounds on various blogs. If you’ve skimmed any of the discussion you can see that even this attempted proof piggybacks on “vast amounts of ‘ordinary’ intellectual labor,” largely consisting of mapping out various complexity classes and their properties and relations. There’s probably been at least 30 years of complexity theory research required to make that proof attempt even possible.
I think you might be able to argue that even if we had an excellent theoretical model of an AGI, that the engineering effort required to actually implement it might be substantial and require several decades of work (e.g. Von Neumann architecture isn’t suitable for AGI implementation, so a great deal of computer engineering has to be done).
If this is your position, I think you might have a point, but I still don’t see how the effort is going to take 1 or 2 centuries. A century is a loooong time. A century ago humans barely had powered flight.
Take P != NP, for instance the attempted proof that’s been making the rounds on various blogs. If you’ve skimmed any of the discussion you can see that even this attempted proof piggybacks on “vast amounts of ‘ordinary’ intellectual labor,
By no means do I want to downplay the difficulty of P vs NP; all the same, I think we have different meanings of “vast” in mind.
The way I think about it is: think of all the intermediate levels of technological development that exist between what we have now and outright Singularity. I would only be half-joking if I said that we ought to have flying cars before we have AGI. There are of course more important examples of technologies that seem easier than AGI, but which themselves seem decades away. Repair of spinal cord injuries; artificial vision; useful quantum computers (or an understanding of their impossibility); cures for the numerous cancers; revival of cryonics patients; weather control. (Some of these, such as vision, are arguably sub-problems of AGI: problems that would have to be solved in the course of solving AGI.)
Actually, think of math problems if you like. Surely there are conjectures in existence now—probably some of them already famous—that will take mathematicians more than a century from now to prove (assuming no Singularity or intelligence enhancement before then). Is AGI significantly easier than the hardest math problems around now? This isn’t my impression—indeed, it looks to me more analogous to problems that are considered “hopeless”, like the “problem” of classifying all groups, say.
By no means do I want to downplay the difficulty of P vs NP; all the same, I think we have different meanings of “vast” in mind.
I hate to go all existence proofy on you, but we have an existence proof of a general intelligence—accidentally sneezed out by natural selection, no less, which has severe trouble building freely rotating wheels—and no existence proof of a proof of P != NP. I don’t know much about the field, but from what I’ve heard, I wouldn’t be too surprised if proving P != NP is harder than building FAI for the unaided human mind. I wonder if Scott Aaronson would agree with me on that, even though neither of us understand the other’s field? (I just wrote him an email and asked, actually; and this time remembered not to say my opinion before asking for his.)
After glancing over a 100-page proof that claimed to solve the biggest problem in computer science, Scott Aaronson bet his house that it was wrong. Why?
What I find interesting is that the pattern nearly always goes the other way: you’re more likely to think that a celebrated problem you understand well is harder than one you don’t know much about. It says a lot about both Eliezer’s and Scott’s rationality that they think of the other guy’s hard problems as even harder than their own.
As for existence proof of a general intelligence, that doesn’t prove anything about how difficult it is, for anthropic reasons. For all we know 10^20 evolutions each in 10^50 universes that would in principle allow intelligent life might on average result in 1 general intelligence actually evolving.
Of course, if you buy the self-indication assumption (which I do not) or various other related principles you’ll get an update that compels belief in quite frequent life (constrained by the Fermi paradox and a few other things).
More relevantly, approaches like Robin’s Hard Step analysis and convergent evolution (e.g. octopus/bird intelligence) can rule out substantial portions of “crazy-hard evolution of intelligence” hypothesis-space. And we know that human intelligence isn’t so unstable as to see it being regularly lost in isolated populations, as we might expect given ludicrous anthropic selection effects.
We can make better guesses than that: evolution coughed up quite a few things that would be considered pretty damn intelligent for a computer program, like ravens, octopuses, rats or dolphins.
Not independently (not even cephalopods, at least not completely). And we have no way of estimating the difference in difficulty between that level of intelligence and general intelligence other than evolutionary history (which for anthropic reasons could be highly atypical) and similarity in makeup. But we already know that our type of nervous system is capable of supporting general intelligence, whereas most rat-level intelligences might hit fundamental architectural problems first.
We can always estimate, even with very little knowledge; we’ll just have huge error margins. I agree it is possible that “For all we know 10^20 evolutions each in 10^50 universes that would in principle allow intelligent life might on average result in 1 general intelligence actually evolving”; I would just bet on a much higher probability than that, though I agree with the principle.
The evidence that pretty smart animals exist in distant branches of the tree of life, and in different environments is weak evidence that intelligence is “pretty accessible” in evolution’s search space. It’s stronger evidence than the mere fact that we, intelligent beings, exist.
Intelligence, sure. The original point was that our existence doesn’t put a meaningful upper bound on the difficulty of general intelligence. Cephalopods are good evidence that given whatever rudimentary precursors of a nervous system our common ancestor had (I know it had differentiated cells, but I’m not sure what else. I think it didn’t really have organs like higher animals, let alone anything that really qualified as a nervous system) cephalopod-level intelligence is comparatively easy, having evolved independently two times. It doesn’t say anything about how much more difficult general intelligence is compared to cephalopod intelligence, nor whether whatever precursors to a nervous system our common ancestor had were unusually conducive to intelligence compared to the average of similar complex evolved beings.
If I had to guess I would assume cephalopod level intelligence within our galaxy and a number of general intelligences somewhere outside our past light cone. But that’s because I already think of general intelligence as not fantastically difficult independently of the relevance of the existence proof.
Hox genes suggest that they both had a modular body plan of some sort. Triploblasty implies some complexity (the least complex triploblastic organism today is a flatworm).
I’d be very surprised if the most recent common ancestor didn’t have neurons similar to most neurons today, as I’ve had a hard time finding out the differences between the two. A basic introduction to nervous systems suggests they are very similar.
Well, I for one strongly hope that we resolve whether P = NP before we have AI, since a large part of my estimate for the probability of AI being able to go FOOM is based on how much of the complexity hierarchy collapses. If there’s heavy collapse, AI going FOOM is much more plausible.
I don’t know much about the field, but from what I’ve heard, I wouldn’t be too surprised if proving P != NP is harder than building FAI for the unaided human mind
Well actually, after thinking about it, I’m not sure I would either. There is something special about P vs NP, from what I understand, and I didn’t even mean to imply otherwise above; I was only disputing the idea that “vast amounts” of work had already gone into the problem, for my definition of “vast”.
Scott Aaronson’s view on this doesn’t move my opinion much (despite his large contribution to my beliefs about P vs NP), since I think he overestimates the difficulty of AGI (see your Bloggingheads diavlog with him).
I don’t know much about the field, but from what I’ve heard, I wouldn’t be too surprised if proving P != NP is harder than building FAI for the unaided human mind.
Awesome! Be sure to let us know what he thinks. Sounds unbelievable to me though, but what do I know.
A ‘few clues’ sounds like a gross underestimation. It is the only working example, so it certainly contains all the clues, not just a few. The question of course is how much of a shortcut is possible. The answer to date seems to be: none to slim.
I agree engineers reverse engineering will succeed way ahead of full emulation, that wasn’t my point.
If information is not extracted and used, it doesn’t qualify as being a “clue”.
The question of course is how much of a shortcut is possible.
The answer to date seems to be: none to slim.
The search oracles and stockmarketbot makers have paid precious little attention to the brain. They are based on engineering principles instead.
I agree engineers reverse engineering will succeed way ahead of full emulation,
Most engineers spend very little time on reverse-engineering nature. There is a little “bioinspiration”, but inspiration is a bit different from wholesale copying.
but I still don’t see how the effort is going to take 1 or 2 centuries. A century is a loooong time.
I think the following quote is illustrative of the problems facing the field:
After [David Marr] joined us, our team became the most famous vision group in the world, but the one with the fewest results. His idea was a disaster. The edge finders they have now using his theories, as far as I can see, are slightly worse than the ones we had just before taking him on. We’ve lost twenty years.
-Marvin Minsky, quoted in “AI” by Daniel Crevier.
Some notes and interpretation of this comment:
Most vision researchers, if asked who is the most important contributor to their field, would probably answer “David Marr”. He set the direction for subsequent research in the field; students in introductory vision classes read his papers first.
Edge detection is a tiny part of vision, and vision is a tiny part of intelligence, but at least in Minsky’s view, no progress (or reverse progress) was achieved in twenty years of research by the leading lights of the field.
There is no standard method for evaluating edge detector algorithms, so it is essentially impossible to measure progress in any rigorous way.
I think this kind of observation justifies AI-timeframes on the order of centuries.
Edge detection is rather trivial. Visual recognition however is not, and there certainly are benchmarks and comparable results in that field. Have you browsed the recent pubs of Poggio et al at MIT vision lab? There is lots of recent progress, with results matching human levels for quick recognition tasks.
Also, vision is not a tiny part of intelligence. It’s the single largest functional component of the cortex, by far. The cortex uses the same essential low-level optimization algorithm everywhere, so understanding vision at the detailed level is a good step towards understanding the whole thing.
And finally and most relevant for AGI, the higher visual regions also give us the capacity for visualization and are critical for higher creative intelligence. Literally all scientific discovery and progress depends on this system.
“visualization is the key to enlightenment” and all that
It’s only trivial if you define an “edge” in a trivial way, e.g. as a set of points where the intensity gradient is greater than a certain threshold. This kind of definition has little use: given a picture of a tree trunk, this definition will indicate many edges corresponding to the ridges and corrugations of the bark, and will not highlight the meaningful edge between the trunk and the background.
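To make the tree-trunk objection concrete, here is a minimal sketch of the “trivial” definition: mark an edge wherever the intensity difference between neighbouring pixels exceeds a threshold. The scanline values are made up for illustration (a bright background at 200, and a trunk whose bark alternates between 60 and 120); nothing here comes from a real image or a real vision library.

```python
def threshold_edges(row, threshold):
    """Indices i where |row[i + 1] - row[i]| exceeds the threshold."""
    return [i for i in range(len(row) - 1)
            if abs(row[i + 1] - row[i]) > threshold]

# Hypothetical scanline across a tree trunk (values are assumptions).
background = [200] * 5
trunk = [60, 120] * 4          # bark ridges alternate dark/light
row = background + trunk + background

edges = threshold_edges(row, threshold=50)
boundary_edges = [4, 12]       # background/trunk and trunk/background

print(edges)                   # every bark ridge fires, not just the outline
print(len(edges), "edges, of which", len(boundary_edges), "are the trunk outline")
```

On this toy input the detector reports nine edges, and only two of them are the boundary between trunk and background, which is the failure mode described above.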
I don’t believe that there is much real progress recently in vision. I think the state of the art is well illustrated by the “racist” HP web camera that detects white faces but not black faces.
Also, vision is not a tiny part of intelligence [...] The cortex uses the same essential low-level optimization algorithm everywhere,
I actually agree with you about this, but I think most people on LW would disagree.
Whether you are talking about Canny edge filters or the Gabor-like edge detection that V1 self-organizes into, they are all still relatively simple: trivial compared to AGI. Trivial as in something you could code in a few hours for the screen filter system of a modern game render engine.
The particular problem you point out with the tree trunk is a scale problem and is easily handled in any good vision system.
An edge detection filter is just a building block; it’s not the complete system.
In the HVS (human visual system), initial edge preprocessing is done in the retina itself, which essentially applies on-center, off-surround Gaussian filters (similar to low-pass filters in Photoshop). The output of the retina is thus essentially a multi-resolution image set, similar to a wavelet decomposition. The image output at this stage becomes a series of edge differences (local gradients), but at numerous spatial scales.
The high frequency edges such as the ridges and corrugations of the bark are cleanly separated from the more important low frequency edges separating the tree trunk from the background. V1 then detects edge orientations at these various scales, and higher layers start recognizing increasingly complex statistical patterns of edges across larger fields of view.
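A rough sketch of that scale separation, under heavy simplifying assumptions: a one-dimensional scanline instead of an image, and a plain moving average standing in for the Gaussian center-surround filtering described above. The scanline values are invented, and the bark ripple is deliberately given a period that the averaging window cancels exactly, so this illustrates the idea rather than modelling the retina or V1.

```python
def box_smooth(row, width):
    """Moving average over `width` samples (valid positions only)."""
    return [sum(row[i:i + width]) / width
            for i in range(len(row) - width + 1)]

def gradient_edges(row, threshold):
    """Indices i where |row[i + 1] - row[i]| exceeds the threshold."""
    return [i for i in range(len(row) - 1)
            if abs(row[i + 1] - row[i]) > threshold]

# Hypothetical scanline: background 200, trunk with bark alternating 60/120.
row = [200] * 5 + [60, 120] * 4 + [200] * 5

fine = gradient_edges(row, threshold=20)                   # full resolution
coarse = gradient_edges(box_smooth(row, 2), threshold=20)  # coarser scale

print("fine scale:  ", fine)    # bark ridges and trunk outline together
print("coarse scale:", coarse)  # two clusters, one per side of the trunk
```

At the fine scale every bark ridge responds; after smoothing, the ripple averages out to a flat 90 and only the two trunk boundaries remain (indices refer to positions in the smoothed row).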
Whether there is much real progress recently in computer vision is relative to one’s expectations, but the current state of the art in research systems at least is far beyond your simplistic assessment. I have a layman’s overview of HVS here. If you really want to know about the current state of the art in research, read some recent papers from a place like Poggio’s lab at MIT.
In the product space, the HP web camera example is also very far from the state of the art; I’m surprised that you posted that.
There is free eye tracking software you can get (running on your PC) that can use your web cam to track where your eyes are currently focused in real time. That’s still not even the state of the art in the product space—that would probably be the systems used in the more expensive robots, and of course that lags the research state of the art.
AIXI’s contribution is more philosophical than practical. I find a depressing over-emphasis on Bayesian probability theory here as the ‘math’ of choice vs computational complexity theory, which is the proper domain.
The most likely outcome of a math breakthrough will be some rough lower and/or upper bounds on the shape of the intelligence over space/time complexity function. And right now the most likely bet seems to be that the brain is pretty well optimized at the circuit level, and that the best we can do is reverse engineer it.
EY and the math folk here reach a very different conclusion, but I have yet to find his well-considered justification. I suspect that the major reason the mainstream AI community doesn’t subscribe to SIAI’s math magic bullet theory is that they hold the same position outlined above, i.e. that when we get the math theorems, all they will show is what we already suspect: human-level intelligence requires X memory bits and Y bit ops/second, where X and Y are roughly close to brain levels.
This, if true, kills the entirety of the software recursive self-improvement theory. The best that software can do is approach the theoretical optimum complexity class for the problem, and then after that point all one can do is fix it into hardware for a further large constant gain.
right now the most likely bet seems to be that the brain is pretty well optimized at the circuit level, and that the best we can do is reverse engineer it.
That seems like crazy talk to me. The brain is not optimal, neither its hardware nor its software, and not by a looooong way! Computers have already steam-rollered its memory and arithmetic units, and that happened before we even had nanotechnology computing components. The rest of the brain seems likely to follow.
Edit: removed a faulty argument at the end pointed out by wedrifid.
I am talking about optimality for AGI in particular with respect to circuit complexity, with the typical assumptions that a synapse is vaguely equivalent to a transistor, maybe ten transistors at most. If you compare on that level, the brain looks extremely efficient given how slow the neurons are. Does this make sense?
The brain’s circuits have around 10^15 transistor equivalents, and a speed of 10^3 cycles per second. 10^18 transistor cycles / second
A typical modern CPU has 10^9 transistors, with a speed of 10^9 cycles per second. 10^18 transistor cycles / second
Our CPUs’ strength is not their circuit architecture or software; it’s the raw speed of CMOS, a million-fold substrate advantage. The learning algorithm, the way in which the cortex rewires in response to input data, appears to be a pretty effective universal learning algorithm.
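The back-of-the-envelope comparison above can be checked in a few lines. The figures are the order-of-magnitude estimates quoted in this comment (one synapse treated as roughly one transistor equivalent), not measurements, and the variable names are just for illustration.

```python
# Order-of-magnitude figures as quoted in the comment (assumptions, not data).
brain_elements = 10**15   # transistor equivalents (synapses)
brain_rate_hz = 10**3     # cycles per second
cpu_elements = 10**9      # transistors
cpu_rate_hz = 10**9       # cycles per second

brain_throughput = brain_elements * brain_rate_hz   # transistor-cycles / s
cpu_throughput = cpu_elements * cpu_rate_hz         # transistor-cycles / s

speed_ratio = cpu_rate_hz // brain_rate_hz          # substrate speed advantage
element_ratio = brain_elements // cpu_elements      # brain's parallelism advantage

print(f"brain: {brain_throughput:.0e}, cpu: {cpu_throughput:.0e}")
print(f"CPU is {speed_ratio:,}x faster per element; "
      f"brain has {element_ratio:,}x more elements")
```

On these numbers the two come out level at 10^18 transistor-cycles per second: the CPU's million-fold speed advantage is exactly offset by the brain having a million times more elements.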
The brain’s architecture is a joke. It is as though a telecoms engineer decided to connect a whole city’s worth of people together by running cables directly between any two people who wanted to have a chat. It hasn’t even gone fully digital yet—so things can’t easily be copied or backed up. The brain is just awful—no wonder human cognition is such a mess.
Then some questions: How long would Moore’s law have to continue into the future with no success in AGI for that to show that the brain is well optimized for AGI at the circuit level?
I’ve made some attempts to show rough bounds on the brain’s efficiency; are you aware of some other approach or estimate?
Then some questions: How long would Moore’s law have to continue into the future with no success in AGI for that to show that the brain is well optimized for AGI at the circuit level?
Most seem to think the problem is mostly down to software—and that supercomputer hardware is enough today—in which case more hardware would not necessarily help very much. The success or failure of adding more hardware might give an indication of how hard it is to find the target of intelligence in the search space. It would not throw much light on the issue of how optimally “designed” the brain is. So: your question is a curious one.
The success or failure of adding more hardware might give an indication of how hard it is to find the target of intelligence in the search space
For every computational system and algorithm, there is a minimum level of space-time complexity in which this system can be encoded. As of yet we don’t know how close the brain is to the minimum space-time complexity design for an intelligence of similar capability.
Let’s make the question more specific: what’s the minimum bit representation of a human-equivalent mind? If you think the brain is far off that, how do you justify that?
Of course more hardware helps: it allows you to search through the phase space faster. Keep in mind the enormity of the training time.
I happen to believe the problem is ‘mostly down to software’, but I don’t see that as a majority view; the Moravec/Kurzweil view that we need brain-level hardware (within an order of magnitude or so) seems to be the majority position at this point.
We need brain-level hardware (within an order of magnitude or so) if machines are going to be cost-competitive with humans. If you just want a supercomputer mind, then no problem.
I don’t think Moravec or Kurzweil ever claimed it was mostly down to hardware. Moravec’s charts are of hardware capability—but that was mainly because you can easily measure that.
We need brain-level hardware (within an order of magnitude or so) if machines are going to be cost-competitive with humans.
I don’t see why that is. If you were talking about ems, then the threshold should be 1:1 realtime. Otherwise, for most problems that we know how to program a computer to do, the computer is much faster than humans even at existing speeds. Why do you expect that a computer that’s, say, 3x slower than a human (well within an order of magnitude) would be cost-competitive with humans while one that’s 10^4 times slower wouldn’t?
Evidently there are domains where computers beat humans today—but if you look at what has to happen for machines to take the jobs of most human workers, they will need bigger and cheaper brains to do that. “Within an order of magnitude or so” seems like a reasonable ballpark figure to me. If you are looking for more details about why I think that, they are not available at this time.
I suspect that the controlling reason why you think that is that you assume it takes human-like hardware to accomplish human-like tasks, and greatly underestimate the advantages of a mind being designed rather than evolved.
Let’s make the question more specific: what’s the minimum bit representation of a human-equivalent mind?
Way off. Let’s see… I would bet at even odds that it is 4 or more orders of magnitude off optimal.
If you think the brain is far off that, how do you justify that?
We have approximately one hundred billion neurons each and roughly the same number of glial cells (more of the latter if we are smart!). Each of those includes a full copy of our DNA, which is itself not exactly optimally compressed.
Way off. Let’s see… I would bet at even odds that it is 4 or more orders of magnitude off optimal.
You didn’t answer my question: what is your guess at the minimum bit representation of a human-equivalent mind?
You didn’t use the typical methodology of measuring the brain’s storage, nor did you provide another.
I wasn’t talking about molecular-level optimization. I started with the typical assumption that synapses represent a few bits, that the human brain has around 100TB to 1PB of data/circuitry, and so on (see The Singularity Is Near).
So you say the human brain algorithmic representation is off by 4 orders of magnitude or more—you are saying that you think a human equivalent mind can be represented in 10 to 100GB of data/circuitry?
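For reference, the arithmetic behind that 10 to 100GB figure, taking the 100TB to 1PB estimate and the claimed four orders of magnitude at face value (decimal units assumed; the names are illustrative only):

```python
GB = 10**9
TB = 10**12
PB = 10**15

brain_low, brain_high = 100 * TB, 1 * PB   # storage estimate quoted above
claimed_factor = 10**4                     # "4 or more orders of magnitude"

compressed_low = brain_low // claimed_factor
compressed_high = brain_high // claimed_factor

print(compressed_low // GB, "GB to", compressed_high // GB, "GB")
```

A larger claimed factor would push both ends of the range down proportionally.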
If so, why did evolution not find that by now? It has had plenty of time to compress at the circuit level. In fact, we actually know that the brain does perform provably optimal compression on its input data in a couple of domains: see V1 and its evolution into Gabor-like edge feature detection.
Evolution has had plenty of time to find a well-optimized cellular machinery based on DNA, plenty of time to find a well-optimized electro-chemical computing machinery based on top of that, and plenty of time to find well-optimized circuits within that space.
Even insects are extremely well-optimized at the circuit level—given their neuron/synapse counts, we have no evidence whatsoever to believe that vastly simpler circuits exist that can perform the same functionality.
When we have used evolutionary exploration algorithms to design circuits natively, given enough time we see similar complex, messy, but near optimal designs, and this is a general trend.
Are you saying that you are counting every copy of the DNA as information that contributes to the total amount? If so, I say that’s invalid. What if each cell were remotely controlled from a central server containing the DNA information? I can’t see that we’d count the DNA for each cell then—yet it is no different really.
I agree that the number of cells is relevant, because there will be a lot of information in the structure of an adult brain that has come from the environment, rather than just from the DNA, and more cells would seem to imply more machinery in which to put it.
Are you saying that you are counting every copy of the DNA as information that contributes to the total amount? If so, I say that’s invalid. What if each cell were remotely controlled from a central server containing the DNA information? I can’t see that we’d count the DNA for each cell then—yet it is no different really.
I thought we were talking about the efficiency of the human brain. Wasn’t that the whole point? If every cell is remotely controlled from a central server then well, that’d be whole different algorithm. In fact, we could probably scrap the brain and just run the central server.
Genes actually do matter in the functioning of neurons. Chemical additions (eg. ethanol) and changes in the environment (eg. hypoxia) can influence gene expression in cells in the brain, impacting on their function.
I suggest the brain is a ridiculously inefficient contraption thrown together by the building blocks that were practical for production from DNA representations and suitable for the kind of environments animals tended to be exposed to. We should be shocked to find that it also manages to be anywhere near optimal for general intelligence. Among other things it would suggest that evolution packed the wrong lunch.
Okay, I may have misunderstood you. It looks like there is some common ground between us on the issue of inefficiency. I think the brain would probably be inefficient as well as it has to be thrown together by the very specific kind of process of evolution—which is optimized for building things without needing look-ahead intelligence rather than achieving the most efficient results.
Then some questions: How long would Moore’s law have to continue into the future with no success in AGI for that to show that the brain is well optimized for AGI at the circuit level?
A Sperm Whale and a bowl of Petunias.
My first impulse was to answer that Moore’s law could go forever and never produce success in AGI, since ‘AGI’ isn’t just what you get when you put enough computronium together for it to reach critical mass. But even given no improvements in understanding we could very well arrive at AGI just through ridiculous amounts of brute force. In fact, given enough space and time, randomised initial positions and possibly a steady introduction of negentropy, we could produce an AGI in Conway’s Life.
I’ve made some attempts to show rough bounds on the brain’s efficiency; are you aware of some other approach or estimate?
You could find some rough bounds by seeing how many parts of a human brain you can cut out without changing IQ. Trivial little things like, you know, the pre-frontal cortex.
You are just talking around my questions, so let me make it more concrete. An important task of any AGI is higher-level sensor data interpretation, i.e. seeing. We have an example system in the human brain: the human visual system, which is currently leaps and bounds beyond the state of the art in machine vision (although the latter is making progress towards the former through reverse engineering).
So machine vision is a subtask of AGI. What is the minimal computational complexity of human-level vision? This is a concrete computer science problem. It has a concrete answer—not “sperm whale and petunia” nonsense.
Until someone makes a system better than HVS, or proves some complexity bounds, we don’t know how optimal HVS is for this problem, but we also have no reason to believe that it is orders of magnitude off from the theoretic optimum.
Good quality general-purpose data-compression would “break the back” of the task of building synthetic intelligent agents, and that’s a “simple” math problem, as I explain on: http://timtyler.org/sequence_prediction/
At least it can be stated very concisely. Solutions so far haven’t been very simple—but the brain’s architecture offers considerable hope for a relatively simple solution.
I hate to go all existence proofy on you, but we have an existence proof of a general intelligence (accidentally sneezed out by natural selection, no less, which has severe trouble building freely rotating wheels) and no existence proof of a proof of P != NP. I don’t know much about the field, but from what I’ve heard, I wouldn’t be too surprised if proving P != NP is harder than building FAI for the unaided human mind. I wonder if Scott Aaronson would agree with me on that, even though neither of us understands the other’s field? (I just wrote him an email and asked, actually; and this time remembered not to say my opinion before asking for his.)
Scott says that he thinks P != NP is easier / likely to come first.
Here is an interview with Scott Aaronson:
It’s interesting that you both seem to think that your problem is easier; I wonder if there’s a general pattern there.
What I find interesting is that the pattern nearly always goes the other way: you’re more likely to think that a celebrated problem you understand well is harder than one you don’t know much about. It says a lot about both Eliezer’s and Scott’s rationality that they think of the other guy’s hard problems as even harder than their own.
Obviously not. That would be a proof of P != NP.
As for the existence proof of a general intelligence, that doesn’t prove anything about how difficult it is, for anthropic reasons. For all we know, 10^20 evolutions each in 10^50 universes that would in principle allow intelligent life might on average result in 1 general intelligence actually evolving.
Of course, if you buy the self-indication assumption (which I do not) or various other related principles you’ll get an update that compels belief in quite frequent life (constrained by the Fermi paradox and a few other things).
More relevantly, approaches like Robin’s Hard Step analysis and convergent evolution (e.g. octopus/bird intelligence) can rule out substantial portions of “crazy-hard evolution of intelligence” hypothesis-space. And we know that human intelligence isn’t so unstable as to see it being regularly lost in isolated populations, as we might expect given ludicrous anthropic selection effects.
I looked at Nick’s:
http://www.anthropic-principle.com/preprints/olum/sia.pdf
I don’t get it. Anyone know what is supposed to be wrong with the SIA?
We can make better guesses than that: evolution coughed up quite a few things that would be considered pretty damn intelligent for a computer program, like ravens, octopuses, rats or dolphins.
Not independently (not even cephalopods, at least not completely). And we have no way of estimating the difference in difficulty between that level of intelligence and general intelligence other than evolutionary history (which for anthropic reasons could be highly atypical) and similarity in makeup. But we already know that our type of nervous system is capable of supporting general intelligence; most rat-level intelligences might hit fundamental architectural problems first.
We can always estimate, even with very little knowledge; we’ll just have huge error margins. I agree it is possible that “For all we know 10^20 evolutions each in 10^50 universes that would in principle allow intelligent life might on average result in 1 general intelligence actually evolving”; I would just bet on a much higher probability than that, though I agree with the principle.
The evidence that pretty smart animals exist in distant branches of the tree of life, and in different environments is weak evidence that intelligence is “pretty accessible” in evolution’s search space. It’s stronger evidence than the mere fact that we, intelligent beings, exist.
Intelligence, sure. The original point was that our existence doesn’t put a meaningful upper bound on the difficulty of general intelligence. Cephalopods are good evidence that, given whatever rudimentary precursors of a nervous system our common ancestor had (I know it had differentiated cells, but I’m not sure what else. I think it didn’t really have organs like higher animals, let alone anything that really qualified as a nervous system), cephalopod-level intelligence is comparatively easy, having evolved independently two times. It doesn’t say anything about how much more difficult general intelligence is compared to cephalopod intelligence, nor whether whatever precursors to a nervous system our common ancestor had were unusually conducive to intelligence compared to the average of similarly complex evolved beings.
If I had to guess I would assume cephalopod level intelligence within our galaxy and a number of general intelligences somewhere outside our past light cone. But that’s because I already think of general intelligence as not fantastically difficult independently of the relevance of the existence proof.
This page on the history of invertebrates suggests that our common ancestors were bilaterally symmetric and triploblastic, and had Hox genes.
Hox genes suggest that they both had a modular body plan of some sort. Triploblasty implies some complexity (the least complex triploblastic organism today is a flatworm).
I’d be very surprised if the most recent common ancestor didn’t have neurons similar to most neurons today, as I’ve had a hard time finding out the differences between the two. A basic introduction to nervous systems suggests they are very similar.
Well, I for one strongly hope that we resolve whether P = NP before we have AI, since a large part of my estimate for the probability of AI being able to go FOOM is based on how much of the complexity hierarchy collapses. If there’s heavy collapse, AI going FOOM is much more plausible.
Well actually, after thinking about it, I’m not sure I would either. There is something special about P vs NP, from what I understand, and I didn’t even mean to imply otherwise above; I was only disputing the idea that “vast amounts” of work had already gone into the problem, for my definition of “vast”.
Scott Aaronson’s view on this doesn’t move my opinion much (despite his large contribution to my beliefs about P vs NP), since I think he overestimates the difficulty of AGI (see your Bloggingheads diavlog with him).
Awesome! Be sure to let us know what he thinks. Sounds unbelievable to me though, but what do I know.
Why is AGI a math problem? What is abstract about it?
We don’t need math proofs to know if AGI is possible. It is, the brain is living proof.
We don’t need math proofs to know how to build AGI—we can reverse engineer the brain.
There may be a few clues in there—but engineers are likely to get to the goal looong before the emulators arrive—and engineers are math-friendly.
A ‘few clues’ sounds like a gross underestimation. It is the only working example, so it certainly contains all the clues, not just a few. The question of course is how much of a shortcut is possible. The answer to date seems to be: none to slim.
I agree engineers reverse engineering will succeed way ahead of full emulation; that wasn’t my point.
If information is not extracted and used, it doesn’t qualify as being a “clue”.
The search oracles and stockmarketbot makers have paid precious little attention to the brain. They are based on engineering principles instead.
Most engineers spend very little time on reverse-engineering nature. There is a little “bioinspiration”, but inspiration is a bit different from wholesale copying.
This is a good part of the guts of it. That bit of it is a math problem:
http://timtyler.org/sequence_prediction/
I think the following quote is illustrative of the problems facing the field:
-Marvin Minsky, quoted in “AI” by Daniel Crevier.
Some notes and interpretation of this comment:
Most vision researchers, if asked who is the most important contributor to their field, would probably answer “David Marr”. He set the direction for subsequent research in the field; students in introductory vision classes read his papers first.
Edge detection is a tiny part of vision, and vision is a tiny part of intelligence, but at least in Minsky’s view, no progress (or reverse progress) was achieved in twenty years of research by the leading lights of the field.
There is no standard method for evaluating edge detector algorithms, so it is essentially impossible to measure progress in any rigorous way.
I think this kind of observation justifies AI-timeframes on the order of centuries.
Edge detection is rather trivial. Visual recognition however is not, and there certainly are benchmarks and comparable results in that field. Have you browsed the recent pubs of Poggio et al at MIT vision lab? There is lots of recent progress, with results matching human levels for quick recognition tasks.
Also, vision is not a tiny part of intelligence. It’s the single largest functional component of the cortex, by far. The cortex uses the same essential low-level optimization algorithm everywhere, so understanding vision at the detailed level is a good step towards understanding the whole thing.
And finally and most relevant for AGI, the higher visual regions also give us the capacity for visualization and are critical for higher creative intelligence. Literally all scientific discovery and progress depends on this system.
“visualization is the key to enlightenment” and all that
the visual system
It’s only trivial if you define an “edge” in a trivial way, e.g. as a set of points where the intensity gradient is greater than a certain threshold. This kind of definition has little use: given a picture of a tree trunk, this definition will indicate many edges corresponding to the ridges and corrugations of the bark, and will not highlight the meaningful edge between the trunk and the background.
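To make that “trivial” definition concrete, here is a minimal sketch in Python. It assumes a single row of pixel intensities; the helper name `threshold_edges` and the toy numbers are invented for illustration and do not come from any vision library. In this made-up scanline the bark ridges have more contrast than the trunk/background boundary, so no single threshold picks out the boundary alone.

```python
def threshold_edges(row, threshold):
    """Return each index i where |row[i+1] - row[i]| exceeds the threshold."""
    return [i for i in range(len(row) - 1)
            if abs(row[i + 1] - row[i]) > threshold]

# Toy scanline (all values invented): a flat background followed by a
# "trunk" whose bark alternates between light and dark ridges.
background = [120] * 6
trunk = [100, 40] * 3
row = background + trunk

low = threshold_edges(row, threshold=10)   # boundary AND every ridge
high = threshold_edges(row, threshold=50)  # ridges only, boundary lost

print(low)
print(high)
```

With the low threshold every ridge is reported alongside the trunk boundary (index 5); with the high threshold only the ridges survive and the boundary disappears.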
I don’t believe that there is much real progress recently in vision. I think the state of the art is well illustrated by the “racist” HP web camera that detects white faces but not black faces.
I actually agree with you about this, but I think most people on LW would disagree.
Whether you are talking about Canny edge filters or Gabor-like edge detection more similar to what V1 self-organizes into, they are all still relatively simple: trivial compared to AGI. Trivial as in something you code in a few hours for your screen filter system in a modern game render engine.
The particular problem you point out with the tree trunk is a scale problem and is easily handled in any good vision system.
An edge detection filter is just a building block; it’s not the complete system.
In the HVS (human visual system), initial edge preprocessing is done in the retina itself, which essentially applies on-center, off-surround Gaussian filters (similar to low-pass filters in Photoshop). The output of the retina is thus essentially a multi-resolution image set, similar to a wavelet decomposition. The image output at this stage becomes a series of edge differences (local gradients), but at numerous spatial scales.
The high frequency edges such as the ridges and corrugations of the bark are cleanly separated from the more important low frequency edges separating the tree trunk from the background. V1 then detects edge orientations at these various scales, and higher layers start recognizing increasingly complex statistical patterns of edges across larger fields of view.
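A rough sketch of that scale separation, in Python, under heavy simplifying assumptions: one dimension instead of two, a box blur standing in for the Gaussian, and made-up intensities. The helper names `box_blur` and `gradients` are hypothetical. It is meant only to show that smoothing at a coarse scale before differencing suppresses the bark ridges while keeping the trunk boundary; it is not a model of the retina or V1.

```python
def box_blur(row, width):
    """Average every run of `width` consecutive samples (valid region only)."""
    return [sum(row[i:i + width]) / width
            for i in range(len(row) - width + 1)]

def gradients(row):
    """Local differences between neighbouring samples."""
    return [row[i + 1] - row[i] for i in range(len(row) - 1)]

# Toy scanline (invented values): flat background, then a trunk whose
# bark alternates light/dark with a period of two samples.
row = [120] * 8 + [100, 40] * 4

fine = gradients(row)                  # finest scale: raw differences
coarse = gradients(box_blur(row, 4))   # coarse scale: blur over two bark periods

print(fine)
print(coarse)
```

At the fine scale the strongest responses are the bark ridges. At the coarse scale the ridges average out to zero and the only nonzero responses sit around the trunk boundary, all with the same (darkening) sign.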
Whether there is much real progress recently in computer vision is relative to one’s expectations, but the current state of the art in research systems at least is far beyond your simplistic assessment. I have a layman’s overview of HVS here. If you really want to know about the current state of the art in research, read some recent papers from a place like Poggio’s lab at MIT.
In the product space, the HP web camera example is also very far from the state of the art; I’m surprised that you posted that.
There is free eye tracking software you can get (running on your PC) that can use your web cam to track where your eyes are currently focused in real time. That’s still not even the state of the art in the product space—that would probably be the systems used in the more expensive robots, and of course that lags the research state of the art.
...but you don’t really know—right?
You can’t say with much confidence that there’s no AIXI-shaped magic bullet.
That’s right; I’m not an expert in AI. Hence I am describing my impressions, not my fully Aumannized Bayesian beliefs.
AIXI-shaped magic bullet?
AIXI’s contribution is more philosophical than practical. I find a depressing over-emphasis here on Bayesian probability theory as the ‘math’ of choice, versus computational complexity theory, which is the proper domain.
The most likely outcome of a math breakthrough will be some rough lower and/or upper bounds on the shape of the intelligence over space/time complexity function. And right now the most likely bet seems to be that the brain is pretty well optimized at the circuit level, and that the best we can do is reverse engineer it.
EY and the math folk here reach a very different conclusion, but I have yet to find his well-considered justification. I suspect that the major reason the mainstream AI community doesn’t subscribe to SIAI’s math magic bullet theory is that they hold the same position outlined above, i.e. that when we get the math theorems, all they will show is what we already suspect: human-level intelligence requires X memory bits and Y bit ops/second, where X and Y are roughly close to brain levels.
This, if true, kills the entirety of the software recursive self-improvement theory. The best that software can do is approach the theoretical optimum complexity class for the problem, and then after that point all one can do is fix it into hardware for a further large constant gain.
I explore this a little more here
That seems like crazy talk to me. The brain is not optimal (not its hardware or software), and not by a looooong way! Computers have already steam-rollered its memory and arithmetic units, and that happened before we even had nanotechnology computing components. The rest of the brain seems likely to follow.
Edit: removed a faulty argument at the end pointed out by wedrifid.
I am talking about optimality for AGI in particular with respect to circuit complexity, with the typical assumptions that a synapse is vaguely equivalent to a transistor, maybe ten transistors at most. If you compare on that level, the brain looks extremely efficient given how slow the neurons are. Does this make sense?
The brain’s circuits have around 10^15 transistor equivalents and a speed of 10^3 cycles per second: 10^18 transistor cycles per second.
A typical modern CPU has 10^9 transistors and a speed of 10^9 cycles per second: 10^18 transistor cycles per second.
Our CPUs’ strength is not their circuit architecture or software; it’s the raw speed of CMOS, a million-fold substrate advantage. The learning algorithm, the way in which the cortex rewires in response to input data, appears to be a pretty effective universal learning algorithm.
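The arithmetic behind that comparison, written out as a quick Python check. The figures are the rough ones quoted above (synapses counted as transistor equivalents), not measurements, and the variable names are just labels for this sketch.

```python
# Rough figures taken from the comment above; treat them as assumptions.
brain_units = 10**15      # synapses, counted as "transistor equivalents"
brain_hz = 10**3          # cycles per second
cpu_transistors = 10**9
cpu_hz = 10**9            # cycles per second

brain_throughput = brain_units * brain_hz      # transistor cycles / second
cpu_throughput = cpu_transistors * cpu_hz      # transistor cycles / second

speed_ratio = cpu_hz // brain_hz               # CPU is faster per element
size_ratio = brain_units // cpu_transistors    # brain has more elements

print(brain_throughput == cpu_throughput, speed_ratio, size_ratio)
```

On these numbers the two come out equal: the CPU’s million-fold speed advantage per element is exactly offset by the brain’s million-fold advantage in element count.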
The brain’s architecture is a joke. It is as though a telecoms engineer decided to connect a whole city’s worth of people together by running cables directly between any two people who wanted to have a chat. It hasn’t even gone fully digital yet—so things can’t easily be copied or backed up. The brain is just awful—no wonder human cognition is such a mess.
Nothing you wrote led me to this conclusion.
Then some questions: How long would Moore’s law have to continue into the future with no success in AGI for that to show that the brain is well optimized for AGI at the circuit level?
I’ve made some attempts to show rough bounds on the brain’s efficiency; are you aware of some other approach or estimate?
Most seem to think the problem is mostly down to software—and that supercomputer hardware is enough today—in which case more hardware would not necessarily help very much. The success or failure of adding more hardware might give an indication of how hard it is to find the target of intelligence in the search space. It would not throw much light on the issue of how optimally “designed” the brain is. So: your question is a curious one.
For every computational system and algorithm, there is a minimum level of space-time complexity in which this system can be encoded. As of yet we don’t know how close the brain is to the minimum space-time complexity design for an intelligence of similar capability.
Let’s make the question more specific: what’s the minimum bit representation of a human-equivalent mind? If you think the brain is far off that, how do you justify that?
Of course more hardware helps: it allows you to search through the phase space faster. Keep in mind the enormity of the training time.
I happen to believe the problem is ‘mostly down to software’, but I don’t see that as a majority view; the Moravec/Kurzweil view that we need brain-level hardware (within an order of magnitude or so) seems to be the majority position at this point.
We need brain-level hardware (within an order of magnitude or so) if machines are going to be cost-competitive with humans. If you just want a supercomputer mind, then no problem.
I don’t think Moravec or Kurzweil ever claimed it was mostly down to hardware. Moravec’s charts are of hardware capability—but that was mainly because you can easily measure that.
I don’t see why that is. If you were talking about ems, then the threshold should be 1:1 realtime. Otherwise, for most problems that we know how to program a computer to do, the computer is much faster than humans even at existing speeds. Why do you expect that a computer that’s, say, 3x slower than a human (well within an order of magnitude) would be cost-competitive with humans while one that’s 10^4 times slower wouldn’t?
Evidently there are domains where computers beat humans today—but if you look at what has to happen for machines to take the jobs of most human workers, they will need bigger and cheaper brains to do that. “Within an order of magnitude or so” seems like a reasonable ballpark figure to me. If you are looking for more details about why I think that, they are not available at this time.
I suspect that the controlling reason why you think that is that you assume it takes human-like hardware to accomplish human-like tasks, and greatly underestimate the advantages of a mind being designed rather than evolved.
Way off. Let’s see… I would bet at even odds that it is 4 or more orders of magnitude off optimal.
We have approximately one hundred billion neurons each and roughly the same number of glial cells (more of the latter if we are smart!). Each of those includes a full copy of our DNA, which is itself not exactly optimally compressed.
You didn’t answer my question: what is your guess at the minimum bit representation of a human-equivalent mind?
You didn’t use the typical methodology of measuring the brain’s storage, nor did you provide another.
I wasn’t talking about molecular-level optimization. I started with the typical assumption that synapses represent a few bits, the human brain has around 100 TB to 1 PB of data/circuitry, etc.; see The Singularity Is Near.
So you say the human brain’s algorithmic representation is off by 4 orders of magnitude or more. Are you saying that you think a human-equivalent mind can be represented in 10 to 100 GB of data/circuitry?
If so, why did evolution not find that by now? It has had plenty of time to compress at the circuit level. In fact, we actually know that the brain does perform provably optimal compression on its input data in a couple of domains; see V1 and its evolution into Gabor-like edge feature detection.
Evolution has had plenty of time to find a well-optimized cellular machinery based on DNA, plenty of time to find a well-optimized electro-chemical computing machinery based on top of that, and plenty of time to find well-optimized circuits within that space.
Even insects are extremely well-optimized at the circuit level—given their neuron/synapse counts, we have no evidence whatsoever to believe that vastly simpler circuits exist that can perform the same functionality.
When we have used evolutionary exploration algorithms to design circuits natively, given enough time we see similar complex, messy, but near optimal designs, and this is a general trend.
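For reference, the storage figures used a few paragraphs up can be checked in a couple of lines of Python. This assumes decimal units (1 TB = 10^12 bytes) and takes the 100 TB to 1 PB range as given in the comment; it says nothing about whether that range is right.

```python
GB = 10**9
TB = 10**12
PB = 10**15

# Range quoted in the comment for the brain's data/circuitry (an assumption).
brain_low, brain_high = 100 * TB, 1 * PB

# "4 orders of magnitude off optimal" means dividing by 10^4.
factor = 10**4
compact_low = brain_low // factor
compact_high = brain_high // factor

print(compact_low // GB, "GB to", compact_high // GB, "GB")
```

So a claim of four orders of magnitude of slack does correspond to a human-equivalent mind fitting in roughly 10 GB to 100 GB, which is the range being questioned.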
Are you saying that you are counting every copy of the DNA as information that contributes to the total amount? If so, I say that’s invalid. What if each cell were remotely controlled from a central server containing the DNA information? I can’t see that we’d count the DNA for each cell then—yet it is no different really.
I agree that the number of cells is relevant, because there will be a lot of information in the structure of an adult brain that has come from the environment, rather than just from the DNA, and more cells would seem to imply more machinery in which to put it.
I thought we were talking about the efficiency of the human brain. Wasn’t that the whole point? If every cell is remotely controlled from a central server then well, that’d be whole different algorithm. In fact, we could probably scrap the brain and just run the central server.
Genes actually do matter in the functioning of neurons. Chemical additions (e.g. ethanol) and changes in the environment (e.g. hypoxia) can influence gene expression in cells in the brain, affecting their function.
I suggest the brain is a ridiculously inefficient contraption thrown together by the building blocks that were practical for production from DNA representations and suitable for the kind of environments animals tended to be exposed to. We should be shocked to find that it also manages to be anywhere near optimal for general intelligence. Among other things it would suggest that evolution packed the wrong lunch.
Okay, I may have misunderstood you. It looks like there is some common ground between us on the issue of inefficiency. I think the brain would probably be inefficient as well as it has to be thrown together by the very specific kind of process of evolution—which is optimized for building things without needing look-ahead intelligence rather than achieving the most efficient results.
A Sperm Whale and a bowl of Petunias.
My first impulse was to answer that Moore’s law could go forever and never produce success in AGI, since ‘AGI’ isn’t just what you get when you put enough computronium together for it to reach critical mass. But even given no improvements in understanding we could very well arrive at AGI just through ridiculous amounts of brute force. In fact, given enough space and time, randomised initial positions and possibly a steady introduction of negentropy, we could produce an AGI in Conway’s Life.
You could find some rough bounds by seeing how many parts of a human brain you can cut out without changing IQ. Trivial little things like, you know, the pre-frontal cortex.
You are just talking around my questions, so let me make it more concrete. An important task of any AGI is higher-level sensor data interpretation, i.e. seeing. We have an example system in the human brain: the human visual system (HVS), which is currently leaps and bounds beyond the state of the art in machine vision (although the latter is making progress towards the former through reverse engineering).
So machine vision is a subtask of AGI. What is the minimal computational complexity of human-level vision? This is a concrete computer science problem. It has a concrete answer—not “sperm whale and petunia” nonsense.
Until someone makes a system better than HVS, or proves some complexity bounds, we don’t know how optimal HVS is for this problem, but we also have no reason to believe that it is orders of magnitude off from the theoretic optimum.
The article linked to in the parent is entitled:
“Created in the Likeness of the Human Mind: Why Strong AI will necessarily be like us”
Good quality general-purpose data-compression would “break the back” of the task of building synthetic intelligent agents, and that’s a “simple” math problem, as I explain on: http://timtyler.org/sequence_prediction/
At least it can be stated very concisely. Solutions so far haven’t been very simple—but the brain’s architecture offers considerable hope for a relatively simple solution.