Humans don’t get 15TB of RAM, though. It’s unknown how much of the 3-6TB we do get operates essentially like RAM and how much like disk. And comparing flops between humans and a multiprocessor machine isn’t very useful. Human flops don’t correlate to state changes in a program the same way they do in a multiprocessor. The flops comparison is really apples to oranges. Where flops matter for machines, in my opinion, is in simulating a brain. We know from universality that any Turing machine can simulate any other. But to simulate a brain on a multiprocessor, we’d have to deal with the realities of all the computational overhead involved in simulating one type of computation with another. The Parberry paper I linked to in the OP treats this as an argument against strong A.I., arguing that it won’t be feasible, as a matter of engineering, to actually build a machine fast enough to overcome the overhead of simulating a petaflops brain, even if the “petaflops” the brain is performing are far more hollow than the flops on the multiprocessor machine.
Yeah, it’s not that useful, but power consumption is much less useful.

What I mean to say is that gigaflops are plenty: with a smart software design, you can achieve software efficiency close to that of a petaflops (but not digital) brain. I think there is a lot of evidence for this. But it’s not at all the same with power and RAM. If Watson had the same flops as a human mind, but still had the same power and memory listed above, I still wouldn’t feel it was a “success” even if Watson could do natural language tasks well enough to statistically fool humans (i.e. pass some kind of Turing test).
When you’re talking about passing a Turing test, power is absolutely key. If Watson were in the room next to me supplying replies to verbal queries, I would start to get hot from all of Watson’s waste heat. It’s great that it can answer natural language questions, but I’m saying a better goal, for many reasons, is to answer natural language questions in a manner that doesn’t require massive heat removal. Invest time into figuring out how to answer natural language questions on a desktop PC with 12GB RAM. If scaling the flops down to that machine causes performance limitations, then work around them. But if the answer is: “well, having so few flops causes performance limitations, so let’s just make the whole thing bigger,” then it’s inherently uninteresting. Evolution could only pull that lever a certain amount, which is why brain software is so impressive.
Invest time into figuring out how to answer natural language questions on a desktop PC with 12GB RAM.
“Solve the hard problem before the easy precursor problem.”
If that wasn’t what you meant, I may have misunderstood you.
Evolution could only pull that lever a certain amount, which is why brain software is so impressive.
But we aren’t even up to using the kind of processing power that evolution used. Human-level reasoning in a machine will be impressive without regard to the physical characteristics of the machine it runs on. Once the problem is well-understood, we’ll get smaller and cheaper versions.
There’s a categorical difference between “try to find a reasonable solution” and “throw money at this until it’s no longer a problem” and you’re acting like there isn’t. I already made exactly the same comments you have in the OP, where I said:
I don’t mean to criticize Watson unduly; it certainly is an impressive engineering achievement and has generated a lot of good publicity and public interest in computing. The engineering feat is impressive if for no other reason than that it is the first accomplishment of this scale, and pioneering is always hard… future Watsons will be cheaper, faster, and more effective because of IBM’s great work on this.
But there’s a categorical difference in the two approaches. In my own field of computer vision, it’s like this: if you want to understand how face recognition works, you study the neuroscience of primate brains and come up with compact and efficient representations of the problem that can run in a manner similar to the way primates do it. If you just want to recognize faces right now, you concatenate every feature vector imaginable at every scale level that could conceivably be relevant, train 10,000 SVMs over a month, and then use cross-validation and mutual information to reduce that down to a “lean” set of 2,000 SVMs. And there you go: you’ve overfitted a solution that still leaves face recognition as a total black box, and you’ve used orders of magnitude more resources and time to get it.
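For concreteness, here is a minimal sketch of that “concatenate everything, train a pile of SVMs, prune by cross-validation and mutual information” pipeline, in Python with scikit-learn. Everything in it (the feature blocks, their sizes, the synthetic data, the pruning threshold) is a made-up stand-in rather than any real face-recognition system; it only shows the shape of the brute-force approach.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score
from sklearn.feature_selection import mutual_info_classif

# Hypothetical stand-in data: random "images" with random identity labels.
rng = np.random.default_rng(0)
n_images, n_classes = 200, 10
y = rng.integers(0, n_classes, size=n_images)

# "Concatenate every feature vector imaginable at every scale level":
# here, a few random blocks standing in for HOG/SIFT/etc. at several scales.
feature_blocks = [rng.normal(size=(n_images, d)) for d in (128, 256, 512)]
X = np.hstack(feature_blocks)

# Train one SVM per feature block; the caricature in the text would be
# thousands of models over weeks rather than three over seconds.
candidate_models = []
for i, block in enumerate(feature_blocks):
    score = cross_val_score(SVC(kernel="rbf", gamma="scale"), block, y, cv=5).mean()
    candidate_models.append((i, score))

# Prune down to a "lean" subset using cross-validated accuracy and the
# mutual information between individual features and the labels.
mi = mutual_info_classif(X, y)
keep = [i for i, score in candidate_models if score > 1.0 / n_classes]

print("cross-validated scores per block:", candidate_models)
print("mean per-feature mutual information:", mi.mean())
print("blocks kept after pruning:", keep)
```

None of this says anything about how faces are actually recognized; it just throws capacity at the benchmark, which is the contrast being drawn.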
It’s interesting that researchers who spent years working on the primate-brain / Barlow infomax principle idea, and who studied monkey face recognition at Caltech without being able to do good face recognition for years, are now blowing face.com and other proprietary face recognition software out of the water.
There’s a categorical difference between trying to solve the hard problem (and resorting to more resources only when you have to) and just overblowing the whole thing without even making an attempt at the hard problem. From what I know about natural language processing, machine learning, and Watson, Watson is the latter approach, and its power and memory consumption reveal it to be quite unimpressive… though hopefully trying to miniaturize it will spawn interesting engineering research.

Yeah, I read them at different times, and missed that.
When you’re talking about passing a Turing test, power is absolutely key.
So, if we made a program that beat the Turing test, but the hardware consumed a lot of power, it would be a failure, but if we ran the program on different hardware with the exact same specs, except it was more energy efficient, it would be a success?
You’re ignoring fundamental limits of computing efficiency here. You can’t have the same specs with many orders of magnitude more energy efficiency; something’s got to give. At the transistor level you can’t preserve the same amount of computation for vastly less power. This is why a petaflops human brain is not the same as a petaflops supercomputer. Computation is represented differently because the power constraints of a human brain force it to be. You cannot do the same amount of processing with a brain that you can do with a modern petaflops cluster. It’s the software that matters, because the power constraint forces a better software design.
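As an aside on what a genuinely fundamental limit looks like here (my own back-of-the-envelope illustration with assumed round numbers, not a figure from the OP or the linked papers): Landauer’s principle puts a floor of kT·ln 2 on the energy cost of each irreversible bit operation, which in turn caps how many such operations a ~20 W device can perform per second.

```python
import math

# Back-of-the-envelope Landauer bound; all figures are illustrative assumptions.
k_B = 1.380649e-23                       # Boltzmann constant, J/K
T = 300.0                                # roughly room temperature, K (assumed)
energy_per_bit = k_B * T * math.log(2)   # ~2.9e-21 J per irreversible bit operation

brain_watts = 20.0                       # commonly cited order of magnitude for a brain
bit_ops_ceiling = brain_watts / energy_per_bit

print(f"Landauer cost per bit: {energy_per_bit:.2e} J")
print(f"Ceiling on irreversible bit ops/s at 20 W: {bit_ops_ceiling:.2e}")
# Roughly 7e21 bit operations per second.  Current transistors sit many orders
# of magnitude above the Landauer cost per operation, which is the question the
# reply below raises: how close to the fundamental limits are we really?
```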
You could make the same arguments I have made with physical space. Watson would be more impressive if it fit inside the volume of an average human head. It’s just simple physics: the hardware and the volume it occupies are inherent to the computations it’s doing. If no attention is paid to these constraints at all, then it’s not surprising or impressive that a solution can be brute forced with stupendous resources. We’re not talking about the difference between a clean diesel sedan and a Prius. We’re talking about repurposing a military Humvee and being impressed you can use it to take the kids to soccer practice.
You’re ignoring fundamental limits of computing efficiency here.
You’re assuming we’re at the fundamental limits of computing efficiency. If we are, power is essentially instructions per second. If not, it’s less useful than instructions per second. You might as well just say instructions per second and ignore power.
For what it’s worth, I’m not unimpressed because of the computing power used. I’m unimpressed because of the inflexibility of the program. If we built a true AI using the combined might of every computer on the planet, that would be impressive. Limiting computing power makes creating intelligence harder, but no matter how much you have, it’s far from easy. Chess can be brute forced, but intelligence can’t.
(I know Watson isn’t a chess program, and is far more impressive than one, but it’s still nothing compared to true AI.)
I disagree that intelligence can’t be brute forced, at least if you don’t care about computational resources. Presumably, what we mean by ‘intelligence’ is the passing of some Turing test (otherwise, if you just define ‘intelligence’ to be passing a Turing test with some kind of “elegance” in the design of the program, then your claim is true but only because you defined it to be that way).
If computational resources truly weren’t bounded, then we could build a massively inefficient lookup table whose search grows exponentially (or worse) in the length of the input. See this paper and this one for arguments about how to bound the complexity of such a lookup table argument. This paper is also very useful and the writing style is great.
What you cannot do, however, is claim that intelligence cannot be brute forced (again, under the assumption we ignore resources), without some appeal to computational complexity theory.
In particular, the Aaronson paper points out that, with respect to Searle’s (flawed) Chinese room argument and Ned Block’s criticisms of the Turing test, complexity theory puts us in the situation where it is exactly the efficiency of an algorithm that gives it the property we ascribe to intelligence. We only know something’s intelligent because any reasonable Turing test that “respects” human intelligence will also function like a zero-knowledge proof that any agent who can pass the test is not running an algorithm that’s exponential in the size of the input.
Watson achieves the necessary speed (for the greatly restricted test of playing Jeopardy), but as you mentioned, Watson is easy to unmask simply by asking for the second or third best answers. In terms of complexity theory, though, Watson’s program fails the Turing test badly: its resource efficiency is dismal compared to a human’s. It’s doing something ‘stupid’, like a slow lookup with some correlations and statistical search. With resources similar to a human’s, this approach would be doomed to failure, so IBM just scaled up the resources until this bad approach no longer failed on the desired test cases.
Thus, if you would want to use ‘intelligence’ to label a planet-sized computer which solves human Turing tests by brute forcing them with tremendously outrageous resource consumption, this would be a fundamental departure from what the literature I linked above considers ‘intelligence.’ If the planet-sized computer ran fancy, efficient algorithms, then its massive resources would imply it can blow away human performance. The Turing test should be testing for the “general capacity” to do something, whether by lookup table or some equally stupid way, or by efficient intelligence.
Complexity theory really plays a large role in all this. I would also add that I see no reason not to call a massive look-up table intelligent… assuming it is implemented in some kind of hardware and architecture that’s much better than anything humans know about. If it turned out that human minds, for example, were some kind of quantum gravity look-up table (I absolutely do not believe this at all, but just for the sake of argument), I would not instantly believe that humans are not intelligent.
Thus, if you would want to use ‘intelligence’ to label a planet sized computer which solves human Turing tests by brute forcing them with tremendously outrageous resource consumption, this would be a fundamental departure from what the literature I linked above considers ‘intelligence.’
A planet-sized computer isn’t big enough to brute-force a Turing test. At least, it isn’t big enough to build a look-up table. Actually brute forcing a Turing test would require figuring out how the human would react to each possible output, in which case you’ve already solved AI.
If you had nigh infinite computing power you could create AIXI trivially. If you had quite a bit less, but still nigh infinite, computing power and programming time you could create a lookup table. If you had a planet-sized computer, you could probably create a virtual world in which intelligence evolves, though it would be far from trivial. Anything less than that, and it’s a nigh insurmountable task.
Increasing the computing power would make it easier, in that it wouldn’t make it harder, but within any reasonable bounds it’s not going to help much.
I would also add that I see no reason not to call a massive look-up table intelligent… assuming it is implemented in some kind of hardware and architecture that’s much better than anything humans know about.

Why would the hardware matter?
Because regardless of software inefficiency, unless the hardware is able to produce solutions in real time, it can’t pass a Turing test. A massive look-up table would be just fine as an intelligence if it had the hardware throughput to do its exponential searches fast enough to answer in real time, the way a human does.
If you had a planet-sized computer, you could probably create a virtual world in which intelligence evolves, though it would be far from trivial.
No, not even close. A planet-sized computer is not even close to being up to that task; both the Parberry paper and Nick Bostrom’s simulation argument papers refute that quantitatively.
A planet-sized computer isn’t big enough to brute-force a Turing test.
It depends on the Turing test. The Shieber paper shows that you are correct if the Turing test is “carry out a 5 minute long English conversation on unrestricted topics” but Watson says you’re wrong if the Turing test is “win at Jeopardy.” Both the test and the hardware matter.
Actually brute forcing a Turing test would require figuring out how the human would react to each possible output, in which case you’ve already solved AI.
Not true. It just requires coming up with an algorithmic shortcut that mimics a plausible human output. Just think of the Eliza chatbot that fooled people into believing it was a psychologist just by parroting their statements back as questions. Even after being told about it, many people refused to believe they had not just been talking to a psychologist… and yet, almost no one today would ascribe intelligence to Eliza.
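For readers who haven’t seen how little machinery this takes, here is a minimal Eliza-style sketch in Python (my own toy rules, not Weizenbaum’s actual script) that parrots statements back as questions:

```python
import re

# Minimal Eliza-style "parrot it back as a question" sketch.
# The pronoun swaps and the single template rule are hypothetical stand-ins
# for Weizenbaum's much larger script; the trick itself is the same.
SWAPS = {"i": "you", "me": "you", "my": "your", "am": "are",
         "you": "I", "your": "my"}

def reflect(text: str) -> str:
    """Swap first- and second-person words so the statement points back at the speaker."""
    words = re.findall(r"[a-zA-Z']+", text.lower())
    return " ".join(SWAPS.get(w, w) for w in words)

def respond(statement: str) -> str:
    m = re.match(r"\s*i feel (.*)", statement, re.IGNORECASE)
    if m:
        return f"Why do you feel {reflect(m.group(1))}?"
    return f"Why do you say that {reflect(statement)}?"

print(respond("I feel nobody listens to me"))  # Why do you feel nobody listens to you?
print(respond("My job is stressful"))          # Why do you say that your job is stressful?
```

Scale the same idea up with an enormous store of patterns and canned replies and you get the planet-sized customer-service chatbot described next.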
A planet-sized computer doing dumb search in a lookup table built around some cute algorithmic tricks could mimic human output extremely well, especially in restricted domains. How would we unmask it as unintelligent? Suppose a company used it as a customer service chatbot for conversations never lasting more than 1 minute, and that its heuristics and massive search were adequate for coming up with appropriate replies to 99.999% of the conversations it would face of length up to 1 minute. As soon as you know the way its software works and that its ability is purely based on scaling up dumb software with lots of hardware, you’d declare it unintelligent. Prior to that, though, it would be indistinguishable from a human in 1 minute or less conversations.
A massive look-up table would be just fine as an intelligence if it had the hardware throughput to do its exponential searches fast enough to answer in real time, the way a human does.
Isn’t that like saying that air would be just fine as an intelligence, if you told a person the questions you were going to use during the Turing test and how much time you would take asking each one (and your hypothetical responses to their hypothetical responses, etc.) if only sound waves could be recorded and replayed at precisely the right time? Which they can be, though that is beside the point.
A look up table is functionally absolute evidence of intelligence without being at all intelligent, just as air is for a Turing test between two humans.

I disagree. I think the section called “computation and waterfalls” in this paper makes a good case against this analogy.
I think my point was not clearly communicated, because that section is not relevant to this.
That section is about how just about any instance of a medium could be interpreted as just about any message by some possible minds. It picks an instance of a medium and fits a message and listener to the instance of a medium.
I am suggesting something much more normal: first you pick a particular listener (the normal practice in communication) and particular message, and I can manipulate the listener’s customary medium to give the particular listener the particular message. In this case, many different messages would be acceptable, so much the easier.
When administering a Turing test, why do you say it is the human on the other terminal that is intelligent, rather than their keyboard, or the administrator’s eyeballs? For the same reasons, a look up table is not intelligent.
When administering a Turing test, why do you say it is the human on the other terminal that is intelligent, rather than their keyboard, or the administrator’s eyeballs? For the same reasons, a look up table is not intelligent.
First of all, the “systems reply” to Searle’s Chinese room argument is exactly the argument that the whole room, Searle plus the book plus the room itself, is intelligent and does understand Chinese, regardless of whether or not Searle does. Since such a situation has never occurred with human-to-human interaction, it’s never been relevant for us to reconsider whether what we think is a human interacting with us really is intelligent. It’s easy to envision a future like Blade Runner where bots are successful enough that more sophisticated tests are needed to determine if something is intelligent. And this absolutely involves speed of the hardware.
Also, how do you know that a person isn’t a lookup table?
Would you say that neurosurgery is “teaching”, if one manipulates the brain’s bits such that the patient knows a new fact?
Also, how do you know that a person isn’t a lookup table?
The probability is low that a person is a lookup table, based on the rules of physics, for which the probability is high. If someone is a lookup table controlled remotely from another universe with incomprehensibly more matter than ours, or similar… so what? That just means an intelligence arranged the lookup table and that it did not arise by random, high-entropy coincidence; one can say this with probability as close to absolute as it gets. Whatever arranged the lookup table may have arisen by a random, high-entropy process, like evolution, but so what?
And this absolutely involves speed of the hardware.
Something arbitrarily slow may still be intelligent, by any normal meaning. More things are intelligent than pass the Turing test (unless it is merely offered as a definition) just as more things fly than are birds.
It’s easy to envision a future like Blade Runner where bots are successful enough that more sophisticated tests are needed to determine if something is intelligent.
If the laws of physics are very different than I think they are, one could fit a lookup table inside a human-sized body. That would not make it intelligent any more than expanding the size of a human brain would make it cease to be intelligent. That wouldn’t prevent a robot from operating analogously to a human other than being on a different substrate, either.
What do you mean when you say “intelligence”? If you mean something performing the same functions as what we agree is intelligence given a contrived enough situation, I agree a lookup table could perform that function.
The problem with what I think is your definition isn’t the physical impossibility of creating the lookup table, but that once the informational output for an input is as complex as it will ever be, any transformation happening afterwards isn’t reasonably intelligence. The whole system of the lookup table’s creator plus the lookup table may perhaps be described as an intelligent system, but not the fingers of the creator and the lookup table alone.
I’d hate to argue over definitions, but I’m interested in “Intelligence can be brute forced” and I wonder how common you think your usage is?
More things are intelligent than pass the Turing test (unless it is merely offered as a definition)
Yes, I am only considering the Turing test as a potential definition for intelligence, and I think this is obvious from the OP and all of my comments. See Chapter 7 of David Deutsch’s new book, The Beginning of Infinity. Something arbitrarily slow can’t pass a Turing test that depends on real-time interaction, so complexity theory allows us to treat a Turing test as a zero-knowledge proof that the agent who passes it possesses something computationally more tractable than a lookup table. I also dismiss the lookup tables, but the reason why is that iterating the conversation in a Turing test is Bayesian evidence that the agent interacting with me can’t be using an exponentially slow lookup table.
I agree with you that a major component of intelligence is how the knowledge is embedded in the program. If the knowledge is embedded solely by some external creator, then we don’t want to label that as intelligent. But how do we detect whether creator-embedded knowledge is a likely explanation? That has to do with the hardware it is implemented on. Since Watson is implemented on such massive resources, the explanation that it produces answers by searching a store of data is more likely; it is more plausible because of Watson’s hardware. If Watson achieved the same results with much less capable hardware, it would make the hypothesis that Watson’s responses are “merely pre-sorted embedded knowledge” less likely (assuming I knew no details of the software Watson used, which is one of the conditions of a Turing test).
If you tell me something can converse with me, but that it takes 340 years to formulate a response to any sentence I utter, then I strongly suspect the implementation is arranged such that it is not intelligent. Similarly, if you tell me something can converse with me, and it only takes 1 second to respond reasonably, but it requires the resources of 10,000 humans and can’t produce responses of any demonstrably better quality than humans, then I also suspect it is just a souped-up version of a stupid algorithm, and thus not intelligent.
The behavior alone is not enough. I need details of how the behavior happens, and if I’m lacking detailed explanations of the software program, then details about the hardware resources it requires also tell me something.
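As a toy version of this kind of update (all numbers below are my own illustrative assumptions, not measurements of anything): observing a fast reply on roughly human-scale hardware shifts probability away from the “exponentially slow lookup table” hypothesis, while the same observation on warehouse-scale hardware barely moves it.

```python
# Toy Bayesian update: is the agent an efficient reasoner or a brute-force
# lookup table?  All priors and likelihoods below are illustrative assumptions.

def posterior(prior, likelihood):
    evidence = sum(prior[h] * likelihood[h] for h in prior)
    return {h: prior[h] * likelihood[h] / evidence for h in prior}

prior = {"efficient_algorithm": 0.5, "giant_lookup_table": 0.5}

# Observation 1: replies within ~1 second on roughly human-scale hardware.
# An exponential-search table almost never manages this (assumed 1e-6);
# an efficient algorithm usually does (assumed 0.9).
fast_on_modest_hardware = {"efficient_algorithm": 0.9, "giant_lookup_table": 1e-6}

# Observation 2: replies within ~1 second, but on a warehouse-sized machine,
# where brute search is far more plausible (assumed 0.5).
fast_on_massive_hardware = {"efficient_algorithm": 0.9, "giant_lookup_table": 0.5}

print(posterior(prior, fast_on_modest_hardware))
print(posterior(prior, fast_on_massive_hardware))
# The first observation is strong evidence against the lookup-table hypothesis;
# the second is only weak evidence -- which is the point about Watson's hardware.
```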
If the laws of physics are very different than I think they are, one could fit a lookup table inside a human-sized body. That would not make it intelligent any more than expanding the size of a human brain would make it cease to be intelligent.
But it would mean that having a conversation with a person was not conclusive evidence that he or she wasn’t a lookup table implemented in a human substrate.
Would you say that neurosurgery is “teaching”, if one manipulates the brain’s bits such that the patient knows a new fact?
Yes, absolutely. “Regular” teaching is just exactly that, but achieved more slowly by communication over a noisy channel.
To strengthen DanielLC’s point, say we have a software program capable of beating the Turing test. In room A, it runs on a standard home desktop, and it is pitted against a whole-brain-emulation using several supercomputer clusters consuming on the order of ten megawatts.
In room B, the software program is run on a custom-built piece of hardware—a standard home desktop’s components interlinked with a large collection of heating elements, heatsinks attached to these heating elements, and cooling solutions for the enormous amount of heat generated, consuming on the order of ten megawatts. It is pitted against the person whom the whole-brain-emulation copy was made from—a whole-brain-emulation running on one whole brain.
It makes no sense that in room A the software wins and in room B the brain wins.
I agree that if it’s just hooked up to heat producing elements that play absolutely no role in the computation, then that waste heat or extra power consumption is irrelevant. But that’s not the case with any computer I’ve ever heard of and certainly not with Watson. The waste heat is directly related to its compute capability.
See the papers linked in my other comment for much more rigorous dismantling of the idea that resource efficiency doesn’t matter for a Turing test.
Also, the big fundamental flaw here is when you say:
say we have a software program capable of beating the Turing test.
You’re acting like this is a function of the software with no concern for the hardware. A massive, inefficient, exponentially slow (software-wise) look-up table can beat the Turing test if you either (a) give it magically fast hardware or (b) give it extra time to finish. But this clearly doesn’t capture the “spirit” of what you want. You want software that is somehow innately efficient in the manner it solves the problem. This is why most people would say a brute force look-up table is not intelligent, but a human is. Presumably the brain does something more resource efficient to generate responses the way that it does. But all 5-minute conversations, for example, can be bounded in terms of the total number of bits transmitted, so you could make a giant, unwieldy (but still finite) look-up table covering every possible 5-minute conversation that could ever happen, and have something win the “have a 5 minute conversation” Turing test by doing a horrible search in that table.
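To make “finite but giant” concrete, here is the usual back-of-the-envelope count, with my own round-number assumptions for typing speed and alphabet size (the linked papers work through the same style of bound):

```python
import math

# How many distinct 5-minute typed conversations are there?
# All parameters are assumed round numbers for illustration.
chars_per_minute = 300        # brisk typing, both parties combined (assumed)
minutes = 5
alphabet = 64                 # letters, digits, punctuation, space (assumed)

total_chars = chars_per_minute * minutes               # 1,500 characters
bits = total_chars * int(math.log2(alphabet))          # 9,000 bits per transcript

print(f"distinct possible transcripts ~ 2^{bits}")
# ~2^9000 table entries.  For comparison, the observable universe contains
# roughly 2^270 atoms, so a table with one entry per possible conversation
# cannot be built out of matter.  "Brute force" only works here if you
# ignore physical resources entirely -- the complexity-theoretic point above.
```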
When you say “make some software that beats a Turing test” this is == “make some software that does a task in a resource efficient manner”, as the Shieber and Aaronson papers point out. This is why Searle’s Chinese room argument utterly falls apart: computational complexity says you could never have large enough resources to actually build Searle’s Chinese room, nor the book for looking up Chinese characters. It’s the same with the “software” you mention. You might as well call it “magic software.”