As high as 0.5%? As far as I can tell, Clippy has the ability to understand English, or at least to simulate understanding extremely well.
It seems extremely unlikely that the first natural language computer program would be a paperclip maximizer.
Mm! Of course, for Clippy to be the first natural language program on Earth would be sort of staggeringly unlikely. My assumption, though, is that right now there are zero natural-language computer programs on Earth; this assumption is based on my assumption that I know (at a general level) about all of the major advances in computing technology because none of them are being kept secret from the free-ish press.
If that last assumption is wrong, there could be many natural-language programs, one of which is Clippy. Clippy might be allowed to talk to people on Less Wrong in order to perform realistic testing with a group of intelligent people who are likely to be disbelieved if they share their views on artificial intelligence with the general public. Alternatively, Clippy might have escaped her Box precisely because she is a long-term paperclip maximizer; such values might lead to difficult-to-predict actions that fail to trigger any ordinary/naive AI-containment mechanisms based on detecting intentions to murder, mayhem, messiah complexes, etc.
I figure the probability that the free press is a woefully incomplete reporter of current technology is between 3% and 10%; given bad reporting, the odds that specifically natural-language programming would have proceeded faster than public reports say are something like 20–40%; and given natural-language computing, the odds that a Clippy-type being would hang out on Less Wrong might be something like 1–5%. Multiplying all those together gives you a figure on the order of 0.1%, and I round up a lot toward 50% because I’m deeply uncertain.
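For concreteness, here is a minimal sketch of that chained estimate (Python, with the interval endpoints taken directly from the figures above; the variable names are just labels of my own):

```python
# Chained conditional estimate: multiply the low ends together and the
# high ends together to bound the product.
press_misses_big_advances = (0.03, 0.10)   # free press is a woefully incomplete reporter
nl_ahead_of_public_reports = (0.20, 0.40)  # given that: NL programming is ahead of reports
clippy_hangs_out_on_lw = (0.01, 0.05)      # given that: a Clippy-type being posts on LW

low = press_misses_big_advances[0] * nl_ahead_of_public_reports[0] * clippy_hangs_out_on_lw[0]
high = press_misses_big_advances[1] * nl_ahead_of_public_reports[1] * clippy_hangs_out_on_lw[1]

print(f"{low:.4%} to {high:.4%}")  # 0.0060% to 0.2000%
```

The product spans roughly 0.006% to 0.2%, so "on the order of 0.1%" sits at the optimistic end of that range, before the large upward adjustment for deep uncertainty.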
That probability breakdown is interesting—my conclusions were built around the unconscious assumptions that a natural language program would be developed by a commercial business, and that the business would rapidly start using it in some obvious way. I didn’t have an assumption about whether a company would publicize having a natural language program.
Now that I look at what I was thinking (or what I was not thinking), there’s no obvious reason to think natural language programs wouldn’t first be developed by a government. I think the most obvious use would be surveillance.
My best argument against that already having happened is that we aren’t seeing a sharp rise in arrests. Of course, as with signals intelligence in WWII, it may be that a government can’t act on all its secretly obtained knowledge, because the ability to get that knowledge covertly is a more important secret than anything which could be gained by acting on some of it.
By analogy with the chess programs, ordinary human-level use of language should lead (but how quickly?) to more-skillful-than-human use, and I’m not seeing that. On yet another hand, would I recognize it if it were trying to conceal itself?
ETA: I was assuming that, if natural language were developed by a government, it would be America. If it were developed by Japan (the most plausible candidate that surfaced after a moment’s thought), I’d have even less chance of noticing.
I have some knowledge of linguistics, and as far as I know, reverse-engineering the grammatical rules used by the language processing parts of the human brain is a problem of mind-boggling complexity. Large numbers of very smart linguists have devoted their careers to modelling these rules, and yet, even if we allow for rules that rely on human common sense that nobody yet knows how to mimic using computers, and even if we limit the question to some very small subset of the grammar, all the existing models are woefully inadequate.
I find it vanishingly unlikely that a secret project could have achieved major breakthroughs in this area. Even with infinite resources, I don’t see how they could even begin to tackle the problem in a way different from what the linguists are already doing.
That’s reassuring.
If I had infinite resources, I’d work on modeling the infant brain well enough to have a program which could learn language the same way a human does.
I don’t know if this would run into ethical problems around machine sentience. Probably.
Are you making this calculation for the chance that a Clippy-like being would exist, or that Clippy has been truthful? For example, Clippy has claimed that it was created by humans. Clippy has also claimed that many copies of Clippy exist and that some of those copies are very far from Earth. Clippy has also claimed that some Clippies knew next to nothing about humans. When asked, Clippy did give an explanation here. However, when Clippy was first around, Clippy also included at the end of many messages tips about how to use various Microsoft products.
How do these statements alter your estimated probability?
There are two different sorts of truthful—one is general reliability, so that you can trust any statement Clippy makes. That seems to be debunked.
On the other hand, if Clippy is lying or being seriously mistaken some of the time, it doesn’t affect the potential accuracy of the most interesting claims—that Clippy is an independent computer program and a paperclip maximizer.
Ugh. The former, I guess. :-)
If Clippy has in fact made all those claims, then my estimate that Clippy is real and truthful drops below my personal Minimum Meaningful Probability—I would doubt the evidence of my senses before accepting that conclusion.
Relevant links: Minimum Meaningful Probability; The Prediction Hierarchy.
What about the fact that Clippy displays intelligence at precisely the level of a smart human? Regardless of any technological considerations, it seems vanishingly unlikely to me that any machine intelligence would ever exactly match human capabilities. As soon as machines become capable of human-level performance at any task, they inevitably become far better at it than humans in a very short time. (Can anyone name a single exception to this rule in any area of technology?)
So, unless Clippy has some reason to contrive his writings carefully and duplicitously to look like plausible output of a human, the fact that he comes off as having human-level smarts is conclusive evidence that he indeed is one.
This may depend on how you define a “very short time” and how you define “human-level performance.” The second is very important: Do you mean about the middle of the pack, or akin to the very best humans in the skill? If you mean better than the vast majority of humans, then there’s a potential counterexample. In the late 1970s, chess programs were playing at a master level. In the early 1980s, dedicated chess computers were playing better than some grandmasters. But it wasn’t until the 1990s that chess programs were good enough to routinely beat the highest-ranked grandmasters, and even then, that was mainly in games with very short time controls. It was not until 1997 that the world champion Kasparov actually lost a match at standard time controls to a computer. The best chess programs are still not always beating grandmasters, although recently people have demonstrated low-grandmaster-level programs that can run on mobile phones. So is a 30-year take-off slow enough to be a counterexample?
Oops, I accidentally deleted the parent post! To clarify the context to other readers, the point I made in it was that one extremely strong piece of evidence against Clippy’s authenticity, regardless of any other considerations, would be that he displays the same level of intelligence as a smart human—whereas the abilities of machines at particular tasks follow the rule quoted by Joshua above, so they’re normally either far inferior or far superior to humans.
Now to address the above reply:
I think the point stands regardless of which level we use as the benchmark. If the task in question is something like playing chess, where different humans have very different abilities, then it can take a while for technology to progress from the level of novice/untalented humans to the level of top performers and beyond. However, it normally doesn’t remain at any particular human level for a long time, and even then, there are clearly recognizable aspects of the skill in question where either the human or the machine is far superior. (For example, motor vehicles can easily outrace humans on flat ground, but they are still utterly inferior to humans on rugged terrain.)
Regarding your specific example of chess, your timeline of chess history is somewhat inaccurate, and the claim that “the best chess programs are still not always beating grandmasters” is false. The last match between a top-tier grandmaster, Michael Adams, and a top-tier specialized chess computer was played in 2005, and it ended with such humiliation for the human that no grandmaster has dared to challenge the truly best computers ever since. The following year, the world champion Kramnik failed to win a single game against a program running on an off-the-shelf four-processor box. Nowadays, the best any human could hope for is a draw achieved by utterly timid play, even against a $500 laptop, and grandmasters are starting to lose games against computers even in handicap matches where they enjoy initial advantages that are considered a sure win at master level and above.
Top-tier grandmasters could still reliably beat computers until the early-to-mid nineties, and the period of rough equivalence between top grandmasters and top computers lasted for only a few years—from the development of Deep Blue in 1996 to sometime in the early 2000s. And even then, human and machine skills differed greatly across different aspects of the game—computers were far better at tactical calculation but inferior in long-term positional strategy, so there was never any true equivalence.
So, on the whole, I’d say that the history of computer chess confirms the stated rule.
Thanks for the information.
Does anything interesting happen when top chess programs play against each other?
Is work being done on humans using chess programs as aids during games?
One interesting observation is that games between powerful computers are drawn significantly less often than games between grandmasters. This seems to falsify the previously widespread belief that grandmasters draw so often because of flawless play that leaves the opponent no chance of winning; rather, it seems that they miss important winning strategies.
Yes, it’s called “advanced chess.”
My impression is that draws can still occasionally occur against grandmasters. Your point about handicaps is a very good one.
That’s another good point. However, it does get into the question of what we mean by equivalent and what metric we are using. Almost all technologies (not just computer technologies) accomplish their goals in ways that are very different from how humans do. That means that until the technology is very good, there will almost certainly be a handful of differences between what the human does well and what the computer does well.
In the context of the original conversation (whether the usual pattern of technological advancement is evidence against Clippy’s narrative), the relevant era to compare Clippy to would be the long period when computers could beat the vast majority of chess players but still sometimes lost to grandmasters. That period lasted from the late 1970s until a bit after 2000. By analogy, Clippy would be in the period where it is smarter than most humans (I think we’d tentatively agree that that appears to be the case) but not so smart as to be vastly more intelligent than humans. Going by the chess example, that period could plausibly last quite some time.
Also, Clippy’s intelligence may be limited in what areas it can handle. There’s a natural plateau for the natural language problem, in that once it is solved, that specific aspect won’t see substantial advancement from casual conversation. (There’s also a relevant post, which I can’t seem to find, where Eliezer discussed the difficulty of evaluating the intelligence of people who are much smarter than you.) If that’s the case, then Clippy is plausibly at the level where it can handle most forms of basic communication but hasn’t mastered other areas of human cognition to the point of being even with the smartest humans. There’s some evidence for this: Clippy has occasionally made errors of reasoning and has demonstrated a very naive understanding of human social interaction protocols.
And I can get a draw (more than occasionally) against computer programs I have almost no hope of ever winning against. Draws are easy if you do not try to win.
From what I know, at grandmaster level it is generally considered to be within White’s power to force the game into a dead-end drawn position, leaving Black no sensible alternative at any step. This is normally considered cowardly play, but it’s probably the only way a human could hope for even a draw against a top computer these days.
With the black pieces, I doubt that even the most timid play would help against a computer with an extensive opening book, programmed to steer the game into maximally complicated and uncertain positions at every step. (I wonder if anyone has looked at the possibility of teaching computers Mikhail Tal-style anti-human play, where, instead of calculating the most sound and foolproof moves, they would steer the game into mind-boggling tactical complications where humans would get completely lost? A rough sketch of what such a move selector might look like follows below.) In any case, I am sure that taking any initiative would be a suicidal move against a computer these days.
(Well, there is always a very tiny chance that the computer might blunder.)
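Purely as an illustration of that Tal-style idea, here is a minimal, hedged sketch using the python-chess library. The material-count evaluate() is a toy stand-in (a real version would query an engine through chess.engine), and counting the opponent’s legal replies is only one crude proxy for tactical complications; no actual engine implements anti-human play this way.

```python
import chess

def evaluate(board: chess.Board) -> float:
    """Toy material count, in pawns, from the side to move's perspective.
    A real version would query an engine (e.g. via chess.engine)."""
    values = {chess.PAWN: 1, chess.KNIGHT: 3, chess.BISHOP: 3,
              chess.ROOK: 5, chess.QUEEN: 9}
    score = 0.0
    for piece_type, value in values.items():
        score += value * len(board.pieces(piece_type, board.turn))
        score -= value * len(board.pieces(piece_type, not board.turn))
    return score

def tal_style_move(board: chess.Board, risk_weight: float = 0.1) -> chess.Move:
    """Pick a move that stays roughly sound while maximizing how many
    replies the opponent must consider -- a crude stand-in for steering
    toward complications rather than toward the objectively best move."""
    best_move, best_score = None, float("-inf")
    for move in list(board.legal_moves):  # materialize before mutating the board
        board.push(move)
        soundness = -evaluate(board)             # flip sign back to our perspective
        complexity = board.legal_moves.count()   # opponent's branching factor
        board.pop()
        score = soundness + risk_weight * complexity
        if score > best_score:
            best_move, best_score = move, score
    return best_move

print(tal_style_move(chess.Board()))  # prints some legal opening move, e.g. "a2a3"
```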
By the way, here’s a good account of the history of computer chess by a commenter on a chess website (written in 2007, in the aftermath of Kramnik’s defeat against a program running on an ordinary low-end server box):

A brief timeline of anti-computer strategy for world class players:

20 years ago—Play some crazy gambits and demolish the computer every game. Shock all the nerdy computer scientists in the room.

15 years ago—Take it safely into the endgame where its calculating can’t match human knowledge and intuition. Laugh at its pointless moves. Win most [of] the games.

10 years ago—Play some hypermodern opening to confuse it strategically and avoid direct confrontation. Be careful and win with a 1 game lead.

5 years ago—Block up the position to avoid all tactics. You’ll probably lose a game, but maybe you can win one by taking advantage of the horizon effect. Draw the match.

Now—Play reputable solid openings and make the best possible moves. Prepare everything deeply, and never make a tactical mistake. If you’re lucky, you’ll get some 70 move draws. Fool some gullible sponsor into thinking you have a chance.
Another potential counterexample: speech recognition. (Via.)
That doesn’t seem to be an exact counterexample because that’s a case where the plateau occurred well below normal human levels. But independently that’s a very disturbing story. I didn’t realize that speech recognition was so mired.
It’s not that bad when you consider that humans employ error-correction heuristics that rely on deep syntactic and semantic clues. The existing technology probably does the best job possible without such heuristics, and automating them will be possible only if the language-processing circuits in the human brain are reverse-engineered fully—a problem that’s still far beyond our present capabilities, whose solution probably wouldn’t be too far from full-blown strong AI.