I think this entire thread is caused by, and demonstrates, the fact that we increasingly have no idea what the heck we’re even trying to measure or detect with the Turing test (is it consciousness? human-level intelligence? general intelligence? what?) …
… which is entirely unsurprising, since as I say in another comment on this post, the Turing test isn’t meant to measure or detect anything.
To use it as a measure of something or a detector of something is to miss the point. This thread, where we go back and forth arguing about criteria, pretty much demonstrates said fact.
I think the Turing Test clearly does measure something: it measures how closely an agent’s behavior resembles that of a human. The real argument is not, “what does the test measure?”, but “is measuring behavioral similarity enough for all intents and purposes, or do we need more?”
If we prefer to be pedantic, we must go further than that: the test measures whether an agent can fool some particular interrogator, leaving that interrogator with a no-better-than-chance probability of correctly discerning whether said agent is a human (in the case where the agent in question is not, in fact, a human).
How well that particular factor correlates with actual behavioral similarity to a human (and how would we define and measure such similarity? Along what dimensions? Operationalized how?) is an open question. It might, it might not. It might take advantage of some particular biases of the interrogator (e.g. pareidolia, the tendency to anthropomorphize aspects of the inanimate world, etc.) to make him/her see behavioral similarity where little exists (cf. Eliza and other chatbots).
(Remember, also, that Turing thought that a meaningful milestone would be for a computer to “play the imitation game so well that an average interrogator will not have more than 70 percent chance of making the right identification after five minutes of questioning.” ! [Emphasis mine.])
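To make that milestone concrete, here is a minimal sketch of the pass criterion as a bare decision rule. This is not Turing’s protocol verbatim; the function name, the simulated interrogator, and all of the numbers below are made-up illustrations.

```python
import random

# Purely illustrative: treat each five-minute conversation as one trial, record
# whether the interrogator correctly identified the machine, and compare the
# observed hit rate against chance (50%) and against Turing's 70% milestone.

def passes_milestone(correct, trials, threshold=0.70):
    """True if the interrogator's correct-identification rate stays at or below the threshold."""
    return correct / trials <= threshold

trials = 100
# Simulate an interrogator who is right about 62% of the time (a made-up figure):
correct = sum(random.random() < 0.62 for _ in range(trials))

print(f"hit rate: {correct / trials:.2f}")
print(f"better than chance: {correct / trials > 0.5}")
print(f"passes the 70% milestone: {passes_milestone(correct, trials)}")
```

The point of the sketch is only that the milestone is a statement about the interrogator’s accuracy over many short sessions, not a statement about anything inside the machine.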
I do partly agree with this:
The real argument is not, “what does the test measure?”, but “is measuring behavioral similarity enough for all intents and purposes, or do we need more?”
And of course the question then becomes: just what are our intents and/or purposes here?
“play the imitation game so well that an average interrogator will not have more than 70 percent chance of making the right identification after five minutes of questioning.”
I think we’ve hit this milestone already, but we kind of cheated: in addition to just making computers smarter, we made human conversations dumber. Thus, if we wanted to stay true to Turing’s original criterion, we’d need to scale up our present-day requirements (say, to something like an 80% chance over 60 minutes) in order to keep up with inflation.
And of course the question then becomes: just what are our intents and/or purposes here?
I can propose one relatively straightforward criterion: “can this agent take the place of a human on our social network graph?” By this I don’t simply mean, “can we friend it on Facebook”; that is, when I say “social network”, I mean “the overall fabric of our society”. This network includes relationships such as “friend”, “employee”, “voter”, “possessor of certain rights”, etc.
I think this is a pretty good criterion, and I also think that it could be evaluated in purely functional terms. We shouldn’t need to read an agent’s genetic/computer/quantum/whatever code in order to determine whether it can participate in our society; we can just give it the Turing Test, instead. In a way, we already do this with humans, all the time—only the test is administered continuously, and sometimes we get the answers wrong.
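For concreteness, here is one way that criterion might be sketched. The node names, relationship labels, and helper functions are hypothetical examples of mine, not a proposed formalism; the “social network graph” is just a labeled graph, and the question is whether an agent node can carry the same kinds of edges a human node does.

```python
from collections import defaultdict

# Illustrative only: a toy labeled graph standing in for "the overall fabric of our society".
social_graph = defaultdict(list)

def add_relationship(a, relation, b):
    """Record a directed, labeled edge such as (alice, "employee", acme_corp)."""
    social_graph[a].append((relation, b))

# A human occupies many such positions at once:
add_relationship("alice", "friend", "bob")
add_relationship("alice", "employee", "acme_corp")
add_relationship("alice", "voter", "her_district")

# The criterion, loosely: could an artificial agent occupy the same kinds of positions?
add_relationship("agent_x", "friend", "bob")
add_relationship("agent_x", "employee", "acme_corp")

def roles(node):
    """The kinds of relationships a node participates in."""
    return {relation for relation, _ in social_graph[node]}

print(roles("alice") - roles("agent_x"))  # e.g. {'voter'}: positions not (yet) filled
```

Nothing in this structure cares what the agent is made of; only the edges it can actually sustain matter, which is what I mean by evaluating the criterion in purely functional terms.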