No computer is going to fail the Turing test because it can’t compose a piece of music. The questioner might ask it to do that, but if it replies “sorry, I have no ear for music” it doesn’t fail—the questioner then picks something else. If the computer can’t do that either, and if the questioner keeps picking such things, he may eventually get to the point where he says “Okay, I know there are some people who have no ear for music, but there aren’t many people who have no ear for music and can’t make a video and can’t paint a picture and….” He will then fail the computer because although it is plausible that a human can’t do each individual item, it’s not very plausible that the human can’t do anything on the list. No specific item is a requirement to be a human, and no specific inability marks the subject as not being human.
if the questioner keeps picking such things, he may eventually get to the point where he says “Okay, I know there are some people who have no ear for music, but there aren’t many people who have no ear for music and can’t make a video and can’t paint a picture and….” He will then fail the computer because although it is plausible that a human can’t do each individual item, it’s not very plausible that the human can’t do anything on the list.
But if all the things in that conjunction are creative endeavors, why do you think a human not being able to do any of them is implausible? I have no ear for music, don’t have video-creation skills, can’t paint a picture, can’t write a poem, etc. There are many similar people, whose talents lie elsewhere, or perhaps who are just generally low on the scale of human talent.
If you judge such people to be computers, then your success rate as a judge in a Turing test will be unimpressive.
If the questioner is competent, he won’t pick a list where it’s plausible that some human can’t do anything on the list. If he does pick such a list, he’s performing the questioning incompetently. I think implicit in the idea of the test is that we have to assume some level of competency on the part of the questioner; there are many more ways an incompetent questioner could fail to detect humans than just asking for a bad set of creative endeavors.
(I think the test also assumes most people are competent enough to administer the test, which also implies that the above scenario won’t happen. I think most people know that there are non-creative humans and won’t give a test that consists solely of asking for creative endeavors—the things they ask the subject to do will include both creative things and non-creative but human-specific things.)
I think this entire thread is caused by, and demonstrates, the fact that we increasingly have no idea what the heck we’re even trying to measure or detect with the Turing test (is it consciousness? human-level intelligence? general intelligence? what?) …
… which is entirely unsurprising, since as I say in another comment on this post, the Turing test isn’t meant to measure or detect anything.
To use it as a measure of something or a detector of something is to miss the point. This thread, where we go back and forth arguing about criteria, pretty much demonstrates said fact.
I think the Turing Test clearly does measure something: it measures how closely an agent’s behavior resembles that of a human. The real argument is not “what does the test measure?” but “is measuring behavioral similarity enough for all intents and purposes, or do we need more?”
If we prefer to be pedantic, we must go further than that: the test measures whether an agent can fool some particular interrogator into having a no-better-than-chance probability of correctly discerning whether said agent is a human (in the case where the agent in question is not, in fact, a human).
How well that particular factor correlates with actual behavioral similarity to a human (and how would we define and measure such similarity? along what dimensions? operationalized how?) is an open question. It might, it might not. It might take advantage of some particular biases of the interrogator (e.g. pareidolia, the tendency to anthropomorphize aspects of the inanimate world, etc.) to make him/her see behavioral similarity where little exists (cf. Eliza and other chatbots).
(Remember, also, that Turing thought that a meaningful milestone would be for a computer to “play the imitation game so well that an average interrogator will not have more than 70 percent chance of making the right identification after five minutes of questioning”! [Emphasis mine.])
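To make the gap between “no better than chance” and Turing’s 70%-in-five-minutes milestone concrete, here is a minimal sketch (the function name, trial count, and probabilities are my own illustrative assumptions, not anything from Turing or this thread) that treats each interrogation session as an independent coin-flip-style trial and checks the interrogator’s observed accuracy against both thresholds:

```python
# Hypothetical sketch; all names and numbers here are illustrative assumptions.
import random

def simulate_trials(n_trials: int, p_correct: float, seed: int = 0) -> float:
    """Return the observed fraction of correct human/machine identifications,
    treating each five-minute session as an independent trial in which the
    interrogator identifies correctly with probability p_correct."""
    rng = random.Random(seed)
    correct = sum(rng.random() < p_correct for _ in range(n_trials))
    return correct / n_trials

CHANCE_LEVEL = 0.50      # an interrogator who cannot discern at all is right half the time
TURING_MILESTONE = 0.70  # "not ... more than 70 percent chance ... after five minutes"

observed = simulate_trials(n_trials=200, p_correct=0.65)
print(f"observed identification accuracy: {observed:.2f}")
print("meets Turing's 1950 milestone (accuracy at or below 70%):", observed <= TURING_MILESTONE)
print("indistinguishable at chance level (accuracy at or below 50%):", observed <= CHANCE_LEVEL)
```

Of course, this glosses over everything that actually matters (how the sessions are run, how competent the interrogator is, how long the questioning lasts), which is exactly the open question above.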
I do partly agree with this:
The real argument is not “what does the test measure?” but “is measuring behavioral similarity enough for all intents and purposes, or do we need more?”
And of course the question then becomes: just what are our intents and/or purposes here?
“play the imitation game so well that an average interrogator will not have more than 70 percent chance of making the right identification after five minutes of questioning.”
I think we’ve hit this milestone already, but we kind of cheated: in addition to just making computers smarter, we made human conversations dumber. Thus, if we wanted to stay true to Turing’s original criterion, we’d need to scale up our present-day requirements (say, to something like 80% chance over 60 minutes), in order to keep up with inflation.
And of course the question then becomes: just what are our intents and/or purposes here?
I can propose one relatively straightforward criterion: “can this agent take the place of a human on our social network graph?” By this I don’t simply mean “can we friend it on Facebook”; that is, when I say “social network”, I mean “the overall fabric of our society”. This network includes relationships such as “friend”, “employee”, “voter”, “possessor of certain rights”, etc.
I think this is a pretty good criterion, and I also think that it could be evaluated in purely functional terms. We shouldn’t need to read an agent’s genetic/computer/quantum/whatever code in order to determine whether it can participate in our society; we can just give it the Turing Test, instead. In a way, we already do this with humans, all the time—only the test is administered continuously, and sometimes we get the answers wrong.
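For whatever it’s worth, here is a minimal sketch of what I mean, purely as an illustration (the class, the relationship labels, and the placeholder judge are all made up for this comment): the social graph carries labeled relationships like “friend” and “employee”, and an agent is admitted to it based only on how it behaves under questioning, never by inspecting its code:

```python
# Hypothetical sketch of the "social graph" criterion; every name is illustrative.
from collections import defaultdict
from typing import Callable

class SocialGraph:
    def __init__(self) -> None:
        # maps an agent to its set of (relationship, other_agent) edges
        self.edges: dict[str, set[tuple[str, str]]] = defaultdict(set)

    def add_relationship(self, a: str, relation: str, b: str) -> None:
        self.edges[a].add((relation, b))
        self.edges[b].add((relation, a))

def can_join(agent_name: str, behavioral_test: Callable[[str], bool]) -> bool:
    """Admit an agent based only on observed behavior -- a stand-in for a
    continuously administered Turing Test, with no look at its internals."""
    return behavioral_test(agent_name)

graph = SocialGraph()
if can_join("candidate-42", behavioral_test=lambda name: True):  # placeholder judge
    graph.add_relationship("candidate-42", "employee", "some-human")
    graph.add_relationship("candidate-42", "friend", "another-human")
print(graph.edges["candidate-42"])
```

The interesting part is hidden entirely inside behavioral_test, of course; the point is only that the admission decision consumes behavior, not implementation.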
Agreed. Pretty much the only creative endeavour I’m capable of is writing computer code; and it’s not even entirely clear whether computer programming qualifies as “creative” in the first place. I’m a human, though, not an AI. I guess you’d have to take my word for it.