There already is a better Turing test, which is the Turing test as originally described.
To run the test as originally described, you need an active control: a human conversing with the judges at the same time and in the same manner, so that the judges’ decision is “Which is the human?”, not “Is this a human?” If the incompetent judges had also been talking simultaneously with a real 13-year-old from Ukraine, I have no doubt that Eugene Goostman would have bombed horribly.
This is not that much better. The article that The Most Human Human is based on talks about the difficulty of communicating in a 5-minute window, and about how little lay judges know about what AI involves. The author was consistently named a human by better applying the tactics of some of the most successful bots: controlling conversation flow and using humor.
It’s an improvement, but a winner would still win by “gaming” judges’ psychology.
Where’s that article? On the surface of it, that doesn’t seem like a problem, necessarily. And a good active control doesn’t have to be untrained; they could suggest questions to ask the computer, etc.
“Here’s something I’ll bet you the AI can’t do: Ask it to tell you a story about its favorite elementary-school teacher”
or whatever.