Well, as you point out, it’s not that interesting a test, “scientifically” speaking.
But also they haven’t passed it and aren’t close.
The Turing test is adversarial. It assumes that the human judge is actively trying to distinguish the AI from another human, and is free to try anything that can be done through text.
I don’t think any of the current LLMs would pass with any (non-impaired) human judge who was motivated to put in a bit of effort. Not even if you used versions without any of the “safety” hobbling. Not even if the judge knew nothing about LLMs, prompting, jailbreaking, or whatever.
Nor do I think the “labs” could build an LLM that comes close to passing with the current state of the art. Not with the 4-level generation, not with the 5-level generation, and I suspect not with the 6-level generation either. There are too many weird human things you’d have to get right. And doing it with pure prompting is right out.
Even if they could, it is, as you suggested, an anti-goal for them, and it’s an expensive anti-goal. They’d be spending vast amounts of money to build something that they couldn’t use as a product, but that could be a huge PR liability.