I’m not sure why people would think LLMs understand their own output; we know they’re not up to spotting sometimes human-obvious inconsistencies in it (as soon as they are, things will start moving very quickly).
LLMs can sometimes spot inconsistencies in their own outputs. For example, here I ask ChatGPT to produce a list of three notable individuals who share a birth date and year, and here I ask it to judge the correctness of that response; it is able to tell that the response was inaccurate.
It’s certainly not perfect or foolproof, but it’s not something they’re strictly incapable of either.
Although in fairness you would not be wrong if you said “LLMs can sometimes spot human-obvious inconsistencies in their outputs, but also things are currently moving very quickly”.
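For what it’s worth, the generate-then-judge check described above is easy to reproduce programmatically. Here’s a minimal sketch, assuming the official OpenAI Python client and a placeholder model name (both are assumptions, not anything the original commenter specified):

```python
# Minimal sketch of the generate-then-judge check described above.
# Assumes the official OpenAI Python client; the model name is a placeholder.
from openai import OpenAI

client = OpenAI()
MODEL = "gpt-4o"  # placeholder; any chat model works for the illustration

def ask(prompt: str) -> str:
    """Send a single-turn prompt and return the model's reply text."""
    response = client.chat.completions.create(
        model=MODEL,
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

# Step 1: generate an answer that is likely to contain a factual slip.
question = "Name three notable individuals who share the same birth date and year."
answer = ask(question)

# Step 2: in a fresh turn, ask the model to judge its own answer.
critique = ask(
    f"Question: {question}\n\nAnswer: {answer}\n\n"
    "Is this answer factually accurate? Point out any inconsistencies."
)

print(answer)
print(critique)
```

Running the judging step as a separate, fresh turn matters: the model is noticeably better at flagging errors when it isn’t conditioned on having just asserted them.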