Some people have expressed that “GPT-2 doesn’t understand anything about language or reality. It’s just huge statistics.” In at least two senses, this is true.
My complaint is that GPT-2 isn’t able to reason with whatever “understanding” it has (as shown by FeepingCreature’s example, “We are in favor of recycling, because recycling doesn’t actually improve the environment, and that’s why we are against recycling.”), which seems like the most important thing we want in an AI that “understands language”.
Here are some things one can do with an abstract understanding of a concept:
• answer questions about it in one’s own words
• define it
• use it appropriately in a sentence
• provide details about it
• summarize it
I suggest that these are all tests that, in a human, correlate highly with being able to reason with a concept (which, again, is what we really want), but the correlation apparently breaks down when we’re dealing with AI, so the fact that an AI can pass these tests doesn’t mean as much as it would with a human.
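To make these tests concrete, here is a minimal sketch of how one might administer the “define it” and “use it appropriately in a sentence” tests by prompting GPT-2 and inspecting its completions. It assumes the HuggingFace transformers library and the public “gpt2” checkpoint, neither of which is specified above; judging whether a completion actually passes is the hard part.

```python
from transformers import pipeline

# Hypothetical probe: prompt GPT-2 so that a sensible continuation would
# amount to defining "recycling" or using it in a sentence.
generator = pipeline("text-generation", model="gpt2")

prompts = [
    "Recycling is the process of",          # "define it"
    "Here is a sentence about recycling:",  # "use it appropriately in a sentence"
]

for prompt in prompts:
    out = generator(prompt, max_length=40, do_sample=True, top_k=50,
                    num_return_sequences=1)
    print(out[0]["generated_text"])
    print("---")
```

Passing this kind of probe is cheap for a model trained on text full of definitions and example sentences, which is exactly why the correlation with reasoning ability breaks down.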
At this point we have to decide whether we want the word “understand” to mean “… and is able to reason with it”. I think we do, because if we say “GPT-2 understands language”, a lot of people will misinterpret that as meaning that GPT-2 can do verbal/symbolic reasoning. That seems worse than the opposite confusion, where we say “GPT-2 doesn’t understand language” and people misinterpret that as meaning that GPT-2 can’t give definitions or summaries.
One way we might choose to draw these distinctions is by using the technical vocabulary that teachers have developed (Bloom’s taxonomy). Reasoning about something is more than mere Comprehension: it would be called Application, Analysis, or Synthesis, depending on how the reasoning is used.
GPT-2 actually can do a little bit of deductive reasoning, but it is not very good at it.
So would you say that GPT-2 has Comprehension of “recycling” but not Comprehension of “in favor of” and “against”, because it doesn’t show even the basic understanding that the latter pair are opposites? I feel like even teachers’ technical vocabulary isn’t great here, because it was developed with typical human cognitive development in mind, and AIs aren’t “growing up” the same way.
Something like that, yes. I would say that the concept “recycling” is correctly linked to “the environment” by an “improves” relation, and that GPT-2 Comprehends “recycling” and “the environment” pretty well. But some texts say that the “improves” relation is positive, and some texts say it is negative (“doesn’t really improve”), so GPT-2 holds both contradictory beliefs about the relation simultaneously. Unlike humans, it doesn’t try to maintain consistency in what it expresses, and it doesn’t express uncertainty properly. So we see what looks like waffling between contradictory, strongly held opinions within the same sentence or paragraph.
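One way to make the “contradictory beliefs” picture concrete is to compare the total log-probability the model assigns to the two versions of the claim. This is a minimal sketch, assuming the HuggingFace transformers library and the public “gpt2” checkpoint (neither is named in the discussion above); if the two scores come out close, that is at least consistent with the model having absorbed both versions of the relation from its training text.

```python
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

# Sketch: score two contradictory claims about recycling under GPT-2.
tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def total_logprob(text: str) -> float:
    # Score a sentence by summing the log-probability of each token
    # given the tokens that precede it.
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        out = model(ids, labels=ids)
    # out.loss is the mean negative log-likelihood per predicted token,
    # so multiply by the number of predicted tokens and negate.
    return -out.loss.item() * (ids.shape[1] - 1)

for claim in ("Recycling improves the environment.",
              "Recycling does not actually improve the environment."):
    print(f"{claim}  ->  {total_logprob(claim):.2f}")
```

A human arguing a position would suppress one of these statements; GPT-2 just keeps sampling from whatever mixture its training text gave it, which is the waffling described above.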
As for whether this vocabulary is appropriate for discussing such an inhuman contraption, or whether it is too misleading to use, especially when talking to non-experts, I don’t really know. I’m trying to go beyond descriptions like “GPT-2 doesn’t understand what it is saying” and “GPT-2 understands what it is saying” to a more nuanced picture of which capabilities and internal conceptual structures are actually present and which are absent.