OK, I think we are talking past each other, hence the accusation of a straw man. When you say “concepts” you are referring to the predictive models, both learned knowledge and dynamic state, which DO exist inside an instance of GPT-2. This dynamic state is initialized by the input, at which point it encodes, to some degree, the content of that input. You are calling this “understanding.”
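To make those two ingredients concrete, here is a rough sketch using the Hugging Face transformers library (assuming torch and transformers are installed; the prompt string is arbitrary):

```python
from transformers import GPT2Tokenizer, GPT2Model

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2Model.from_pretrained("gpt2")

# Learned knowledge: the weights, fixed once training ends.
n_params = sum(p.numel() for p in model.parameters())

# Dynamic state: per-input activations that encode, to some degree,
# the content of the prompt.
inputs = tokenizer("The cat sat on the", return_tensors="pt")
outputs = model(**inputs, output_hidden_states=True)
hidden = outputs.hidden_states[-1]  # shape: (1, seq_len, 768)
```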
However, when I say “concept modeling” I mean the ability to reason about this at a meta-level: not just to *have* a belief that is useful in predicting the next token in a sequence, but to understand *why* you have that belief, and to use that knowledge to inform your actions. These are ‘lifted’ beliefs, in the terminology of type theory, or quotations in functional programming. So to equate belief (predictive capability) with belief-about-belief (understanding of predictive capability) is a type error from my perspective, and does not compute.
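A toy sketch of that type error, with hypothetical Belief/Lifted types (nothing here is from a real library; a type checker like mypy would reject the last line if uncommented):

```python
from dataclasses import dataclass
from typing import Generic, TypeVar

T = TypeVar("T")

@dataclass
class Belief(Generic[T]):
    content: T  # first-order predictive state, usable for next-token prediction

@dataclass
class Lifted(Generic[T]):
    inner: Belief[T]  # the quoted belief itself
    why: str          # why the belief is held: a belief-about-belief

def predict_next_token(b: Belief[str]) -> str:
    return b.content  # consumes the belief without knowing why it is held

b = Belief("the cat sat on the mat")
lb = Lifted(b, "high co-occurrence in the training corpus")

predict_next_token(b)     # fine: prediction consumes first-order beliefs
# predict_next_token(lb)  # type error: Lifted[str] is not Belief[str]
```

Collapsing Lifted[T] into Belief[T] is exactly the conflation I am objecting to.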
GPT-2 has predictive capabilities. It does not instantiate a conceptual understanding of its predictive capabilities. It has no self-awareness, which I see as a prerequisite for “understanding.”
Yeah, you’re right. It seems like we both have a similar picture of what GPT-2 can and can’t do, and are just using the word “understand” differently.