True, but you can always wriggle out by saying that none of that counts as “truly understanding”. Yes, LLMs’ capabilities are impressive, but does drawing SVGs change the fact that somewhere inside the model all of these capabilities are represented by “mere” number relations?
Do LLMs “merely” repeat the training data? They do, but do they do it “merely”? There is no answer unless somebody gives a commonly accepted criterion of “mereness”.
The core issue, of course, is that since no one has a more or less formal and comprehensive definition of “truly understanding” that everyone agrees with, you can play with words however you like to rationalize whatever prior you had about LLMs.
Substituting one vaguely defined concept, “truly understanding”, with another vaguely defined concept, a “world model”, doesn’t help much. For example, does “this token is often followed by that token” constitute a world model? If not, why not? It is really primitive, but who said a world model has to be complex, or have anything to do with 3D space or theory of mind, to count as a world model? Isn’t our manifest image of reality also a shadow on the wall, since it lacks “true understanding” of the underlying quantum fields or superstrings or whatever, in the same way that a long list of correlations between tokens is a shadow of our world?
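To make that example concrete, here is a minimal sketch of the most primitive candidate for a “world model”: a bigram table that only records which token tends to follow which. The corpus and names are made up purely for illustration; real models learn something far richer, but the point is that even this is a (very crude) compressed statistic about the world that produced the text.

```python
from collections import Counter, defaultdict

# Toy corpus; stand-in for training data (purely illustrative).
corpus = "the cat sat on the mat the cat ate the fish".split()

# Count how often each token follows each other token.
follows = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follows[prev][nxt] += 1

# "This token is often followed by that token":
# after "the", the table says "cat" is the most likely continuation.
print(follows["the"].most_common(1))  # [('cat', 2)]
```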
The “stochastic parrot” argument has been armchair philosophizing from the start, so no amount of evidence like that will convince people who take it seriously. Even if an LLM-based AGI takes over the world, the last words of such a person are going to be “but that’s not true thinking”. And I’m not using that as a strawman: there’s nothing wrong with a priori reasoning as such, unless you do it wrong.
I think the best response to “stochastic parrot” is to ask three questions:
1. What is your criterion of “truly understanding”? Answer concretely, in terms of the structure or behavior of the model itself, and without circular definitions like “having a world model”, which is defined as “conscious experience”, which in turn is defined as “feeling the redness of red”, etc. Otherwise the whole argument becomes completely orthogonal to any reality at all.
2. Why do you think LLMs do not satisfy that criterion while the human brain does?
3. Why do you think it is relevant for any practical intents and purposes, for example for the question “will it kill you if you turn it on”?