The “hidden tokens” approach seems to be pretty much what (some of?) our somewhat-LLM-like brains do: if we have something complicated to think about, or we don’t want to share every detail of our thinking with those around us, we produce something resembling language or imagery internally but don’t say it out loud.
My understanding is that people vary more than one might think in exactly what this internal thinking-trace looks like—e.g., some people are more verbal, some less. I’m pretty verbal myself and don’t know what the experience of thinking through, say, how to prove a theorem or implement an algorithm is like for less verbal people. Maybe it’s not so much like “hidden tokens”. But my guess is that it is still somewhat hidden-token-like, under the hood. (Maybe something like an “internalized” version of the output of multimodal models that produce both text and images? But I don’t know much about how those work.)
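(For concreteness, here's a rough sketch of what I mean by "hidden tokens" on the model side. Everything here is made up for illustration—the function names and the `<think>`/`</think>` markers are hypothetical, not any particular model's API—but the basic shape is: the model generates a full reasoning trace, and a wrapper simply hides part of it from whoever's reading.)

```python
import re

def generate_with_hidden_tokens(model_step, prompt, max_tokens=256):
    """Run a toy decoding loop, then strip out the hidden 'thinking' span.

    `model_step` is a stand-in for whatever produces the next token; it's
    assumed (for illustration) to return one string token at a time, or
    None when generation is done.
    """
    tokens = []
    for _ in range(max_tokens):
        tok = model_step(prompt, tokens)
        if tok is None:
            break
        tokens.append(tok)

    full_trace = "".join(tokens)  # everything the model "thought"
    # Only the text outside the (hypothetical) <think> markers is shown.
    visible = re.sub(r"<think>.*?</think>", "", full_trace, flags=re.DOTALL)
    return visible.strip(), full_trace


# Toy stand-in for a model: it "thinks" first, then answers out loud.
def fake_model_step(prompt, so_far, _script=iter(
    ["<think>", "37 is prime; ", "37 * 3 = 111", "</think>", "The answer is 111."]
)):
    return next(_script, None)

answer, trace = generate_with_hidden_tokens(fake_model_step, "What is 37 * 3?")
print(answer)  # -> "The answer is 111."
# `trace` still contains the hidden reasoning—the model's "inner monologue".
```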