It is surely possible that mesa-optimizers are present in many LLMs, even relatively simple ones. But the question is: how powerful are they? How large is the state space they can search through, for example? The state space of a mesa-optimizer can’t be larger than the context window it is using to generate the answer, while the state space of the full LLM is much bigger: basically all its weights.