I think we’re saying the same thing? “The LLM being given less information [about the internal state of the actor it is imitating]” and “the LLM needs to maintain a probability distribution over possible internal states of the actor it is imitating” seem pretty equivalent.
I think we’re saying the same thing? “The LLM being given less information [about the internal state of the actor it is imitating]” and “the LLM needs to maintain a probability distribution over possible internal states of the actor it is imitating” seem pretty equivalent.