We’re all more or less doing that when we speak or write, though there are times when we may set out to be deliberately surprising – but we can set such complications aside.
If you think of the LLM as a complex dynamical system, then a generated text is a trajectory running along a valley in the system’s attractor landscape.
The real argument here is that you can construct simple dynamical systems (simple in the sense that the governing equations are short) that nevertheless exhibit complex behavior. The Lorenz system is the classic example, though even simpler systems, such as the one-dimensional logistic map, already show chaotic, ergodic behavior.
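To make this concrete, here is a minimal sketch in Python (assuming NumPy is available; the step size and run length are arbitrary choices for illustration). The governing equations are three short polynomial expressions, yet two trajectories whose starting points differ by one part in a billion diverge rapidly, while both remain inside the same bounded region of state space.

```python
import numpy as np

SIGMA, RHO, BETA = 10.0, 28.0, 8.0 / 3.0  # classic chaotic parameters

def lorenz(state):
    """Right-hand side of the Lorenz equations: three short polynomial terms."""
    x, y, z = state
    return np.array([SIGMA * (y - x), x * (RHO - z) - y, x * y - BETA * z])

def rk4_step(state, dt=0.01):
    """One fourth-order Runge-Kutta integration step."""
    k1 = lorenz(state)
    k2 = lorenz(state + 0.5 * dt * k1)
    k3 = lorenz(state + 0.5 * dt * k2)
    k4 = lorenz(state + dt * k3)
    return state + (dt / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)

def trajectory(state0, n_steps=5000):
    """Integrate forward from state0 and return the whole path."""
    out = np.empty((n_steps + 1, 3))
    out[0] = state0
    for i in range(n_steps):
        out[i + 1] = rk4_step(out[i])
    return out

# Two trajectories whose initial conditions differ by one part in a billion...
a = trajectory(np.array([1.0, 1.0, 1.0]))
b = trajectory(np.array([1.0, 1.0, 1.0 + 1e-9]))

# ...diverge (sensitivity to initial conditions), yet both stay inside the
# same bounded region of state space (the attractor's support).
print("final separation:", np.linalg.norm(a[-1] - b[-1]))
print("bounding box:", a.min(axis=0).round(1), a.max(axis=0).round(1))
```

Running this, the final separation is on the order of the attractor’s diameter – the tiny initial perturbation has been amplified beyond recovery – yet the bounding box stays roughly within |x|, |y| below a few tens and z between 0 and about 50. Unpredictable, but far from unconstrained.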
The train/test framework is not helpful for understanding this. The dynamical-systems view is more useful (though beware that it starts to get close to the term “emergent behavior”, which we must be wary of). The interesting thing about chaos is that, while the behavior is not perfectly predictable and may even be surprising, it obeys well-defined properties and mathematical constraints. Not everything is possible: the Lorenz attractor, for instance, occupies a bounded region of state space. In the same spirit, we need to take a step back and realize that the kind of “real AI” that people are afraid of would require causal modeling, and a causal model cannot, in general, be identified from correlational data alone. If the model were able to start making interventions in the world, then we would need to consider the possibility that it could construct a causal model. But that goes beyond predicting the next word, which is the scope of this article.
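To illustrate why correlation alone cannot pin down causal structure, here is a hedged sketch (plain Python with NumPy; the two models and their parameters are invented for illustration). Both models are constructed to produce exactly the same joint distribution over (X, Y), so no amount of observational data can tell them apart; a single intervention does so immediately.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Model A: X causes Y.  X ~ N(0, 1), Y = X + noise.
x_a = rng.normal(size=n)
y_a = x_a + rng.normal(size=n)

# Model B: Y causes X, with parameters chosen so that the joint
# distribution of (X, Y) matches Model A exactly.
y_b = rng.normal(scale=np.sqrt(2), size=n)
x_b = 0.5 * y_b + rng.normal(scale=np.sqrt(0.5), size=n)

# Observationally, the two models are indistinguishable:
print(np.cov(x_a, y_a).round(2))  # ~[[1, 1], [1, 2]]
print(np.cov(x_b, y_b).round(2))  # ~[[1, 1], [1, 2]]

# Only an intervention separates them: set X := 2 and observe Y.
y_do_a = 2 + rng.normal(size=n)                # Model A: Y responds to do(X=2)
y_do_b = rng.normal(scale=np.sqrt(2), size=n)  # Model B: Y is unaffected
print(y_do_a.mean().round(2), y_do_b.mean().round(2))  # ~2.0 vs ~0.0
```

This is a standard point in causal inference: observationally equivalent models can disagree completely about interventions, which is why a system trained purely to predict observed text does not automatically acquire a causal model of the world.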