I think this is an excellent description of GPT-like models. It both fits with my observations and clarifies my thinking. It also leads me to examine in a new light questions which have been on my mind recently:
What is the limit of power of simulation that our current architectures (with some iterative improvements) can achieve when scaled to greater power (via additional computation, improved datasets, etc)?
Is a Simulator model really what we want? Can we trust the outputs we get from it to help us with things like accelerating alignment research? What might failure modes look like?
I think this is an excellent description of GPT-like models. It both fits with my observations and clarifies my thinking. It also leads me to examine in a new light questions which have been on my mind recently:
What is the limit of power of simulation that our current architectures (with some iterative improvements) can achieve when scaled to greater power (via additional computation, improved datasets, etc)?
Is a Simulator model really what we want? Can we trust the outputs we get from it to help us with things like accelerating alignment research? What might failure modes look like?