I see a loose analogy between the capabilities of such models and those of Turing machines and other Turing-complete systems.
These models might not be best suited to some tasks, but with enough complexity and learning, they might model things they were never designed or expected to model (likely in a strange, obscure way).
Similarly, you can, even if not very efficiently, implement any algorithm in any Turing-complete system (including bizarre ones like an abstract pure Turing machine or Minecraft redstone).
In both cases, it is clear to me that a system can have relatively simple rules and internal workings, yet that does not mean the only thing it can do is compute or model something resembling those rules.
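To make the "simple rules, unrelated-looking behavior" point concrete, here is a minimal sketch of a Turing machine simulator in Python. The names `run` and `INCREMENT_RULES` and the particular rule table are my own illustrative choices, not anything standard. The rule table only says "read a symbol, write a symbol, move the head", yet the machine as a whole adds 1 to a binary number.

```python
from collections import defaultdict

BLANK = "_"

# Hypothetical rule table: (state, symbol read) -> (symbol to write, head move, next state).
# Nothing in these rules mentions arithmetic, but together they increment a binary number.
INCREMENT_RULES = {
    ("right", "0"): ("0", +1, "right"),     # scan right to the end of the input
    ("right", "1"): ("1", +1, "right"),
    ("right", BLANK): (BLANK, -1, "carry"),  # step back onto the last digit
    ("carry", "1"): ("0", -1, "carry"),      # 1 + carry = 0, keep carrying
    ("carry", "0"): ("1", 0, "halt"),        # 0 + carry = 1, done
    ("carry", BLANK): ("1", 0, "halt"),      # ran off the left edge: new leading 1
}


def run(rules, tape_input, start="right", halt="halt", max_steps=10_000):
    """Run a single-tape machine and return the non-blank tape contents."""
    # Sparse tape: any cell that was never written reads as BLANK.
    tape = defaultdict(lambda: BLANK, enumerate(tape_input))
    head, state, steps = 0, start, 0
    while state != halt:
        if steps >= max_steps:
            # We cannot know in general whether the machine would ever halt,
            # so this sketch simply gives up after a fixed step budget.
            raise RuntimeError("step budget exhausted")
        write, move, state = rules[(state, tape[head])]
        tape[head] = write
        head += move
        steps += 1
    cells = [tape[i] for i in range(min(tape), max(tape) + 1)]
    return "".join(cells).strip(BLANK)


print(run(INCREMENT_RULES, "1011"))  # 1100
print(run(INCREMENT_RULES, "111"))   # 1000
```

Swapping in a different rule table changes what the same few lines of simulator compute, which is the sense in which the rules themselves tell you little about what the system can end up doing.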
There’s an asterisk to the idea that Turing machines can implement truly any algorithm: a Turing machine obviously can’t solve the halting problem or decide which statements are theorems of PA (it can enumerate PA’s theorems, but not all true statements of arithmetic), and there are hypothetical models of computation stronger than Turing machines, but the properties those would require are inaccessible for our purposes, so the Turing machine analogy still works for LLMs.