They haven’t proven any theorems that anyone cares about. They haven’t written anything that anyone will want to read in ten years (or even one year). Despite apparently memorizing more information than any human could ever dream of, they have made precisely zero novel connections or insights in any area of science[3].
An anecdote I heard through the grapevine: some chemist was trying to synthesize some chemical. He couldn’t get some step to work, and tried for a while to find solutions on the internet. He eventually asked an LLM. The LLM gave a very plausible causal story about what was going wrong and suggested a modified setup which, in fact, fixed the problem. The idea seemed so humdrum that the chemist thought, surely, the idea was actually out there in the world and the LLM had scraped it from the internet. However, the chemist continued searching and, even with the details in hand, could not find anyone talking about this anywhere. Weak conclusion: the LLM actually came up with this idea by learning a good-enough causal model that generalizes not-very-closely-related chemistry ideas in its training set.
Weak conclusion: there are more than precisely zero novel scientific insights in LLMs.
I think non-reasoning models such as 4o and Claude are better understood as doing induction with a “circuit prior”, which is going to be significantly different from the Solomonoff prior (longer-running programs require larger circuits, which get penalized).
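A rough way to formalize the contrast (my notation; the specific form of the circuit prior is an illustrative assumption, not a standard definition):

$$M(x)\;=\sum_{p\,:\,U(p)\,=\,x*} 2^{-\ell(p)} \qquad\text{vs.}\qquad P_{\text{circ}}(x)\;\propto\sum_{C\,:\,C\ \text{outputs}\ x} 2^{-\,\mathrm{size}(C)},$$

where $U$ is a universal machine, $\ell(p)$ is the length of program $p$, and $\mathrm{size}(C)$ is the number of gates in circuit $C$. A program that runs for $T$ steps unrolls into a circuit of size roughly $O(T \log T)$, so a short but long-running program is cheap under $M$ and expensive under $P_{\text{circ}}$.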
Reasoning models such as o1 and r1 are in some sense Turing-complete, and so much more akin to Solomonoff induction. Of course, the RL used in such models does not train on the prediction task the way Solomonoff induction does.
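To make “not training on the prediction task” concrete (schematic objectives only, not a description of any lab’s actual recipe): Solomonoff induction predicts the next symbol from its posterior mixture over programs, while RL on reasoning traces maximizes an expected task reward over sampled chains of thought,

$$M(x_{n+1}\mid x_{1:n})\;=\;\frac{M(x_{1:n}x_{n+1})}{M(x_{1:n})} \qquad\text{vs.}\qquad J(\theta)\;=\;\mathbb{E}_{c\,\sim\,\pi_\theta(\cdot\mid q)}\big[R(q,c)\big],$$

where $q$ is a prompt, $c$ is a sampled chain of thought plus answer, and $R$ is a reward such as a verifier score. Nothing in $J$ forces its optimum to coincide with good sequence prediction, which is the sense in which the analogy to Solomonoff induction is loose.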