It is much, much easier for me to predict a text if I have seen a lot of similar texts beforehand than if I have never seen such a text and have to model the mind writing it, with its knowledge, intentions, and causal relations, to generate the result myself. I think prime numbers are a good illustration here. I can easily imagine a machine learning algorithm that has seen people list prime numbers in order: give it the first couple of primes and it will spit out the next couple, while having no idea what prime numbers are, let alone how to generate them or how to predict further primes.
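To make the contrast concrete, here is a minimal sketch (purely illustrative, not how any real model works): a memorizing "predictor" that can only continue sequences it has literally seen, next to a generator that actually knows what a prime is.

```python
def continue_from_memory(seen_sequences, prompt, n):
    """Memorizing 'predictor': continue the prompt only if it literally
    appeared somewhere in the training data; otherwise give up."""
    for seq in seen_sequences:
        for i in range(len(seq) - len(prompt) + 1):
            if seq[i:i + len(prompt)] == prompt:
                return seq[i + len(prompt):i + len(prompt) + n]
    return None  # never saw anything like this

def is_prime(k):
    """Actual understanding: test divisibility up to sqrt(k)."""
    if k < 2:
        return False
    return all(k % d for d in range(2, int(k ** 0.5) + 1))

def next_primes(prompt, n):
    """Generator that knows what a prime is: it can extend the sequence
    arbitrarily far, regardless of which primes it has seen listed."""
    out, k = [], prompt[-1] + 1
    while len(out) < n:
        if is_prime(k):
            out.append(k)
        k += 1
    return out

seen = [[2, 3, 5, 7, 11, 13, 17, 19]]
print(continue_from_memory(seen, [5, 7, 11], 2))   # [13, 17] -- looked up
print(continue_from_memory(seen, [101, 103], 2))   # None -- never seen this
print(next_primes([101, 103], 2))                  # [107, 109] -- computed
```

The first function produces the right continuation whenever the prompt happens to fall inside its training data, and nothing at all otherwise; the second works everywhere, because it encodes the actual rule.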
Have you asked ChatGPT to generate scientific papers? It is fascinating. They look like scientific papers. Except the experiments did not happen. The references lead into the void. The conclusions are nonsense.
It is similar with continuing screenplays. They are excellent at capturing the voice of characters, but when asked to e.g. write a new Game of Thrones ending, what they came up with, while surprising and involving the right characters and dragons and violence and moral greyness etc., was littered with plot holes. E.g. their ending included Cersei having had a hidden dragon under the Red Keep all along, which makes no sense at all.
That said, I am surprised at how well they are doing. They definitely show some grasp of causal reasoning, and do some things I would not expect a stupid predictor to manage. E.g. I am stunned that they can insert a character from one novel into another and make reasonable predictions, or follow along with some moral reasoning, or physics. There is some genuine intelligence there, not just stochastic parroting. E.g. you can write nonsense*, or Morse code, or remove all vowels from your words, and they will pick up on it and play along surprisingly quickly. (*Nonsense tends to actually be a lot less random than the humans producing it think.)
But you are mistaking the way you would predict the next piece of text for the only way to do it. This is closely related to the fact that you have not, in fact, read pretty much the whole internet. Humans are excellent at inferring a lot from very little. That is quite different from what ChatGPT is doing.
Also, ChatGPT is not actually going for the likeliest prediction. The developers tried that early on, found the results dull, and tweaked the sampling. To give continuations and responses that are interesting, inspiring etc., it actually deviates somewhat from the likeliest next tokens.
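A minimal sketch of the idea (ChatGPT's actual sampling setup is not public; temperature sampling shown here is just the standard technique, with made-up toy scores): instead of always taking the single likeliest token, you sample from the softmax distribution, with a temperature knob controlling how far you stray from the top choice.

```python
import math, random

def sample_next_token(logits, temperature=1.0):
    """Pick the next token from model scores ('logits').

    temperature -> 0 : always the single likeliest token (dull, repetitive)
    temperature = 1  : sample proportionally to the model's probabilities
    higher values    : stray further from the likeliest continuations
    """
    if temperature == 0:
        return max(logits, key=logits.get)  # greedy decoding: pure argmax
    # Softmax over temperature-scaled scores (shifted for numerical stability).
    scaled = {tok: score / temperature for tok, score in logits.items()}
    biggest = max(scaled.values())
    weights = {tok: math.exp(s - biggest) for tok, s in scaled.items()}
    # Draw one token with probability proportional to its weight.
    r = random.uniform(0, sum(weights.values()))
    for tok, w in weights.items():
        r -= w
        if r <= 0:
            return tok
    return tok  # fallthrough for floating-point edge cases

# Toy scores for four hypothetical candidate next tokens.
logits = {"the": 3.0, "a": 2.5, "dragon": 1.0, "void": 0.2}
print(sample_next_token(logits, temperature=0))    # always 'the'
print(sample_next_token(logits, temperature=0.8))  # mostly 'the' or 'a'
print(sample_next_token(logits, temperature=2.0))  # rarer tokens more often
```

At temperature zero this degenerates into always picking the argmax, which is exactly the "dull" behavior described above; a bit of temperature is what lets less-likely but more interesting continuations through.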