If you Google “quines in Python” there are many examples, so I think the model learned about them prior to that. But all things considered, examples of quines are likely sparse in the overall corpus of code it was trained on, so it makes sense that pulling one off consistently required a somewhat larger model. I think it’s akin to the handling of arithmetic in GPT-3: it will very frequently fail to give correct answers to 4-digit multiplication. This is simply because it has not seen all the countless combinations of 4-digit numbers, and it does not really learn what multiplication is. If it did learn what multiplication is, it would be trivial to devote a small set of neurons to performing it. After all, a calculator that can multiply any two numbers can be coded in a very small space. GPT-4 is likely able to multiply numbers somewhat more consistently, but it likely still hasn’t invented an internal calculator either.
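To make the "very small space" point concrete, here is a minimal sketch of schoolbook long multiplication on digit strings. The function name `long_multiply` is my own and purely illustrative. It only shows how compact a general multiplication procedure is, and says nothing about what actually happens inside GPT-3 or GPT-4.

```python
def long_multiply(a: str, b: str) -> str:
    """Multiply two non-negative decimal integers given as digit strings."""
    # result[i] holds the digit for 10**i; an m-digit number times an
    # n-digit number has at most m + n digits.
    result = [0] * (len(a) + len(b))
    for i, da in enumerate(reversed(a)):
        carry = 0
        for j, db in enumerate(reversed(b)):
            total = result[i + j] + int(da) * int(db) + carry
            result[i + j] = total % 10
            carry = total // 10
        # Whatever is left over goes into the next, still untouched, position.
        result[i + len(b)] += carry
    digits = "".join(map(str, reversed(result))).lstrip("0")
    return digits or "0"


print(long_multiply("1234", "5678"))  # 7006652
```

The same dozen lines handle 4-digit or 400-digit inputs equally well, which is the contrast with a model that gets some products right and others wrong depending on what it has seen.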