See this is another thing that is sorta false. We have the code around the production of those matrices. We know it’s “just” a compression system. It compressed all the text it was trained on, and the preferences of the human graders during RLHF, into a many dimensional function space we don’t understand.
Calling it a “blurry jpeg of the internet” is a perfectly apropos analogy.
See this is another thing that is sorta false. We have the code around the production of those matrices. We know it’s “just” a compression system. It compressed all the text it was trained on, and the preferences of the human graders during RLHF, into a many dimensional function space we don’t understand.
Calling it a “blurry jpeg of the internet” is a perfectly apropos analogy.
This is like saying we know how to read DNA because it’s “just” proteins.