We can discuss anything that exists, that might exist, that did exist, that could exist, and that could not exist. So no matter what form your predict-the-next-token language model takes, if it is trained over the entire corpus of the written word, the representations it forms will be pretty hard to understand, because the representations encode an entire understanding of the entire world.
Perhaps.
Imagine a huge number of very skilled programmers tried to manually hard-code a ChatGPT in Python.
Ask this pyGPT to play chess, and it will play chess. Look under the hood, and you see a chess engine programmed in. Ask it to solve algebra problems, and a symbolic algebra package is in there. All of it in neat, well-commented code.
Ask it to compose poetry, and you find some algorithm that checks whether two words rhyme. Some syllable counter. Etc.
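To give a feel for what such a part might look like, here is a rough sketch of a syllable counter and a rhyme check. Everything in it is my own assumption: the names `count_syllables`, `rhyme_key` and `rhymes` are made up, and the logic works on spelling alone, so it gets plenty of English words wrong. A real pyGPT would presumably use a pronunciation dictionary instead.

```python
import re

# Runs of vowel letters; "y" is treated as a vowel for simplicity.
VOWEL_GROUP = re.compile(r"[aeiouy]+")


def count_syllables(word: str) -> int:
    """Estimate syllables by counting vowel groups (spelling heuristic only)."""
    word = word.lower()
    count = len(VOWEL_GROUP.findall(word))
    # Crude rule: a trailing "e" as in "stone" is usually silent.
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def rhyme_key(word: str) -> str:
    """Return the spelling from the last sounded vowel group to the end."""
    word = word.lower()
    matches = list(VOWEL_GROUP.finditer(word))
    if not matches:
        return word
    last = matches[-1]
    # Skip a lone trailing "e" so "stone" keys on "one", not "e".
    if last.group() == "e" and last.end() == len(word) and len(matches) > 1:
        last = matches[-2]
    return word[last.start():]


def rhymes(a: str, b: str) -> bool:
    """Two different words rhyme if their spelling-based keys match."""
    return a.lower() != b.lower() and rhyme_key(a) == rhyme_key(b)
```

The point is not that this is good, only that every mistake it makes can be traced to a specific line you can read and fix.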
Rot13 is done with a hardcoded rot13 algorithm.
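For concreteness, a hardcoded rot13 could be as small as the sketch below. The function name `rot13` is just my choice for illustration; the only library calls are `str.maketrans` and `str.translate` from the standard library.

```python
import string


def rot13(text: str) -> str:
    """Rotate each ASCII letter 13 places; leave everything else unchanged."""
    lower = string.ascii_lowercase
    upper = string.ascii_uppercase
    table = str.maketrans(
        lower + upper,
        lower[13:] + lower[:13] + upper[13:] + upper[:13],
    )
    return text.translate(table)
```

Applying it twice gives back the original text, so the same function both encodes and decodes.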
Somewhere in the algorithm is a giant list of facts, containing “Penguins Live In Antarctica”. And if you change this fact to say “Penguins Live in Canada”, then the AI will believe this. (Or spot its inconsistency with other facts?)
And with one simple change, the AI believes this consistently. Penguins appear when this AI is asked for poems about Canada, and don’t appear in poems about Antarctica.
When asked about the native Canadian diet, it will speculate that this likely included penguin, but say that it doesn’t know of any documented examples of this.
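A toy version of that "one simple change" might look like the sketch below. `FactStore`, `animals_in` and `poem_subjects` are hypothetical names I am inventing here, and the moose entry is only there so the example has something to contrast with. The idea being illustrated is that every downstream part looks the fact up when it runs instead of keeping its own copy, so a single edit shows up everywhere.

```python
from dataclasses import dataclass, field


@dataclass
class FactStore:
    """Hypothetical stand-in for pyGPT's giant list of facts."""

    habitats: dict[str, str] = field(
        default_factory=lambda: {"penguin": "Antarctica", "moose": "Canada"}
    )

    def animals_in(self, place: str) -> list[str]:
        """All animals currently recorded as living in `place`."""
        return sorted(a for a, p in self.habitats.items() if p == place)


def poem_subjects(facts: FactStore, place: str) -> list[str]:
    """Animals a poem about `place` may mention, looked up at call time."""
    return facts.animals_in(place)


facts = FactStore()
before = poem_subjects(facts, "Canada")

# The one simple change: edit a single entry in the fact list.
facts.habitats["penguin"] = "Canada"

after = poem_subjects(facts, "Canada")
antarctica = poem_subjects(facts, "Antarctica")
```

After the edit, penguins are available for poems about Canada and gone from poems about Antarctica, without touching the poetry code at all.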
Can you build something with ChatGPT-level performance entirely out of human-comprehensible programmatic parts?
Obviously having humans program these parts directly would be slow. (We are still talking about a lot of code.) But what if some algorithm could generate that code?