The question is: how far can we get with in-context learning? If we filled Gemini’s 10 million tokens with Sudoku rules and examples, showing where it went wrong each time, would it generalize? I’m not sure, but I think it’s possible.
It certainly wouldn’t generalize to, e.g., Hidouku.
I agree that filling a context window with worked sudoku examples wouldn’t help for solving hidouku. But there is a common element to the two games. Both look like math, but neither is about numbers, except in that there’s an ordered sequence. The sequence of items could just as easily be an alphabetically ordered set of words. Both are much more about geometry, or topology, or graph theory: how a set of points is connected. I would not be surprised to learn that there is a set of tokens, containing no examples of either game, which, combined with a checker (like your link has) that points out when a mistake has been made, enables solving a wide range of similar games.
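To make the "not about numbers" point concrete, here is a minimal sketch of the kind of checker I have in mind, in Python. It assumes a 4x4 grid to keep things short, and the names `units` and `find_mistakes` are made up for this example; they are not taken from the linked checker. It only ever compares symbols for equality, so letters work exactly as well as digits.

```python
from itertools import product

def units(n=2):
    """Return the rows, columns and boxes of an (n*n) x (n*n) grid as lists of (row, col) cells."""
    size = n * n
    rows = [[(r, c) for c in range(size)] for r in range(size)]
    cols = [[(r, c) for r in range(size)] for c in range(size)]
    boxes = [
        [(br * n + r, bc * n + c) for r, c in product(range(n), repeat=2)]
        for br, bc in product(range(n), repeat=2)
    ]
    return rows + cols + boxes

def find_mistakes(grid, n=2, empty=None):
    """List (first_cell, second_cell, symbol) for each repeated symbol found in a unit."""
    mistakes = []
    for unit in units(n):
        seen = {}  # symbol -> first cell in this unit where it appeared
        for cell in unit:
            symbol = grid[cell[0]][cell[1]]
            if symbol == empty:
                continue
            if symbol in seen:
                mistakes.append((seen[symbol], cell, symbol))
            else:
                seen[symbol] = cell
    return mistakes

# Letters instead of digits: the checker never does arithmetic on the symbols.
grid = [
    ["a", "b", "c", "d"],
    ["c", "d", "a", "b"],
    ["b", "a", "d", "c"],
    ["d", "c", "b", "a"],
]
print(find_mistakes(grid))  # a valid grid, so nothing is reported

grid[0][0] = "b"            # introduce a deliberate mistake
for first, second, symbol in find_mistakes(grid):
    print(f"{symbol!r} appears at both {first} and {second}")
```

The same pair of cells can be reported more than once (for example, once for the row and once for the box), which seems fine for feedback purposes: each report names a different constraint that was broken.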
I think one of the things humans do better than current LLMs is that, as we learn a new task, we vary what counts as a token and how we nest tokens. How do we chunk things? In sudoku, each box is a chunk, each row and each column is a chunk, the board is a chunk, “sudoku” is a chunk, “checking an answer” is a chunk, “playing a game” is a chunk, and there are probably lots of others I’m ignoring. I don’t think just prompting an LLM with the full text of “How to Solve It” in its context window would get us to a solution, but at some level I do think it’s possible to make explicit, in words and diagrams, what it is humans do to solve things, in a way legible to it. I think it largely resembles repeatedly telescoping in and out to lower and higher abstractions, applying different concepts and contexts, locally sanity-checking ourselves, correcting locally obvious insanity, and continuing until we hit some sort of reflective consistency. Different humans have different limits on what contexts they can successfully do this in.
Absolutely. I don’t think it’s impossible to build such a system. In fact, I think a transformer is probably about 90% there. Need to add trial and error, some kind of long-term memory/fine-tuning and a handful of default heuristics. Scale will help too, but no amount of scale alone will get us there.
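For what it's worth, here is a rough sketch of what "trial and error plus a checker" looks like in the classical sense. This is plain backtracking search on a 4x4 grid, not a claim about how a transformer would do it, and `conflicts` and `solve` are hypothetical names chosen for this example. The only "default heuristic" here is the crudest one possible: fill the first empty cell, trying symbols in order.

```python
def conflicts(grid, r, c, symbol, n=2):
    """Checker: True if placing symbol at (r, c) would repeat it in the row, column or box."""
    size = n * n
    if any(grid[r][j] == symbol for j in range(size)):
        return True
    if any(grid[i][c] == symbol for i in range(size)):
        return True
    br, bc = r - r % n, c - c % n  # top-left corner of the enclosing box
    return any(grid[br + i][bc + j] == symbol for i in range(n) for j in range(n))

def solve(grid, symbols, n=2):
    """Fill empty (None) cells in place by trial and error; return True on success."""
    size = n * n
    for r in range(size):
        for c in range(size):
            if grid[r][c] is None:
                for symbol in symbols:
                    if not conflicts(grid, r, c, symbol, n):
                        grid[r][c] = symbol        # try it
                        if solve(grid, symbols, n):
                            return True
                        grid[r][c] = None          # it led nowhere, so undo
                return False                       # no symbol fits this cell
    return True                                    # no empty cells left

puzzle = [
    ["a", None, None, "d"],
    [None, "d", "a", None],
    [None, "a", "d", None],
    ["d", None, None, "a"],
]
if solve(puzzle, "abcd"):
    for row in puzzle:
        print(" ".join(row))
```

Everything the search "knows" about the game lives in the checker; swap in a different `conflicts` function and the same loop applies to a different constraint puzzle.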