I’ve now made two posts about LLMs and ‘general reasoning’, but used a fairly handwavy definition of that term. I don’t yet have a definition I feel fully good about, but my current take is something like:
- The ability to do deduction, induction, and abduction
- in a careful, step by step way, without many errors that a better reasoner could avoid,
- including in new domains; and
- the ability to use all of that to build a self-consistent internal model of the domain under consideration.
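For concreteness, here is a toy sketch of the three inference modes in the first bullet (my own illustrative examples, not part of the definition itself):

```python
# Toy contrast of deduction, induction, and abduction (illustrative only).

# Deduction: the conclusion follows necessarily from the premises.
premises = ["All humans are mortal", "Socrates is a human"]
deduced = "Socrates is mortal"

# Induction: generalize from specific observations; plausible but not guaranteed.
observations = ["swan 1 is white", "swan 2 is white", "swan 3 is white"]
induced = "All swans are white"   # a single black swan would falsify this

# Abduction: infer the most plausible explanation for an observation.
observation = "The grass is wet"
explanations = ["It rained overnight", "The sprinkler ran"]
abduced = explanations[0]         # best guess given background knowledge

for label, claim in [("deduction", deduced), ("induction", induced), ("abduction", abduced)]:
    print(f"{label}: {claim}")
```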
What am I missing? Where does this definition fall short?
My current top picks for discussion of general reasoning in AI are:
https://arxiv.org/abs/2409.05513 (the Ord piece on hyperpolation)
https://m.youtube.com/watch?v=JTU8Ha4Jyfc (the Chollet interview)
The Ord piece is really intriguing, although I’m not sure I’m entirely convinced that it’s a useful framing.
Some of his examples (eg cosine-ish wave to ripple) rely on the fundamental symmetry between spatial dimensions, which wouldn’t apply to many kinds of hyperpolation.
The video frame construction seems more like extrapolation using an existing knowledge base about how frames evolve over time (eg how ducks move in the water).
Given an infinite number of possible additional dimensions, it’s not at all clear how a NN could choose a particular one to try to hyperpolate into.
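To make the symmetry point concrete, here is roughly the kind of extension I have in mind for the cosine-to-ripple example (my own reconstruction; the exact form in the paper may differ):

$$
f(x) = \cos(x) \;\longrightarrow\; g(x, y) = \cos\!\left(\sqrt{x^2 + y^2}\right), \qquad g(x, 0) = \cos(|x|) = \cos(x).
$$

This extension only looks canonical because the new dimension $y$ is interchangeable with $x$ (the ripple is radially symmetric); without that kind of symmetry, infinitely many functions $g$ restrict to the same $f$, which is what makes the choice of dimension seem underdetermined to me.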
It’s a fascinating idea, though, and one that’ll definitely stick with me as a possible framing. Thanks!
With respect to Chollet’s definition (the youtube link):
I agree with many of Chollet’s points, and the third and fourth items in my list are intended to get at those.
I do find Chollet a bit frustrating, because he seems somewhat inconsistent about what he’s claiming. Sometimes he seems to be saying that LLMs are fundamentally incapable of handling real novelty and that we need something very new and different. Other times he frames it as a matter of degree: LLMs are doing the right things but are sample-inefficient and don’t have a good way to incorporate new information. I imagine he has a single coherent view internally and just isn’t expressing it as clearly as I’d like, although of course I can’t know.
I think part of the challenge around all of this is that (AFAIK but I would love to be corrected) we don’t have a good way to identify what’s in and out of distribution for models trained on such diverse data, and don’t have a clear understanding of what constitutes novelty in a problem.
I agree with your frustrations; I think his views are somewhat inconsistent and confusing. But I also find my own understanding to be a bit confused and in need of better sources.
I do think the discussion François has in this interview is interesting. He talks about the ways people have tried to apply LLMs to ARC, and I think he makes some good points about the strengths and shortcomings of LLMs on tasks like this.
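For readers who haven’t seen the format, an ARC-style task gives a handful of input/output grid pairs plus a test input, and the solver has to infer the transformation. A toy example (mine, and much simpler than real ARC tasks) might look like:

```python
# Toy ARC-style task (illustrative only): grids are small arrays of color codes 0-9,
# and the hidden rule here is "flip the grid left-to-right".
train_pairs = [
    {"input":  [[1, 0, 0],
                [0, 2, 0]],
     "output": [[0, 0, 1],
                [0, 2, 0]]},
    {"input":  [[3, 3, 0],
                [0, 0, 4]],
     "output": [[0, 3, 3],
                [4, 0, 0]]},
]

test_input = [[5, 0, 0],
              [0, 0, 6]]

# A solver must infer the rule from the demonstration pairs and apply it here.
predicted_output = [list(reversed(row)) for row in test_input]
print(predicted_output)  # [[0, 0, 5], [6, 0, 0]]
```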
Mine too, for sure.
And agreed, Chollet’s points are really interesting. As much as I’m sometimes frustrated with him, I think that ARC-AGI, and his willingness to (get someone to) stake substantial money on it, have done a lot to clarify the discourse around LLM generality, and also make it harder for people to move the goalposts and then claim they were never moved.
I sometimes find it useful, when defining a term, to think about how to differentiate it from nearby terms. In this case, that means distinguishing “reasoning” vs “general reasoning” vs “generalization”.
Reasoning: narrower than general reasoning; in my opinion, probably your first two bullet points combined.
Generalization: even broader than general reasoning (it does not need to be focused on reasoning). It seems like it could be your last two bullet points, particularly the third.
General reasoning (this is not fully thought through): now that we’ve talked about “reasoning” and “generalization”, I see two types of definition:
1. A bit closer to “reasoning”: your first two bullet points, plus in multiple domains/multiple ways, but not necessarily unseen domains. In simpler words, “reasoning in multiple domains and ways”.
2. A bit closer to “general” (my guess is this is closer to what you intended?): generalization ability, but focused on reasoning.
Interesting approach, thanks!
After some discussion elsewhere with @zeshen, I’m feeling a bit less comfortable with my last clause, building an internal model. I think of general reasoning as essentially a procedural ability, and model-building as a way of representing knowledge. In practice they seem likely to go hand-in-hand, but it seems in-principle possible that one could reason well, at least in some ways, without building and maintaining a domain model. For example, one could in theory perform a series of deductions using purely local reasoning at each step (although plausibly one might need a domain model in order to choose what steps to take?).
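To illustrate what I mean by “purely local reasoning”, here is a toy sketch (mine, purely illustrative): each step just checks one rule against the current set of facts, with nothing like a richer domain model maintained anywhere (though one could argue the growing fact set is itself a minimal model):

```python
# Toy forward-chaining deduction where every step is local: check one rule against
# the current facts, add the conclusion if the premises hold, repeat until no change.

rules = [
    ({"Socrates is human"}, "Socrates is mortal"),    # humans are mortal
    ({"Socrates is mortal"}, "Socrates will die"),    # mortals eventually die
]

facts = {"Socrates is human"}

changed = True
while changed:
    changed = False
    for premises, conclusion in rules:
        # Purely local step: no global model, just this rule and the current facts.
        if premises <= facts and conclusion not in facts:
            facts.add(conclusion)
            changed = True

print(facts)  # {'Socrates is human', 'Socrates is mortal', 'Socrates will die'}
```

The open question in the parenthetical still applies: in a less trivial setting, choosing which rule to apply next might itself require something model-like.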