one kind of reasoning in humans is instant intuition: you see something, and a response immediately and effortlessly pops into your mind. examples include recalling vocabulary in a language you’re fluent in, playing a musical instrument proficiently, or having a first guess at what might be going wrong when debugging.
another kind of reasoning is the chain of thought, or explicit reasoning: you lay out your reasoning steps as words in your head, interspersed perhaps with visuals, or abstract concepts that you would have a hard time putting in words. It feels like you’re consciously picking each step of the reasoning. Working through a hard math problem, or explicitly designing a codebase by listing the constraints and trying to satisfy them, are examples of this.
so far these map onto what people call systems 1 and 2, but I’ve intentionally avoided those labels because I think there’s actually a third kind of reasoning that doesn’t fit well into either bucket.
sometimes, I need to put the relevant info into my head, and then just let it percolate slowly without consciously thinking about it. at some later time, insights into the problem will suddenly and unpredictably pop into my head. I’ve found this mode of reasoning to be indispensable for dealing with the hardest problems, or for generating insights, where explicit reasoning alone would just leave me stuck.
of course, you can’t just sit around and do nothing and hope insights come to you—to make this process work you have to absorb lots of info, and also do a lot of explicit reasoning before and after to take flashes of insight and turn them into actual fleshed-out knowledge. and there are conditions that are more or less conducive to this kind of reasoning.
I’m still figuring out how best to leverage it, but one hypothesis this raises is that a necessary ingredient in solving really hard problems is spending a bunch of time simply not doing any explicit reasoning, and creating whatever conditions are needed for subconscious insight-generating reasoning.
I have a pet theory that there are literally physiological events that take minutes, hours, or maybe even days or longer to happen, and that are basically required for some kinds of insight. This would look something like:
First you do a bunch of explicit work trying to solve the problem. This makes a bunch of progress, and also starts to trace out the boundaries of where you’re confused / missing info / missing ideas.
You bash your head against that boundary even more.
You make much less explicit progress.
But, you also leave some sort of “physiological questions”. I don’t know the neuroscience at all, but to make up a story to illustrate what sort of thing I mean: One piece of your brain says “do I know how to do X?”. Some other pieces say “maybe I can help”. The seeker talks to the volunteers, and picks the best one or two. The seeker says “nah, that’s not really what I’m looking for, you didn’t address Y”. And this plays out as some pattern of electrical signals which mean “this and this and this neuron shouldn’t have been firing so much” (like a backprop gradient, kinda), or something, and that sets up some cell signaling state, which will take a few hours to resolve (e.g. downregulating some protein production, which will eventually make the neuron a bit less excitable by changing the number of ion pumps, or decreasing the number of synaptic vesicles, or something).
Then you chill, and the physiological questions mostly don’t do anything, but some of them answer themselves in the background; neurons in some small circuit can locally train themselves to satisfy the question left there exogenously.
See also “Planting questions”.
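(To make the backprop-gradient analogy above concrete, here is a tiny toy sketch. It is made up purely for illustration and is not a claim about real neural mechanisms: an error signal is computed during the explicit-work phase, but the corresponding change is only consolidated gradually over many later “idle” steps.)

```python
import numpy as np

# Toy illustration of the story above (made up, not real neuroscience): the
# explicit-work phase computes an error signal, like a backprop gradient, but the
# corresponding weight change is only consolidated gradually over many later idle
# ticks, standing in for a cell-signaling process that takes hours to resolve.

rng = np.random.default_rng(0)
w = rng.normal(size=3)                    # weights of some small circuit
x = np.array([1.0, 0.5, -0.3])            # the situation the circuit keeps seeing
target = 1.0                              # what the "seeker" wanted from it

# explicit-work phase: notice the mismatch and leave a pending "physiological question"
error = target - w @ x                    # how wrong the circuit currently is
pending_update = 0.5 * error * x          # the correction, computed now but not yet applied

# idle phase: nothing explicit happens, but the pending change resolves slowly
for _ in range(100):
    w += pending_update / 100             # a small fraction consolidates on each tick

print("error before idling:", error)
print("error after idling: ", target - w @ x)
```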
a thing i think is probably happening and significant in such cases: developing good ‘concepts/ideas’ to handle a problem, ‘getting a feel for what’s going on in a (conceptual) situation’
a plausibly analogous thing in humanity(-seen-as-a-single-thinker): humanity states a conjecture in mathematics, spends centuries playing around with related things (though paying some attention to that conjecture), building up mathematical machinery/understanding, until a proof of the conjecture almost just falls out of the machinery/understanding
This is learning about a narrow topic, which builds representations that make thinking on that topic more effective; novel insights might become feasible even through system 1 where before system 2 couldn’t help. With o1, LLMs have systems 1 and 2, but all learning happens in pretraining, not targeting the current problem, and in any case with horrible sample efficiency. This could be a crucial missing capability, though with scale even in-context learning might get there.
Sounds like a synthetic data generation pipeline.
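(A rough sketch of what that could look like, reading the two comments above together: slow explicit reasoning generates worked solutions, and training on the verified ones is what targets learning at the current problem, so that the answer later comes out in one fast pass. Everything below is hypothetical; the `model` object and its methods are made-up stand-ins, not any real library’s API.)

```python
# Hypothetical sketch of a "synthetic data generation pipeline" in the sense of the
# comments above. The model interface (.reason(), .finetune()) is a made-up stand-in.

def build_synthetic_dataset(model, problems, verifier):
    """Collect (problem, answer) pairs from the model's own verified reasoning traces."""
    dataset = []
    for problem in problems:
        trace = model.reason(problem)             # slow pass: long explicit chain of thought
        answer = trace.strip().splitlines()[-1]   # assume the final line states the answer
        if verifier(problem, answer):             # e.g. run the tests, check the proof
            dataset.append((problem, answer))
    return dataset

def consolidate(model, problems, verifier):
    """Fine-tune on those pairs, distilling explicit reasoning into fast 'intuition'."""
    pairs = build_synthetic_dataset(model, problems, verifier)
    model.finetune(pairs)                         # targets learning at these problems specifically
    return model
```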
Relatable.
Giorgio Parisi mentioned this in his book; he said that the aha moments tend to spark randomly while you’re doing something else. Bertrand Russell had a very active social life (he praised leisure) and saw it as an active form of idleness that could turn out to be very productive. A good balance might be the best way to leverage it.