The 3-4 Chasm of Theoretical Progress
epistemic status: unoriginal. trying to spread a useful framing of theoretical progress from an old post.
Tl;dr: the greatest theoretical challenge often comes from crossing the chasm from [developing an impractical solution to a problem] to [developing some sort of polytime solution to that problem], because the natures of the two solutions can be opposites.
Summarizing Diffractor’s post on Program Search and Incomplete Understanding:
Solving a foundational problem, from formulation to implementation, often takes the following steps (some may be skipped):
1. developing a philosophical problem
2. developing a solution given infinite computing power
3. developing an impractical solution
4. developing some sort of polytime solution
5. developing a practical solution
and he says that it is often at the 3 → 4 step that understanding gets stuck, and where the most technical, brute-force mathematical (and, I would add, sometimes philosophical) work is needed, because:
a common motif in 3) is that one can prove interesting things about these solutions, like asymptotic properties, by e.g. having the algorithm iterate through all Turing machines, thereby conferring the properties of the really good Turing machine solution that exists somewhere in this massive search space onto the overall search algorithm (up to a massive constant, usually).
think of Levin’s Universal Search, AIXItl, or Logical Induction.
he says such algorithms are secretly black boxes; there are no real gears.
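To make the motif concrete, here is a minimal runnable sketch of the dovetailing pattern behind Levin’s Universal Search. It is a toy under loud assumptions: real universal search enumerates all programs of a universal machine, whereas here the enumeration (`make_program`) and the goal predicate are invented for illustration.

```python
from itertools import count

def universal_search(programs, is_solution, max_phases=25):
    """Toy dovetailing in the spirit of Levin's Universal Search.

    `programs` is an (ideally infinite) iterator of generator functions;
    program i gets a time share proportional to 2**-i. Real universal
    search enumerates all programs of a universal machine; here the
    enumeration is whatever the caller supplies.
    """
    active = []                               # generators being simulated
    for phase in range(max_phases):
        try:
            active.append(next(programs)())   # admit one more program
        except StopIteration:
            pass
        for i, gen in enumerate(active):
            # Program i gets 2**(phase - i) steps this phase, so its total
            # slowdown is a constant factor 2**i -- the "massive constant".
            for _ in range(2 ** max(phase - i, 0)):
                try:
                    candidate = next(gen)
                except StopIteration:
                    break                     # this program halted
                if is_solution(candidate):
                    return candidate
    return None

def make_program(i):
    def prog():          # toy "program": emit the multiples of i + 1
        n = 0
        while True:
            n += i + 1
            yield n
    return prog

# Find a multiple of 7 greater than 100. The search has no idea which
# program is helpful; it just inherits success from whichever one is.
print(universal_search(
    (make_program(i) for i in count()),
    lambda x: x % 7 == 0 and x > 100,
))
```

Even the toy shows the black-box character: the search inherits success from whichever program happens to succeed, paying the 2^i slowdown for program i, and nothing in the search itself understands why a given program works.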
Meanwhile, algorithms in 4) have the opposite nature: they are polynomial precisely because they characterize exploitable patterns that make a particular class of problems easier than most others, which requires Real Understanding. So algorithms in 3) and 4) often look nothing alike.
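A standard illustration of that contrast (my example, not the post’s): SAT is NP-complete, and the 3)-style attack is blind search over assignments; but if every clause has exactly two literals, each clause (a ∨ b) is the pair of implications ¬a → b and ¬b → a, and satisfiability becomes a linear-time strongly-connected-components check on the implication graph. A sketch using Kosaraju’s SCC algorithm:

```python
def solve_2sat(n, clauses):
    """Satisfiability of a 2-CNF over variables 1..n in O(n + m) time.

    clauses: pairs of nonzero ints, where k means "variable k is true" and
    -k means "variable k is false". Each clause (a or b) contributes the
    implications not-a => b and not-b => a; the formula is satisfiable iff
    no variable shares a strongly connected component with its negation.
    Returns a satisfying assignment as a dict, or None.
    """
    N = 2 * n
    lit = lambda k: 2 * (abs(k) - 1) + (k < 0)      # literal -> vertex id
    g = [[] for _ in range(N)]                      # implication graph
    gr = [[] for _ in range(N)]                     # its reverse
    for a, b in clauses:
        for u, v in ((lit(-a), lit(b)), (lit(-b), lit(a))):
            g[u].append(v)
            gr[v].append(u)

    # Kosaraju pass 1: list vertices by DFS finish time (iteratively).
    seen, order = [False] * N, []
    for s in range(N):
        if seen[s]:
            continue
        seen[s] = True
        stack = [(s, 0)]
        while stack:
            v, i = stack.pop()
            if i < len(g[v]):
                stack.append((v, i + 1))
                w = g[v][i]
                if not seen[w]:
                    seen[w] = True
                    stack.append((w, 0))
            else:
                order.append(v)

    # Kosaraju pass 2: flood the reverse graph in reverse finish order.
    comp, c = [-1] * N, 0
    for s in reversed(order):
        if comp[s] != -1:
            continue
        comp[s], stack = c, [s]
        while stack:
            for w in gr[stack.pop()]:
                if comp[w] == -1:
                    comp[w] = c
                    stack.append(w)
        c += 1

    out = {}
    for x in range(1, n + 1):
        if comp[lit(x)] == comp[lit(-x)]:
            return None                           # x <=> not-x: contradiction
        out[x] = comp[lit(x)] > comp[lit(-x)]     # pick the "later" literal
    return out

# (x1 or x2) and (not x1 or x2) and (not x1 or not x2)
print(solve_2sat(2, [(1, 2), (-1, 2), (-1, -2)]))   # {1: False, 2: True}
```

The speedup comes entirely from naming the exploitable pattern (the implication structure), not from searching faster.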
I liked this post and the idea of the “3-4 chasm,” because it explicitly captures why I personally felt that, e.g., AIT might be less useful for my work: after reading this post, I realized that when I use the word “structure,” I’m usually pointing at the kind of insight required to cross the 3-4 gap, while others might use the same word for things at a different level. This had left me confused about how some tool X that someone brought up was supposed to help with the 3-4 gap I’m interested in.[1]
Vanessa Kosoy refers to this post, saying (in my paraphrase) that a lot of the 3-4 gap in computational learning theory has to do with our lack of understanding of deep learning theory: how the NP-completeness barrier is circumvented in practical problems, what restrictions we can put on our hypothesis class to make it efficiently learnable in the same way our world seems efficiently learnable, etc.
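A toy version of the kind of hypothesis-class restriction at stake (my example, with invented names): boolean functions on n bits in general require exponentially many samples to learn, but monotone conjunctions are learnable by the classic elimination algorithm at O(n) time per sample.

```python
def learn_conjunction(samples):
    """Classic elimination learner for monotone conjunctions.

    Start from the conjunction of all n variables; every positive example
    deletes the variables it sets to 0 (they cannot be in the target).
    samples: list of (bits, label), bits a tuple of 0/1, label 0/1.
    Returns the set of variable indices in the learned conjunction.
    """
    n = len(samples[0][0])
    hypothesis = set(range(n))           # x0 and x1 and ... and x(n-1)
    for bits, label in samples:
        if label == 1:
            hypothesis -= {i for i in range(n) if bits[i] == 0}
    return hypothesis

# Toy run: the hidden target is (x0 and x2) over 4 variables.
data = [((1, 0, 1, 1), 1), ((1, 1, 1, 0), 1), ((0, 1, 1, 1), 0)]
print(learn_conjunction(data))           # {0, 2}
```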
She mentions that this gap, at least in the context of deep learning theory, isn’t too pressing a problem because it already has mainstream attention, which explains why a lot of her work seems to lie in the 1-3 regime.
I asked GPT for examples of past crossings of the 3-4 chasm in other domains, and it suggested [Shannon’s original technically-constructive-but-highly-infeasible proof of the existence of capacity-achieving codes] vs. [turbo codes, which actually approach this limit while being very practical], which seems like a perfect example.
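For concreteness, the limit in question is the channel capacity; for a binary symmetric channel with crossover probability p it is C = 1 − H₂(p). A few lines (my back-of-the-envelope numbers, not from the sources above) show the gap a practical code has to close versus naive repetition:

```python
from math import log2

def bsc_capacity(p):
    """Capacity of a binary symmetric channel with crossover probability p:
    C = 1 - H2(p) bits per channel use, with H2 the binary entropy."""
    if p in (0.0, 1.0):
        return 1.0
    return 1 + p * log2(p) + (1 - p) * log2(1 - p)

p = 0.1
print(f"capacity at p = {p}: {bsc_capacity(p):.3f} bits/use")  # ~0.531
# A naive 3x repetition code transmits at rate 1/3 and still fails
# whenever 2 or 3 of the copies flip: P(err) = 3p^2(1-p) + p^3.
print(f"repetition: rate 0.333, P(err) = {3*p**2*(1-p) + p**3:.4f}")  # 0.0280
```

Shannon’s random-coding argument shows that rates up to capacity are achievable with vanishing error, but it averages over random codebooks and gives no efficient encoder or decoder; supplying actual gears took decades, with turbo codes as the example above.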
[1] AIT specifically seems to be useful primarily at the 1-2 level?
I agree with this framing. The issue of characterizing in what way Our World is Special is the core theoretical question of learning theory.
Framing it as a single 3-4 bottleneck maybe understates how large the space of questions here is: e.g., it encompasses virtually every field of theoretical computer science, and the physics and mathematics relevant to computation, outside of AIT and numerical math.
I’d vote for removing the stage “developing some sort of polytime solution” and just calling 4 “developing a practical solution”. I think listing that extra step comes from the perspective of someone who’s more heavily involved with complexity classes. We’re usually interested in polynomial-time algorithms because they’re usually practical, but there are lots of contexts where practicality doesn’t require a polynomial-time algorithm, or really, where we’re just not working in a setting where it’s natural to think in terms of algorithms with run-times.
In what contexts is it not natural to think in terms of algorithms with specific run-times?