The post is about predictions made by experts in number theory and complexity theory.
If you think that this cannot be predicted, and that they are thus wrong about their predictions, I would be interested in knowing why.
Namely:
Do you have mechanistic / gear-level / inside view reasons for why the difficulty of problems cannot be predicted ahead of time, where you disagree with those experts?
Do you have empirical / outside view reasons for why those experts are badly calibrated?
The post is about predictions made by experts in number theory and complexity theory.
Isn’t it about empirical evidence that these problems are hard, not “predictions”? They’re considered hard because many people have tried to solve them for a long time and failed, not because experts glanced at them once and knew on priors they’d be legendarily difficult.
Aside from that, an expert can estimate how hard a problem is by eyeballing how distant the abstractions needed to solve it feel from the known ones — whether we have almost the right tools for solving it, or have no idea how the right tools would look at all. They’re able to do this because they’ve developed strong intuitions for their professional domain: they roughly know what’s possible, what’s on the edge of the possible, and what’s very much not. And even then, such intuitions are often very wrong, see Fermat’s Last Theorem.
But there’s no objective property that makes these problems intrinsically hard, only subjectively hard from the point of view of our conceptual toolbox.
Isn’t it about empirical evidence that these problems are hard, not “predictions”? They’re considered hard because many people have tried to solve them for a long time and failed.
No, this is Preemption 1 in the Original Post.
“hard” doesn’t mean “people have tried and failed”, and you can only witness the latter after the fact. If you prefer: even if you have empirical evidence for the problem being “level n hard” (people have tried up to level n), you still do not have empirical evidence for the problem being “level n+1 hard” (if there is nothing you can say about it ahead of time, you would need people to try more before you could state that). I.e., no predictive power.
An expert can estimate how hard a problem is by eyeballing how distant the abstractions needed to solve it feel from the known ones
Great! We’re getting closer to what I care about.
Then what I am saying is that there is a heuristic that the experts are using to eyeball this, and I want to know what that is, starting with those 2 conjectures!
I am also saying that the more distant “the abstractions needed to solve it feel from the known ones”, the easier it should be to do so.
They’re able to do this because they’ve developed strong intuitions for their professional domain: they roughly know what’s possible, what’s on the edge of the possible, and what’s very much not.
Exactly, but those intuitions are implemented somehow. How?
Also, the more experts agree on a judgment, and the stronger their judgment, the easier you should expect it to be to explain that intuition.
But there’s no objective property that makes these problems intrinsically hard, only subjectively hard from the point of view of our conceptual toolbox.
I was very confused when I read this. For instance, the part in bold is already reflected in the Original Post.
There are many theoretical problems that are considered to be obviously far far outside our problem solving ability.
If you prefer, interpret “intrinsically hard” as “having an intrinsic property that makes it subjectively hard for us”. To model how that would look, consider the following setup:
The space of problems is just the real plane, and our ability to solve problems is modeled by a unit disk in the plane (if a problem is in the disk, it is solved, and the closer it is to the disk, the easier it is to solve). Then the difficulty of a problem is subjective: it depends on where the disk is.
But let’s say the disk is somewhere on the x axis. Then the intrinsic property of a problem having a high y coordinate (being far from the x axis) makes it subjectively hard.
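To make the disk picture concrete, here is a minimal sketch. It assumes that "difficulty" can be read as the Euclidean distance from a problem to the edge of the disk; that reading, and the name `subjective_difficulty`, are my own illustration rather than anything fixed by the setup above.

```python
import math

def subjective_difficulty(problem, disk_center, radius=1.0):
    """Distance from the problem to the edge of the 'solvable' disk (0 if inside).

    Assumption: difficulty is plain Euclidean distance to the disk.
    """
    dist = math.dist(problem, disk_center)
    return max(0.0, dist - radius)

problem = (0.0, 5.0)  # a problem with a high y coordinate

# Slide the disk along the x axis: the difficulty changes with the disk's
# position (subjective), but it never drops below |y| - radius (intrinsic).
for cx in (-3.0, 0.0, 3.0):
    print(cx, subjective_difficulty(problem, (cx, 0.0)))
```

Under this reading, the exact difficulty depends on where the disk sits, while the lower bound `|y| - radius` depends only on the problem's y coordinate, which is the sense of "intrinsic" I have in mind.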
I’ll make some edits to the post. I thought most of this was clear because of Preemption #1, but it was not.
even if you have empirical evidence for the problem being “level n hard” (people have tried up to level n), you still do not have empirical evidence for the problem being “level n+1 hard”
This is implicitly assuming that our expectation of how long a problem should take to solve is memoryless. But a breakthrough is much more likely on the 1st day of working on a problem than on the 1000th day. More generally, if problems vary greatly in difficulty, then our failure to solve a given problem provides evidence that it’s one of the harder problems. So a more reasonable prior in this case is something like logarithmic—e.g. it’s equally likely that a problem takes 1-10 days, or 10-100 days, or 100-1000 days, etc, to solve.
A similar model can give rise to the Lindy effect, where the expected lifetime is proportional to the lifetime so far. (In this case it’d be the expected time to solving the problem which would be proportional to the time which the problem has been open.)
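A small simulation can illustrate the Lindy-style scaling. Note the assumptions: a log-uniform prior over an unbounded range is improper, so this sketch swaps in a close relative, a power-law (Pareto) distribution of solve times with tail exponent `alpha`. The choice `alpha = 1.0`, the sample size, and the function names are illustrative, not something the argument above pins down.

```python
import random

def sample_solve_time(alpha, rng, t_min=1.0):
    """Draw a solve time from a Pareto(alpha) distribution by inverse-CDF sampling."""
    u = 1.0 - rng.random()  # uniform in (0, 1], avoids raising 0 to a negative power
    return t_min * u ** (-1.0 / alpha)

def median_given_unsolved(samples, elapsed):
    """Median solve time among problems still unsolved after `elapsed` days."""
    survivors = sorted(t for t in samples if t > elapsed)
    return survivors[len(survivors) // 2]

rng = random.Random(0)  # fixed seed so the run is repeatable
alpha = 1.0             # assumed tail exponent
samples = [sample_solve_time(alpha, rng) for _ in range(200_000)]

# For a Pareto tail, P(T > s | T > t) = (t / s) ** alpha, so the conditional
# median is t * 2 ** (1 / alpha): proportional to the time already elapsed.
for elapsed in (1, 10, 100):
    ratio = median_given_unsolved(samples, elapsed) / elapsed
    print(elapsed, round(ratio, 2))
```

With this assumed distribution, the printed ratio should stay roughly constant (near `2 ** (1 / alpha)`) as `elapsed` grows, so failing to solve a problem for longer pushes the estimate of how long it will still take upward, unlike in a memoryless model.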
even if you have empirical evidence for the problem being “level n hard” (people have tried up to level n), you still do not have empirical evidence for the problem being “level n+1 hard” (if there is nothing you can say about it ahead of time, you would need people to try more before you could state that). I.e., no predictive power.
Suppose I take out a coin and flip it 100 times in front of your eyes, and it lands heads every time. Will you have no ability to predict how it lands the next 30 times? Will you need some special domain knowledge of coin aerodynamics to predict this?
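For a rough number, one can use Laplace's rule of succession. This sketch assumes a uniform prior over the coin's bias, which is my assumption for illustration; under it, after n heads and no tails, the probability that the next m flips are all heads is (n + 1) / (n + m + 1).

```python
from fractions import Fraction

def prob_next_all_heads(heads_seen, next_flips):
    """P(next `next_flips` flips are all heads | `heads_seen` heads, no tails).

    Assumes a uniform prior on the coin's bias (rule of succession).
    """
    return Fraction(heads_seen + 1, heads_seen + next_flips + 1)

p = prob_next_all_heads(100, 30)
print(p, float(p))            # 101/131, about 0.77

# For contrast: a coin assumed to be fair regardless of the evidence.
print(float(Fraction(1, 2) ** 30))
```

No aerodynamics is involved: the past flips alone move the prediction from about one in a billion to roughly three in four.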
Then what I am saying is that there is a heuristic that the experts are using to eyeball this, and I want to know what that is, starting with those 2 conjectures!
I mean… That heuristic is that heuristic? “Experts have a precise model of the known subset of the concept-space of their domain, and they can make vague high-level extrapolations on how that domain looks outside the known subset, and where in the wider domain various unsolved problems are located, and how distant they are from the known domain”. The way I see it, that’s it. This statement isn’t reducible to something more neat and simple. For any given difficult problem, you can walk up to an expert and ask them why it’s considered hard, but the answers they give you won’t have any unifying theme aside from that. It’s all ad hoc.
Why would you think there’s something else? What shape do you want the answer to have?
Suppose I take out a coin and flip it 100 times in front of your eyes, and it lands heads every time. Will you have no ability to predict how it lands the next 30 times? Will you need some special domain knowledge of coin aerodynamics to predict this?
Coin = problem
Flipping heads = not being solved
Flipping tails = being solved
More flips = more time passing
Then, yes. Because you had many other coins that had started flipping tails at some point, and there is no easily discernible pattern.
By your interpretation, the Solomonoff-induced prior for that coin is basically “it will never flip tails”. Whereas you do expect that most problems that have not been solved yet will be solved at some point, which means that you are incorporating more knowledge.
“Experts have a precise model of the known subset of the concept-space of their domain, and they can make vague high-level extrapolations on how that domain looks outside the known subset, and where in the wider domain various unsolved problems are located, and how distant they are from the known domain”
Experts from many different fields of Maths and CS have tried to tackle the Collatz conjecture and the P vs NP problem. Most of them agree that those problems are way beyond what they set out to prove. I mostly agree with you on the fact that each expert’s intuition vaguely tracks one specific dimension of the problem. But any simplicity prior tells you that it is more likely for there to be a general reason for why those problems are hard along all those dimensions, rather than a whole bunch of ad-hoc reasons.
The way I see it, that’s it. This statement isn’t reducible to something more neat and simple.
For any given difficult problem, you can walk up to an expert and ask them why it’s considered hard, but the answers they give you won’t have any unifying theme aside from that. It’s all ad hoc.
What makes you think that? I see you repeating this, but I don’t see why that would be the case.
Why would you think there’s something else?
Good question, thanks! I tried to hint at this in the Original Post, but I think I should have been more explicit. I will make a second edit that incorporates the following.
The first reason is that many different approaches have been tried. In the case where only a couple of specific approaches have been tried, I expect the reason for why it hasn’t been solved to be ad-hoc and related to the specific approaches that have been tried. The more approaches are tried, the more I expect a general reason that applies to all those approaches.
The second reason is that the problems are simple. In the case of a complicated problem, I would expect the reason for why it hasn’t been solved to be ad-hoc. I have much less of this expectation for simple problems.