Proteins are finite in length. Why would nature care if it can’t do something in polynomial time?
Edit: It would be interesting to turn this around: suppose protein folding IS NP-hard, can we put an upper bound on the length of proteins using evolutionary time scales?
can we put an upper bound on the length of proteins using evolutionary time scales?
Not really, most big proteins consist of ‘domains’ which fold up pretty independently of each other (smaller proteins can vary quite a bit though). Titin is a ~30,000 amino acid protein in human muscle with ~500 repeats of the same 3 basic modules all laid out in a line… over evolutionary time you can shuffle these functional units around and make all kinds of interesting combinations.

Actually, the lab I’m working in recently had a problem with this. We optimized a gene to be read extremely fast by a ribosome while still producing exactly the same protein sequence (manipulating synonymous codons). But it turned out that when the actual protein molecule was extruded from the ribosome as rapidly as we had induced it to be, the normal independent folding of successive domains was disrupted: one domain didn’t fully fold before the next domain started being extruded, they interacted, and the protein folded all wrong and didn’t work despite having exactly the same sequence as the wild-type protein.
I think the more important point is that Nature doesn’t care about the worst case (if a protein takes forever to fold correctly then it’s not going to be of any use). But an AI trying to design arbitrary proteins plausibly might.
Why? Nature designed ribosomes without solving any hard cases of NP-hard problems. Why would Ribosomes 2.0, as used for constructing the next generation of molecular machinery, require solving any NP-hard problems?
Nature designed ribosomes without solving any hard cases of NP-hard problems.
Is that so? Pretty much every problem of interest (chess, engineering, etc) is NP-hard, or something on that level of difficulty.
The thing with such problems is that you don’t solve them exactly for the optimum; you weaken the problem and solve it heuristically for a good-enough answer. Nature did just this: it produced a solution to a weakened NP-hard design problem.
I think at this point, taboo “solved a hard problem”. Nature produced a better replicator, not necessarily the optimal superbest replicator, which it didn’t have to be.
Obviously an intelligently designed automated design system could be just as good as dumbass evolution, given sufficient computing power, so I agree that swiftly designing nanomachinery is quite plausible.
(The advantage that evolution has over an engineer in a leaky box with a high discount rate is that there’s a lot going on in molecular dynamics. Solutions are dense in the space, as shown by the fact that evolution found some, but that’s no guarantee that a given candidate is predictably a solution. So it might be cost-prohibitive. Not that I’m up to date on protein folding.)

In the worst case, the protein folding could be a secure hash that takes a very expensive high-fidelity simulation to compute. Evolution would do just fine at cracking hashes by brute force, because it brute-forces everything, but an intelligent engineer wouldn’t be able to do much better in this case. It may end up taking too much computation to run such sims. (I should calculate the expected time to nano in this case; it would be a very interesting Fermi estimate.)

However, it is highly unlikely for lower-fidelity simulations and models to give you literally no information, which is what the “secure hash” scenario would require. Even things designed to be secure hashes often lose huge parts of their theoretical complexity to merely human attacks (e.g. MD5). The mainline case is that there exist reasonable heuristics discoverable by reasonable amounts of intelligent computation. (Likewise, I should do a Fermi estimate here, but it would need some harder assumptions.)

Of course you already knew all this… (But I didn’t.)
So yeah, this just made nano seem a bit more real to me.
In the worst case, the protein folding could be a secure hash
Then it would be harder, in fact impossible, to end up with slightly better proteins via point mutations. A point mutation in a string gives you a completely different secure hash of that string.
This isn’t a minor quibble, it’s a major reason to go “Whaa?” at the idea that protein folding and protein design have intractable search spaces in practice. They have highly regular search spaces in practice or evolution couldn’t traverse that space at all.
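To make the secure-hash picture concrete, here is a toy Python illustration (the two strings below are arbitrary placeholder sequences I made up, not real proteins): a single-character “point mutation” yields a digest with no resemblance to the original, which is exactly the property that would make hill-climbing impossible.

```python
import hashlib

# Toy illustration: a one-character "point mutation" completely scrambles a
# secure hash (SHA-256), leaving no gradient for evolution to climb.
wild_type = "MKTAYIAKQRQISFVKSHFSRQLEERLG"   # arbitrary placeholder sequence
mutant = "MKTAYIAKQRQISFVKSHFSRQLEERLA"      # same sequence, last letter changed

for label, seq in (("wild type", wild_type), ("mutant   ", mutant)):
    digest = hashlib.sha256(seq.encode()).hexdigest()
    print(label, digest)
# The two digests are unrelated, unlike real protein fitness landscapes,
# where most single mutations change the fold only slightly.
```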
This isn’t a minor quibble, it’s a major reason to go “Whaa?” at the idea that protein folding and protein design have intractable search spaces in practice. They have highly regular search spaces in practice or evolution couldn’t traverse that space at all.
Yep, it’s highly implausible that a natural non-designed process would happen to be a secure hash (as you know from arguing with cryonics skeptics). And that’s before we look at how evolution works.
Good point. The search space is at least smooth once in a few thousand tries. (While doing the nearby Fermi estimate, I saw a result that 12% (!!!) of point mutations in some bacteria were beneficial.)
That said, the “worst possible case” is usually interesting.
Pretty much every problem of interest (chess, engineering, etc) is NP-hard, or something on that level of difficulty.
Isn’t that in large part a selection effect? After decades of having computers, most of the low-hanging fruit has been picked, so many of the remaining unsolved problems are NP-hard. But many equally important problems have been solved because they weren’t.
(I should calculate the expected time to nano in this case; it would be a very interesting Fermi estimate.)
Let’s go!
From Wikipedia:
In early 2008, Rosetta was used to computationally design a protein with a function never before observed in nature.
Skimmed the paper; looks like they used the Rosetta@home network (~9 TFLOPS) to design a rudimentary enzyme.

So that suggests that a small amount of computation (a bearable time by human research standards, allowing for fuckups and restarts) can do protein design. Let’s call it a week of computation total. There are ~1e6 seconds in a week; flopping at a rate of 1e13 FLOPS, that gives us about 1e19 flops.

They claimed to have tested 1e18 somethings, so our number is plausible, but we should go to at least 1e22 flops to allow 1e4 flops per candidate (which would take a thousand weeks?). Something doesn’t add up. Whatever, call it 1e20 (ten weeks) and put some fat error bounds on that.
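As a sanity check on those numbers, a minimal back-of-the-envelope sketch in Python (every value is just the rough assumption from this comment, not a measured figure):

```python
# Reconciling the week-of-computation estimate with the reported 1e18 tests.
seconds_per_week = 1e6            # Fermi-rounded (actually ~6e5)
network_flops_per_sec = 1e13      # Rosetta@home, ~9 TFLOPS
week_budget = seconds_per_week * network_flops_per_sec    # ~1e19 flops

designs_tested = 1e18             # the "1e18 somethings" from the paper
flops_per_design = 1e4            # assumed cost to evaluate one candidate
implied_total = designs_tested * flops_per_design          # ~1e22 flops

print(f"one week of the network : {week_budget:.0e} flops")
print(f"1e18 tests at 1e4 each  : {implied_total:.0e} flops")
print(f"mismatch factor         : {implied_total / week_budget:.0e}")
# The ~1e3 gap is the "something doesn't add up" above; splitting the
# difference at ~1e20 flops (about ten weeks) keeps the estimate rough but sane.
```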
Don’t know how to deal with the exponential complexity. A proper nanothing could require 1e40 flops (if cost is exponential in problem size, doubling the size squares the flop count: (1e20)^2 = 1e40), or it may factor nicely, requiring only 1e21 flops.
Let’s call it 1e25 flops with current techniques to design nanotech.
If AI is in 20 years, that’s 13 Moore’s-law doublings, a factor of roughly 1e4; then let’s say the AI can seize a network with as much computational power as the one they used, scaled up by Moore’s law.

So the job is ~1e21 today-flops of work, and 1e20 of that is doable in a standard research-project amount of time on a large distributed network.

So it’s anywhere from days to 20 years to brute-force nanotech on 20-years-in-future computational power with today’s algorithms, with my numbers giving about 2 years.

Factor-of-1e6 speedups are reasonable in chess (another problem with similar properties) after a bunch of years of human research, so that puts my middle estimate at 10 minutes.
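The rest of the chain, sketched under the same assumptions (1e25 flops to design nanotech, 13 hardware doublings, a Rosetta@home-sized network, chess-like algorithmic gains); it lands in the same ballpark as the 2-year and 10-minute figures above:

```python
# Fermi chain: time to brute-force nanotech design on 20-years-out hardware.
design_cost_flops = 1e25            # guessed cost with today's techniques
moore_doublings = 13                # "AI in 20 years"
moore_factor = 2 ** moore_doublings # ~8e3, i.e. roughly 1e4

network_today = 1e13                        # Rosetta@home-sized network, FLOPS
network_future = network_today * moore_factor   # ~1e17 FLOPS in 20 years

brute_force_s = design_cost_flops / network_future
print(f"brute force with today's algorithms: ~{brute_force_s / 3.15e7:.0f} years")

algorithmic_speedup = 1e6           # chess-like gains from years of research
smart_s = brute_force_s / algorithmic_speedup
print(f"with a 1e6 algorithmic speedup: ~{smart_s / 60:.0f} minutes")
# Prints a few years and a couple of minutes: the same order of magnitude as
# the "2 years" and "10 minutes" above, well inside the fat error bounds.
```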
The AI will probably do better than that, but that would be good enough to fuck us.
This was somewhat conservative, even. (Nanotech involves 100,000 times more computation than these guys used.)
Let’s get this thing right the first time....
EDIT: An interesting property of exponential processes is that things go from “totally impossible” to “trivial” very quickly.

Note that by these estimates, humans should be able to have nano around 2030. Scary stuff.
I can feel some inferential distance here that isn’t successfully being bridged. It’s far from clear to me that the default assumption here should be that no NP-hard problems need to be solved and that the burden of proof is on those who claim otherwise.
I guess to me the notion of “solve an NP-hard problem” (for large N and hard cases, i.e., the problem really is NP-hard) seems extremely exotic—all known intelligence, all known protein folding, and all known physical phenomena must be proceeding without it—so I feel a bit at a loss to relate to the question. It’s like bringing up PSPACE completeness—I feel a sense of ‘where did that come from?’ and find it hard to think of what to say, except for “Nothing that’s happened so far could’ve been PSPACE-complete.”
Agreed if you mean “Nothing that’s happened so far could’ve been [computationally hard] to predict given the initial conditions.”
But the reverse problem—finding initial conditions that produce a desired output—could be very hard. Nature doesn’t care about this, but an AI plausibly might.
I’m not sure how protein folding fits into this picture, to be honest. (Are people just trying to figure out what happens to a given protein in physics, or trying to find a protein that will make something good happen?) But more generally, the statement “P=NP” is more or less equivalent to “The reverse problem I mention above is always easy.” Things become very different if this is true.