Page 4: the sum log(w) + log(log(w)) + … doesn’t converge. Some logarithm will be negative and then the next one will be undefined. Presumably you meant to stop the sum once it becomes negative, but then I’m somewhat confused about this argument because I’m not sure it’s dimensionally consistent (I’m not sure what units cognitive work is being measured in).
Top of page 18: there’s a reference to “this graph” but no graph...?
General comment 1: who’s the intended audience here? Most of the paper reads like a blog post, which I imagine could be disconcerting for newcomers trying to evaluate whether they should be paying attention to MIRI and expecting a typical research paper from a fancy-looking .pdf.
General comment 2: I still think this discussion needs more computational complexity. I brought this up to you earlier and I didn’t really digest your reply. The question of what you can and can’t do with a given amount of computational resources seems highly relevant to understanding what the intelligence explosion could look like; in particular I would be surprised if questions like P vs. NP didn’t have a strong bearing on the distribution over timelines (I expect that the faster it is possible to solve NP-complete problems, which apparently includes protein folding, the faster AI could go foom). But then I’m not a domain expert here and I could be off-base for various reasons.
The reference to “this graph” is a hyperlink. There are many such hyperlinks in the document. They feel rather weird, and are easy to miss, given the generally print-like typesetting. It might be worth writing them all out in something like the form one would use in an ordinary printed document, while preserving their hyperlinkiness.
Protein folding cannot be NP-hard. The physical universe is not known to be able to solve NP-hard problems, and protein folding will not involve new physics.
The physical universe doesn’t need to “solve” protein folding in the sense of having a worst-case polynomial-time algorithm. It just needs to fold proteins. Many NP-complete problems are “mostly easy” with a few hard instances that rarely come up. (In fact, it’s hard to find an NP-complete problem for which random instances are hard: if we could do this, we would use it for cryptography.) It’s reasonable to suppose protein folding is like this.
Of course, if this is the case, maybe the AI doesn’t care about the rare hard instances of protein folding, either.
Proteins are finite in length. Why would nature care if it can’t do something in polynomial time?
Edit: It would be interested to turn this around- suppose proteins folding IS NP-hard, can we put an upper bound on the length of proteins using evolutionary time scales?
can we put an upper bound on the length of proteins using evolutionary time scales?
Not really, most big proteins consist of ‘domains’ which fold up pretty independantly of each other (smaller proteins can vary quite a bit though). Titin is a ~30,000 amino acid protein in human muscle with ~500 repeats of the same 3 basic modules all laid out in a line… over evolutionary time you can shuffle these functional units around and make all kinds of interesting combinations.
Actually, the lab I’m working in recently had a problem with this. We optimized a gene to be read extremely fast by a ribosome while still producing exactly the same protein sequence (manipulating synonymous codons). But it turned out that when you have the actual protien molecule being extruded from the ribosome as rapidly as we had induced it to be, the normal independant folding of successive domains was disrupted—one domain didn’t fully fold before the next domain started being extruded, they interacted, and the protein folded all wrong and didn’t work despite having exactly the same sequence as the wild protein.
I think the more important point is that Nature doesn’t care about the worst case (if a protein takes forever to fold correctly then it’s not going to be of any use). But an AI trying to design arbitrary proteins plausibly might.
Why? Nature designed ribosomes without solving any hard cases of NP-hard problems. Why would Ribosomes 2.0, as used for constructing the next generation of molecular machinery, require any NP-hard problems?
Nature designed ribosomes without solving any hard cases of NP-hard problems.
Is that so? Pretty much every problem of interest (chess, engineering, etc) is NP-hard, or something on that level of difficulty.
The thing with such problems is that you don’t solve them exactly for optimal, you weaken the problem and solve them heuristically for good. Nature did just this: produced a solution to a weakened NP-hard design problem.
I think at this point, taboo “solved a hard problem”. Nature produced a better replicator, not necessarily the optimal superbest replicator, which it didn’t have to.
Obviously an intelligently designed automated design system could be just as good as dumbass evolution, given sufficient computing power, so I agree that swiftly designing nanomachinery is quite plausible.
(The advantage that evolution has over an engineer in a leaky box with a high discount rate is that there’s a lot going on in molecular dynamics, solutions are dense in the space, as shown by that evolution got some, but that’s no guarantee that a given solution is predictably a solution. So it might be cost prohibitive. (Not that I’m up to date on protein folding.)
In the worst case, the protein folding could be a secure hash, that takes a very expensive high-fidelity simulation to compute. Evolution would do just fine at cracking hashes by brute force, because it bruteforces everything, but an intelligent engineer wouldn’t be able to do much better in this case. It may end up taking too much computation to run such sims. (I should calculate the expected time to nano in this case, it would be a very interesting fermi estimate).
However, it is highly unlikely for lower-fidelity simulations and models to give you literally no information, which would be what is required for the “secure hash” scenario. Even things designed to be secure hashes often lose huge parts of their theoretical complexity after some merely human attacks (eg md5). The mainline case is that there exist reasonable heuristics discoverable by reasonable amounts of intelligent computation. (likewise should fermi here, but would need some harder assumptions)
Of course you already knew all this… (But I didn’t))
So yeah, this just made nano seem a bit more real to me.
In the worst case, the protein folding could be a secure hash
Then it would be harder, in fact impossible, to end up with slightly better proteins via point mutations. A point mutation in a string gives you a completely different secure hash of that string.
This isn’t a minor quibble, it’s a major reason to go “Whaa?” at the idea that protein folding and protein design have intractable search spaces in practice. They have highly regular search spaces in practice or evolution couldn’t traverse that space at all.
This isn’t a minor quibble, it’s a major reason to go “Whaa?” at the idea that protein folding and protein design have intractable search spaces in practice. They have highly regular search spaces in practice or evolution couldn’t traverse that space at all.
Yep, it’s highly implausible that a natural non-designed process would happen to be a secure hash (as you know from arguing with cryonics skeptics). And that’s before we look at how evolution works.
Good point. The search space is at least smooth once in a few thousand tries. (While doing the nearby fermi estimate, I saw a result that 12% (!!!) of point mutations in some bacteria were beneficial).
That said, the “worst possible case” is usually interesting.
Pretty much every problem of interest (chess, engineering, etc) is NP-hard, or something on that level of difficulty.
Isn’t that in large part a selection effect? After decades having computers, most of the low hanging fruit has been picked, and so many unsolved problems are NP-hard. But many equally important problems have been solved because they weren’t.
(I should calculate the expected time to nano in this case, it would be a very interesting fermi estimate).
Lets go!
From Wik:
In early 2008, Rosetta was used to computationally design a protein with a function never before observed in nature.
Skimmed the paper looks like they used the rosetta@home network (~9 TFLOPS) to design a rudimentary enzyme.
So that suggests that a small amount of computation (bearable time by human research standards, allowing for fuckups and restarts) can do protein design. Let’s call it a week of computation total. There’s 1e6 seconds in a week, flopping at a rate of 1e13 flops, giving us 1e19 flops.
They claimed to have tested 1e18 somethings, so our number is plausible, but we should go to at least 1e22 flops to include 1e4 flops per whatever. (which would take a thousand weeks?) something doesn’t add up. Whatever, call it 1e20 (ten weeks) and put some fat error bounds on that.
Don’t know how to deal with the exponential complexity. A proper nanothing could require 1e40 flops (square the exponent for double complexity), or it may factor nicely, requiring only 1e21 flops.
Let’s call it 1e25 flops with current techniques to design nanotech.
If AI is in 20 years, that’s 13 moores doublings or 1e4, then let’s say the AI can seize a network of as much computational power as they used, plus moores scaling.
So 1e21 todayflops, 1e20 of which is doable in a standard research project amount of time with a large distributed network.
So anywhere from days to 20 years, with my numbers giving 2 years, to brute force nanotech on 20-years-in-future computational power with today’s algorithms.
Factor of 1e6 speedups are reasonable in chess (another problem with similar properties) with a bunch of years of human research, so that puts my middle at 10 minutes.
The AI will probably do better than that, but that would be good enough to fuck us.
This was somewhat conservative, even. (nanotech involves 100000 times more computation than these guys used)
Let’s get this thing right the first time....
EDIT: an interesting property of exponential processes is that things go from “totally impossible” to “trivial” very quickly.
I can feel some inferential distance here that isn’t successfully being bridged. It’s far from clear to me that the default assumption here should be that no NP-hard problems need to be solved and that the burden of proof is on those who claim otherwise.
I guess to me the notion of “solve an NP-hard problem” (for large N and hard cases, i.e., the problem really is NP hard) seems extremely exotic—all known intelligence, all known protein folding, and all known physical phenomena must be proceeding without it—so I feel a bit at a loss to relate to the question. It’s like bringing up PSPACE completeness—I feel a sense of ‘where did that come from?’ and find it hard to think of what to say, except for “Nothing that’s happened so far could’ve been PSPACE complete.”
Agreed if you mean “Nothing that’s happened so far could’ve been [computationally hard] to predict given the initial conditions.”
But the reverse problem—finding initial conditions that produce a desired output—could be very hard. Nature doesn’t care about this, but an AI plausibly might.
I’m not sure how protein folding fits into this picture, to be honest. (Are people just trying to figure out what happens to a given protein in physics, or trying to find a protein that will make something good happen?) But more generally, the statement “P=NP” is more or less equivalent to “The reverse problem I mention above is always easy.” Things become very different if this is true.
Protein folding cannot be NP-hard. The physical universe is not known to be able to solve NP-hard problems, and protein folding will not involve new physics.
Here’s a paper claiming that the most widely studied model of protein folding in 1998 is NP-Complete. I don’t know enough about modern research into protein folding to comment how applicable that result still is.
My guess is you’re referring to Aaronson’s paper, which doesn’t seem relevant here. The universe doesn’t solve NP-hard problems in P time, but the universe took NP time to build the first useful proteins, didn’t it?
The solution space is large enough that even proteins sampling it’s points at a rate of trillions per second couldn’t really fold if they were just searching randomly through all possible configurations, that would be NP complete. They don’t actually do this of course. Instead they fold piece by piece as they are produced, with local interactions forming domains which tend to retain their approximate structure once they come together to form a whole protein. They don’t enter the lowest possible energy state therefore. Prion diseases are an example of what can happen when proteins enter a normally inaccessible local energy minimum, which in that case happens to have a snowballing effect on other proteins.
The result is that they follow grooves in the energy landscape towards an energy well which is robust enough to withstand all sorts of variation, including the horrific inaccuracies of our attempts at modeling. Our energy functions are just very crude approximations to the real one, which is dependent on quantum level effects and therefore intractable.
Another issue is that proteins don’t fold in isolation—they interact with chaperone proteins and all sorts of other crap. So simulating their folding might require one to simulate a LOT of other things besides just the protein in question.
Even our ridiculously poor attempts at in silico folding are not completely useless though. They can even be improved with the help of the human brain (see Foldit). I think an A.I. should make good progress on the proteins that exist. Even if it can’t design arbitrary new ones from scratch, intelligent modification of existing ones would likely be enough to get almost anything done. Note also that an A.I. with that much power wouldn’t be limited to what already exists, technology is already in the works to produce arbitrary non-protein polymers using ribosome like systems, stuff like that would open up an unimaginably large space of solutions that existing biology doesn’t have access to.
Right. I’m actually not sure how relevant it all is to discussions of an A.I. trying to get arbitrary things done with proteins. Folding an existing protein may be a great deal easier than finding a protein which folds into an arbitrary shape. Probably not all shapes are allowed by the physics of the problem. Evolution can’t really be said to solve that problem either. It just produces small increments in fitness. Otherwise organisms’ proteomes would be a lot more efficient.
Although on second thought, an A.I. would probably just be able to design a protein with a massively low energy funnel, so that even if it couldn’t simulate folding perfectly, it could still get things done.
Regardless, an imperfect solution would probably suffice for world domination...
I believe I’ve seen that discussed before and the answer is just that in real life, proteins don’t fold into the lowest-energy conformation. It’s like saying that the global minimum energy for soap bubbles is NP-complete. Finding new useful proteins tends to occur via point mutations so that can’t be NP-complete either.
So, I can think of several different things that could all be the ‘protein folding problem.’
Figure out the trajectory an unfolded protein takes towards a folded protein, with a known starting state. (P)
Given a known folded protein, find local minima that unfolded proteins starting with random start states might get stuck in. (NP, probably)
Given a desired chemical reaction, find a folded protein that will catalyze the reaction. (Not sure, probably NP.)
Given a desired chemical reaction, find a folded protein that will catalyze the reaction that is the local minimum reached by most arbitrary unfolded positions for that protein (Optimal is definitely NP, but I suspect feasible is too.)
Others. (Here, the closest I get to molecular nanotech is ‘catalyze reactions,’ but I imagine the space for ‘build a protein that looks like X’ might actually be smaller.)
It looks to me like the problems here that have significant returns are NP.
Finding new useful proteins tends to occur via point mutations so that can’t be NP-complete either.
It’s not at all clear to me what you mean by this. I mean, take the traveling salesman problem. It’s NP-Hard*, but you can get decent solutions by using genetic algorithms to breed solutions given feasible initial solutions. Most improvements to the route will be introduced by mutations, and yet the problem is still NP-hard.
That is, it’s not clear to me that you’re differentiating between the problem of finding an optimal solution being NP hard, it taking NP time to find a ‘decent’ solution, and an algorithm requiring NP time to finish running.
(The second is rarely true for things like the traveling salesman problem, but is often true for practical problems where you throw in tons of constraints.)
* A variant is NP-Complete, which is what I originally wrote.
Nothing that has physically happened on Earth in real life, such as proteins folding inside a cell, or the evolution of new enzymes, or hominid brains solving problems, or whatever, can have been NP-hard. Period. It could be a physical event that you choose to regard as a P-approximation to a theoretical problem whose optimal solution would be NP-hard, but so what, that wouldn’t have anything to do with what physically happened. It would take unknown, exotic physics to have anything NP-hard physically happen. Anything that could not plausibly have involved black holes rotating at half the speed of light to produce closed timelike curves, or whatever, cannot have plausibly involved NP-hard problems. NP-hard = “did not physically happen”. “Physically happened” = not NP-hard.
Nothing that has physically happened on Earth in real life, such as proteins folding inside a cell, or the evolution of new enzymes, or hominid brains solving problems, or whatever, can have been NP-hard. Period.
I’ve seen you say this a couple of times, and your interlocutors seem to understand you, even when they dispute your conclusion. But my brain keeps returning an error when I try to parse your claim.
Read literally, “NP-hard” is not a predicate that can be meaningfully applied to individual events. So, in that sense, trivially, nothing that happens (physically or otherwise, if “physically” is doing any work here) can be NP-hard. But you are evidently not making such a trivial claim.
So, what would it look like if the physical universe “solved an NP-hard problem”? Presumably it wouldn’t just mean that some actual salesman found a why to use existing airline routes to visit a bunch of pre-specified cities without revisiting any one of them. Presumably it wouldn’t just mean that someone built a computer that implements a brute-force exhaustive search for a solution to the traveling salesman problem given an arbitrary graph (a search that the computer will never finish before the heat death of the universe if the example is large). But I can’t think of any other interpretation to give to your claim.
Here’s Scott Aaronson describing people (university professors in computer science and cognitive science at RPI) who claim that the physical universe efficiently solves NP-hard problems:
It was only a matter of time before someone put the pieces together. Last summer Bringsjord
and Taylor [24] posted a paper entitled “P=NP” to the arXiv. This paper argues that, since (1)
finding a Steiner tree is NP-hard, (2) soap bubbles find a Steiner tree in polynomial time, (3) soap
bubbles are classical objects, and (4) classical physics can be simulated by a Turing machine with
polynomial slowdown, it follows that P = NP.
My immediate reaction was that the paper was a parody. However, a visit to Bringsjord’s
home page2 suggested that it was not. Impelled, perhaps, by the same sort of curiosity that
causes people to watch reality TV shows, I checked the discussion of this paper on the comp.theory
newsgroup to see if anyone recognized the obvious error. And indeed, several posters pointed out
that, although soap bubbles might reach a minimum-energy configuration with a small number of
pegs, there is no “magical” reason why this should be true in general. By analogy, a rock in a
mountain crevice could reach a lower-energy configuration by rolling up first and then down, but it
is not observed to do so.
In other news, Bringsjord also claims to show by a modal argument, similar to the theistic modal argument (which he also endorses), that human brains are capable of hypercomputation: “it’s possible humans are capable of hypercomputation, so they are capable of hypercomputation.” For this reason he argues that superhumanly intelligent Turing machines/Von Neumann computers are impossible and belief in their possibility is fideistic.
Here’s Scott Aaronson describing people (university professors at RPI) who claim that the physical universe efficiently solves NP-hard problems.
This doesn’t refute what you are responding to. Saying the universe can’t solve a general NP problem in polynomial time is not the same thing as saying the universe cannot possibly solve specific instances of generally NP-complete problems, which is Tyrrell_McAllister’s point, as far as I can parse. In general, the traveling salesman is NP-complete, however there are lots of cases where heuristics get the job done in polynomial time, even if those heuristics would run-away if they were given the wrong case.
To use Aaronson’s soap bubbles, sometimes the soap bubble finds a Steiner tree, sometimes it doesn’t. When it DOES, it has solved one instance of an NP-complete problem fairly quickly.
Existence of God, not existence of god [or gods]; my sincere apologies for pedantry, it’s just that LW folk generally misunderstand theology and so it seems important to be especially careful when discussing it. The distinction between God and god is quite important and the words look so similar, it’s unfortunate. ETA: Retracting because Carl edited the comment. (Thanks Carl!)
In many contexts where a quite important distinction between two terms exists, it is helpful to say what it is, rather than simply assert that there is one.
Admittedly, I have no reason to believe this is one such context.
I’m pretty sure the distinction Will was making was as follows.
A “god” is a being with superhuman powers and so forth, possibly existing somehow “on another plane of existence”, but still in some sense the same kind of thing as us: another thing among all the things there are.
“God” is not in any way the same sort of thing as we are. God is not just “another thing” but, in theologian-speak, the Ground Of All Being, a sort of precondition for everything else.
However, it’s possible that he’s making a different distinction, wherein “God” means a unique divine being of infinite power who created everything and “god” means lesser beings like the Greek gods.
(If he’s making the first distinction, I would remark that (1) most theists’ notions of God do in fact fall under the description of “god”, and (2) the radically different notion of “God” is of very doubtful coherence. If he’s making the second, I’m unconvinced that the distinction is as important as he suggests it is.)
(Hastily written:) I agree with your (1), but think that’s all the more reason to make clear distinctions instead of further muddying the epistemic waters; it’s much as if creationists didn’t bother to distinguish between something like Pokemon evolution and actual evolutionary biology, because after all most evolutionists can’t tell the difference.
Mildly disagree with your (2): I can see how the coherence of the God idea is somewhat doubtful, but there aren’t actually any overwhelmingly strong arguments in terms of metaphysics that that is the case, and most atheists take a different route by more or less rejecting all of metaphysics and instead placing emphasis on epistemology. (Then there are weaksauce theists like Kant and William James to argue with but I don’t think that’s as challenging.) Although I’m sympathetic to skepticism of metaphysics we should keep in mind that the obvious attempts to banish it have failed (e.g. logical positivism), and we should also keep in mind that though LessWrong is (somewhat justifiably) allergic to the word “metaphysics”, metaphysics actually shows up here quite a bit in the guise of computationalism/simulationism and in some semi-epistemic rules like Eliezer’s GAZP. So to reject metaphysics entirely would be inconsistent; from there, charitably engaging the actual metaphysical arguments of philosophical theists would be necessary, and I see this done very rarely.
In the meantime I think assigning probabilities below, say, 1% to philosophical theism would be premature, especially when the motivations for doing so seem largely to be desires to reverse the stupidity of religious thinking, when philosophical theism stems from Socrates, Plato, and Aristotle and isn’t easily dismissed as the rationalizations of a Christian hegemony the way that most atheists seem to assume in practice.
(ETA: It occurs to me that David Chalmers, who is LW-friendly and the editor of Metametaphysics, would be a good person to ask about the tenability of philosophical theism, from a metaphilosophical perspective. I might send him an email / LW message.)
(ETA: It occurs to me that David Chalmers, who is LW-friendly and the editor of Metametaphysics, would be a good person to ask about the tenability of philosophical theism, from a metaphilosophical perspective. I might send him an email / LW message.)
Did you ever end up doing this, and if so, would you mind sharing the response?
That seems a little strongly put—NP-hard scales very poorly, so no real process can take N up to large numbers. I can solve the traveling salesman problem in my head with only modest effort if there are only 4 stops. And it’s trivial if there are 2 or 3 stops.
Yes (At least that’s the general consensus among complexity theorists, though it hasn’t been proved.) This doesn’t contradict anything Eliezer said in the grandparent. The following are all consensus-but-not-proved:
P⊂BQP⊂EXP P⊂NP⊂EXP BQP≠NP (Neither is confidently predicted to be a subset of the other, though BQP⊂NP is at least plausible, while NP⊆BQP is not.) If you don’t measure any distinctions finer than P vs EXP, then you’re using a ridiculously coarse scale. There are lots of complexity classes strictly between P and EXP, defined by limiting resources other than time-on-a-classical-computer. Some of them are tractable under our physics and some aren’t.
Nothing that has physically happened on Earth in real life, such as proteins folding inside a cell, or the evolution of new enzymes, or hominid brains solving problems, or whatever, can have been NP-hard.
I don’t understand why you think new physics is required to solve hard instances of NP-complete problems. We routinely solve the hard instances of NP-hard problems in practice on computers—just not on large instances of the problem. New physics might be required to solve those problems quickly, but if you are willing to wait exponentially long, you can solve the problems just fine.
If you want to argue that actual practical biological folding of proteins isn’t NP-hard, the argument can’t start from “it happens quickly”—you need to say something about how the time to fold scales with the length of the amino acid strings, and in particular in the limit for very large strings.
Similarly, I don’t see why biological optimization couldn’t have solved hard cases of NP-compete problems. If you wait long enough for evolution to do its thing, the result could be equivalent to an exhaustive search. No new physics required.
I don’t understand why you think new physics is required to solve hard instances of NP-complete problems. We routinely solve the hard instances of NP-hard problems in practice on computers—just not on large instances of the problem.
Eliezer already conceded that trivial instances of such problems can be solved. (We can assume that before he made that concession he thought it went without saying.)
New physics might be required to solve those problems quickly, but if you are willing to wait exponentially long, you can solve the problems just fine.
The physics and engineering required to last sufficiently long may be challenging. I hear it gets harder to power computers once the stars have long since burned out. As far as I know the physics isn’t settled yet.
(In other words, I am suggesting that “just fine” is an something of an overstatement when it comes to solving seriously difficult problems by brute force.)
The physics and engineering required to last sufficiently long may be challenging. I hear it gets harder to power computers once the stars have long since burned out. As far as I know the physics isn’t settled yet.
That counterargument is a bit too general, since it applies not only to NP problems, but even to P problems (such as deciding whether a number is the GCD of two other numbers), or even any arbitrary algorithm modified by a few lines of codes such that its result is unaffected, merely delayed until after the stars burned out, or whatever limit we postulate.
For NP problems and e.g. P problems both, given how we understand the universe, there is only a finite number of inputs in both cases which are tractable, and an infinite number of inputs which aren’t. Though the finite number is well different for both, as a fraction of all “possible”, or rather well-defined (let’s avoid that ambiguity cliff) inputs, it would be the same.
Cue “We all live in a Finite State Machine, Finite State Machine, Finite State Machine …”
Eliezer already conceded that trivial instances of such problems can be solved. (We can assume that before he made that concession he thought it went without saying.)
The point can’t be confined to “trivial instances”. For any NP-complete problem on some reasonable computing platform that can solve small instances quickly, there will be instance sizes that are non-trivial (take appreciable time to solve) but do not require eons to solve. There is absolutely no mathematical reason for assuming that for “natural” NP-complete problems, interesting-sized instances can’t be solved on a timescale of months/years/centuries by natural processes.
The dichotomy between “trivial” and “impossible to solve in a useful time-frame” is a false one.
It occurs to me that we can’t really say, since we only have access to the time of the program, which may or may not reflect the actual computational resources expended.
Imagine you were living in a game, and trying to judge the game’s hardware requirements. If you did that by looking at a clock in the game, you’d need to assume that the clock is synchronized to the actual system time. If you had a counter you increased, you wouldn’t be able to say from inside the program every which step you get to that counter++ instruction.
The problem being that we don’t have access to anything external, we aren’t watching the Turing Machine compute, we are inside the Turing Machine “watching” the effects of other parts of the program, such as a folding protein (observing whenever it’s our turn to be simulated). We don’t, however, see the Turing Machine compute, we only see the output. The raw computing power / requirements “behind the scenes”, even if such a behind the scenes is only a non-existent abstraction, is impossible to judge with certainty, similar to a map-territory divide. Since there is no access in principle, we cannot observe anything but the “output”, we have no way of verifying any assumptions about a correspondence between “game timer” and “system timer” we may make, or of devising any experiments.
Even the recently linked “test the computational limits” doesn’t break the barrier, since for all we know the program may stall, and the next “frame” it outputs may still seem consistent, with no stalling, when viewed from inside the program, which we are. We wouldn’t subjectively realize the stall. If such an experiment did find something, it would be akin to a bug, not to a measurement of computational resources expended.
It occurs to me that we can’t really say, since we only have access to the time of the program, which may or may not reflect the actual computational resources expended.
That’s a valid point, but it does presuppose exotic new physics to make that substrate, in which “our” time passes arbitrarily slowly compared to the really real time, so that it can solve NP-hard problems between our clock ticks. We would, in effect be in a simulation. Evidence of NP-hard problems actually being solved in P could be taken as evidence that we are in one.
we only have access to the time of the program [...] we are inside the Turing Machine “watching” the effects of other parts of the program, such as a folding protein
If we assume that protein folding occurs according to the laws of quantum mechanics, then it shouldn’t tell us anything about the computational complexity of our universe besides what quantum mechanics tells us, right?
Well, yea that’s what I’m leaning towards. The laws of physics themselves need not govern the machine (Turing or otherwise), they are effects we observe, us being other effects. The laws of physics and the observers both are part of the output.
Like playing an online roleplaying game and inferring what the program can actually do or what resources it takes, when all you can access is “how high can my character jump” and other in-game rules. The rules regarding the jumping, and any limits the program chose to confer to the jumping behavior are not indicative of the resource requirements and efficiency of the underlying system. Is calculating the jumping easy or hard for the computer? How would you know as a character? The output, again, is a bad judge, take this example:
Imagine using an old Intel 386 system which you rigged into running the latest FPS shooter. It may only output one frame every few hours, but as a sentient character inside that game you wouldn’t notice. Things would be “smooth” for you because the rules would be unchanged from your point of view.
We can only say that given our knowledge of the laws of physics, the TM running the universe doesn’t output anything which seems like an efficient NP-problem solver, whether the program contains one, or the correct hardware abstraction running it uses one, is anyone’s guess. (The “contains one” probably isn’t anyone’s guess because of Occam’s Razor considerations.)
If this is all confused (it may well be, was mostly a stray thought), I’d appreciate a refutation.
If I understand correctly you’re saying that what is efficiently computable within a universe is not necessarily the same as what is efficiently computable on a computer simulating that universe. That is a good point.
Exactly. Thanks for succinctly expressing my point better than I could.
The question is whether assuming a correspondence as a somewhat default case (implied by the “not necessarily”) is even a good default assumption.
Why would the rules inherent in what we see inside the universe be any more indicative of the rules of the computer simulating that universe than the rules inside a computer game are reflective of the instruction set of the CPU running it (they are not)?
I am aware that the reference class “computer running super mario brother / kirby’s dream land” implies for the rules to be different, but on what basis would we choose any reference class which implies a correspondence?
Also, I’m not advocating simulationism with this per se, the “outer” computer can be strictly an abstraction.
Finding new useful proteins tends to occur via point mutations so that can’t be NP-complete either.
This does not follow. It may be that finding new useful proteins just takes a very long time and is very inefficient. The rest of your comment seems correct though.
Actually, even that understates the argument. If you can take a 20,000 base sequence and get something useful by point-perturbing at least one of the 20,000 bases in the sequence, then whatever just happened was 50 lightyears from being NP-hard—you only had to search through 19 variants on each of 20,000 cases.
Huh? How does this argument work? That doesn’t mean that evolution can’t happen in real life, it would be a reason to think that evolution is very slow (which it is!) or that evolution is missing a lot of interesting proteins (which seems plausible).
If you can take a 20,000 base sequence and get something useful by point-perturbing at least one of the 20,000 bases in the sequence, then whatever just happened was 50 lightyears from being NP-hard—you only had to search through 19 variants on each of 20,000 cases.
I’m not sure I follow your logic. Are you arguing that because log 20,000 <19? Yes, you can check every possible position in a base sequence this way, but there are still a lot more proteins than those 19. One doesn’t get something special from just changing a specific base. Moreover, even if something interesting does happen for changing a specific one, it might not happen if one changes some other one.
missing a lot of interesting proteins (which seems plausible).
Definitely, since evolution keeps introducing new interesting proteins.
That doesn’t mean that evolution can’t happen in real life, it would be a reason to think that evolution is very slow (which it is!)
But it’s not slow on a scale of e^n for even modestly large n. If you can produce millions of proteins with hundreds to thousands of amino acids in a few billion years, then approximate search for useful proteins is not inefficient like finding the lowest-energy conformation is (maybe polynomial approximation, or the base is much better, or functional chunking lets you effectively reduce n greatly...).
[...evolution is] missing a lot of interesting proteins (which seems plausible).
Definitely, since evolution keeps introducing new interesting proteins.
Wait, the fact the evolution is often introducing a interesting new proteins is evidence that evolution is missing a lot of interesting proteins? How does that follow?
Switch the scenario around: if evolution never produced interesting new proteins (anymore, after time T), would that be evidence that there are no other interesting proteins than what evolution produced?
Switch the scenario around: if evolution never produced interesting new proteins (anymore, after time T), would that be evidence that there are no other interesting proteins than what evolution produced?
Yes.
That would be evidence that the supply of interesting proteins had been exhausted, just as computer performance at tic-tac-toe and checkers has stopped improving. I don’t see where you’re coming from here.
Because evolution can’t get stuck in the domain of attraction of a local optimum? It always finds any good points?
Edit to add: Intelligent humans can quickly refactor their programs out of poor regions of designspace. Evolution must grope within its neighborhood.
2nd Edit: How about this argument:
“Evolution has stopped producing interesting new ways of flying; therefore, there are probably no other interesting ways of accomplishing flight, since after all, if there were a good way of doing it, evolution would find it.”
Point mutations aren’t the only way for new things to be produced. You can also recombine large chunks and domains together from multiple previous genes.
Hell, there are even examples of genes evolving via a frame-shift that knocks the 3-base frame of a gene off by one producing a gobbeldygook protein that selection then acts upon...
Yes, we can be definitely confident that there are more interesting proteins in the vicinity because of continuing production. We have less evidence about more distant extrapolations, although they could exist too.
It’s just that, from the context, you seemed to be making a claim about evolution’s ability to find all cool proteins, rather than just the ones within organisms’ local search neighborhood (which would thus be within evolution’s reach).
Hence why you appeared, from my reading, to be making the common mistake of attributing intelligence (and global search capabilities) to evolution, which it definitely does not have.
This insinuation was compounded by your comparison to human-intelligence-designed game algorithms, further making it sound like you attributed excessive search capability to evolution.
(And I’m a little scared, to be honest, that the linked comment got several upvotes.)
If you actually recognize the different search capabilities of evolution version more intelligent algorithms, I suggest you retract, or significantly revise, the linked comment.
20 standard amino acids, so 19 * 20,000 one-amino-acid-changed variants. If you can find something by searching 380,000 cases, it wasn’t an NP-hard problem.
EDIT: Actually, since I originally said 20,000 bases, it should just be 3 variants per base (4 standard DNA bases). I don’t know if there’s any significant 20,000-unit peptides.
20 amino acids, so 19 * 20,000 one-amino-acid-changed variants.
Ok. Got it.
If you can find something by searching 380,000 cases, it wasn’t an NP-hard problem.
This does not follow. Instances of NP-hard problems can be quite easy. For example, consider the traveling salesman with 2 cities or SAT with 2 or 3 variables. These are easy enough to solve by hand. But the general problems are still NP-hard.
Note incidentally, that part of why proteins are difficult is that even when you do have all the variations, how they behave in the actual living thing is really complicated. You can’t easily grow thousands of mice each with each small protein difference and see what happens. So there are a lot of associated problems with brute forcing this sort of thing.
Sure, but standard padding arguments can make NP-hard problems with much larger initial easy instances. It may be that there’s some general heuristic that real life, non-constructed problems never act this way, but that’s a distinct claim than what Eliezer seems to be making here.
If you can find something by searching 380,000 cases, it wasn’t an NP-hard problem.
The “NP” terminology is typically not a reference to hard problems are on an absolute scale. It’s a about how the difficulty of the problem in a specified class changes as its scale increases. So, there’s no issue with evolution solving particular instances of problems from classes of problem that are NP-hard—and talk about solving a whole class of such problems would be silly—all such classes consist of infinite sets of problems.
For various values of “solve”. NP-hard problems may not be analytically solvable, but numerical approximations can get pretty darn close, certainly enough for evolutionary advantage.
Is this your complete response? I guess I could expand this to “I expect all the problems an AI needs to solve on the way to an intelligence explosion to be easy in principle but hard in practice,” and I guess I could expand your other comments to “the problem sizes an AI will need to deal with are small enough that asymptotic statements about difficulty won’t come into play.” Both of these claims seem like they require justification.
It’s not meant as a response to everything, just noting that protein structure prediction can’t be NP-hard. More generally, I tend to take P!=NP as a background assumption; I can’t say I’ve worried too much about how the universe would look different if P=NP. I never thought superintelligences could solve NP-hard problems to begin with, since they’re made out of wavefunction and quantum mechanics can’t do that. My model of an intelligence explosion just doesn’t include anyone trying to do anything NP-hard at any point, unless it’s in the trivial sense of doing it for N=20 or something. Since I already expect things to local FOOM with P!=NP, adding P=NP doesn’t seem to change much, even if the polynomial itself is small. Though Scott Aaronson seems to think there’d be long-term fun-theoretic problems because it would make so many challenges uninteresting. :)
Suppose there is a single A.I. with a ‘Devote x % of resources to Smartening myself’ directive. Suppose further that the A.I is already operating with Daid Lewis ‘elite eligible’ ways of carving up the World along its joints- i.e. it is climbing the right hill. Presumably, the Smartening module faces a race hazard type problem in deciding whether it is smarter to devote resources to evaluating returns to smartness or to just release resources back to existing operations. I suppose it could internally breed its own heuristics for Karnaugh map type pattern recognition so as to avoid falling into an NP problem. However, if NP hard problems are like predators, there has to be a heuristic to stop the A.I avoiding them to the extent of roaming uninteresting space and breeding only ‘Speigelman monster’ or trivial or degenerate results. In other words the A.I’s ‘smarten yourself’ Module is now doing just enough to justify its upkeep but not so much as to endanger its own survival.
At this point it is enough for there to be some exogenous shock or random discontinuity on the morphology of the fitness landscape for some sort of gender dimorphism and sexual selection to start taking place within the A.I. with speciation events and so on. However, this opens an exploit for parasites- i.e. humans- so FOOM cashes out as …of fuck, it’s the episode of South Park with the cat saying ‘O long Johnson’.
Beenakker solution to Hempel’s dillemma was wrong- http://en.wikipedia.org/wiki/Hempel’s_dilemma- The boundary between physics and metaphysics is NOT the boundary between what can and what cannot be computed in the age of the universe’ because South Park has resolved every philosophical puzzle in the space of what?- a few hundred hours?
Page 4: the sum log(w) + log(log(w)) + … doesn’t converge. Some logarithm will be negative and then the next one will be undefined. Presumably you meant to stop the sum once it becomes negative, but then I’m somewhat confused about this argument because I’m not sure it’s dimensionally consistent (I’m not sure what units cognitive work is being measured in).
Top of page 18: there’s a reference to “this graph” but no graph...?
General comment 1: who’s the intended audience here? Most of the paper reads like a blog post, which I imagine could be disconcerting for newcomers trying to evaluate whether they should be paying attention to MIRI and expecting a typical research paper from a fancy-looking .pdf.
General comment 2: I still think this discussion needs more computational complexity. I brought this up to you earlier and I didn’t really digest your reply. The question of what you can and can’t do with a given amount of computational resources seems highly relevant to understanding what the intelligence explosion could look like; in particular I would be surprised if questions like P vs. NP didn’t have a strong bearing on the distribution over timelines (I expect that the faster it is possible to solve NP-complete problems, which apparently includes protein folding, the faster AI could go foom). But then I’m not a domain expert here and I could be off-base for various reasons.
IIRC, Daniel Dewey said at the April workshop that this is, in fact, his current project.
The reference to “this graph” is a hyperlink. There are many such hyperlinks in the document. They feel rather weird, and are easy to miss, given the generally print-like typesetting. It might be worth writing them all out in something like the form one would use in an ordinary printed document, while preserving their hyperlinkiness.
Protein folding cannot be NP-hard. The physical universe is not known to be able to solve NP-hard problems, and protein folding will not involve new physics.
The physical universe doesn’t need to “solve” protein folding in the sense of having a worst-case polynomial-time algorithm. It just needs to fold proteins. Many NP-complete problems are “mostly easy” with a few hard instances that rarely come up. (In fact, it’s hard to find an NP-complete problem for which random instances are hard: if we could do this, we would use it for cryptography.) It’s reasonable to suppose protein folding is like this.
Of course, if this is the case, maybe the AI doesn’t care about the rare hard instances of protein folding, either.
If we have an NP-complete problem for which random instances are hard, but we can’t generate them with solutions, that doesn’t help cryptography.
Proteins are finite in length. Why would nature care if it can’t do something in polynomial time?
Edit: It would be interested to turn this around- suppose proteins folding IS NP-hard, can we put an upper bound on the length of proteins using evolutionary time scales?
Not really, most big proteins consist of ‘domains’ which fold up pretty independantly of each other (smaller proteins can vary quite a bit though). Titin is a ~30,000 amino acid protein in human muscle with ~500 repeats of the same 3 basic modules all laid out in a line… over evolutionary time you can shuffle these functional units around and make all kinds of interesting combinations.
Actually, the lab I’m working in recently had a problem with this. We optimized a gene to be read extremely fast by a ribosome while still producing exactly the same protein sequence (manipulating synonymous codons). But it turned out that when you have the actual protien molecule being extruded from the ribosome as rapidly as we had induced it to be, the normal independant folding of successive domains was disrupted—one domain didn’t fully fold before the next domain started being extruded, they interacted, and the protein folded all wrong and didn’t work despite having exactly the same sequence as the wild protein.
I think the more important point is that Nature doesn’t care about the worst case (if a protein takes forever to fold correctly then it’s not going to be of any use). But an AI trying to design arbitrary proteins plausibly might.
Why? Nature designed ribosomes without solving any hard cases of NP-hard problems. Why would Ribosomes 2.0, as used for constructing the next generation of molecular machinery, require any NP-hard problems?
Is that so? Pretty much every problem of interest (chess, engineering, etc) is NP-hard, or something on that level of difficulty.
The thing with such problems is that you don’t solve them exactly for optimal, you weaken the problem and solve them heuristically for good. Nature did just this: produced a solution to a weakened NP-hard design problem.
I think at this point, taboo “solved a hard problem”. Nature produced a better replicator, not necessarily the optimal superbest replicator, which it didn’t have to.
Obviously an intelligently designed automated design system could be just as good as dumbass evolution, given sufficient computing power, so I agree that swiftly designing nanomachinery is quite plausible.
(The advantage that evolution has over an engineer in a leaky box with a high discount rate is that there’s a lot going on in molecular dynamics, solutions are dense in the space, as shown by that evolution got some, but that’s no guarantee that a given solution is predictably a solution. So it might be cost prohibitive. (Not that I’m up to date on protein folding.)
In the worst case, the protein folding could be a secure hash, that takes a very expensive high-fidelity simulation to compute. Evolution would do just fine at cracking hashes by brute force, because it bruteforces everything, but an intelligent engineer wouldn’t be able to do much better in this case. It may end up taking too much computation to run such sims. (I should calculate the expected time to nano in this case, it would be a very interesting fermi estimate).
However, it is highly unlikely for lower-fidelity simulations and models to give you literally no information, which would be what is required for the “secure hash” scenario. Even things designed to be secure hashes often lose huge parts of their theoretical complexity after some merely human attacks (eg md5). The mainline case is that there exist reasonable heuristics discoverable by reasonable amounts of intelligent computation. (likewise should fermi here, but would need some harder assumptions)
Of course you already knew all this… (But I didn’t))
So yeah, this just made nano seem a bit more real to me.
Then it would be harder, in fact impossible, to end up with slightly better proteins via point mutations. A point mutation in a string gives you a completely different secure hash of that string.
This isn’t a minor quibble, it’s a major reason to go “Whaa?” at the idea that protein folding and protein design have intractable search spaces in practice. They have highly regular search spaces in practice or evolution couldn’t traverse that space at all.
Yep, it’s highly implausible that a natural non-designed process would happen to be a secure hash (as you know from arguing with cryonics skeptics). And that’s before we look at how evolution works.
Good point. The search space is at least smooth once in a few thousand tries. (While doing the nearby fermi estimate, I saw a result that 12% (!!!) of point mutations in some bacteria were beneficial).
That said, the “worst possible case” is usually interesting.
Isn’t that in large part a selection effect? After decades having computers, most of the low hanging fruit has been picked, and so many unsolved problems are NP-hard. But many equally important problems have been solved because they weren’t.
Lets go!
From Wik:
Skimmed the paper looks like they used the rosetta@home network (~9 TFLOPS) to design a rudimentary enzyme.
So that suggests that a small amount of computation (bearable time by human research standards, allowing for fuckups and restarts) can do protein design. Let’s call it a week of computation total. There’s 1e6 seconds in a week, flopping at a rate of 1e13 flops, giving us 1e19 flops.
They claimed to have tested 1e18 somethings, so our number is plausible, but we should go to at least 1e22 flops to include 1e4 flops per whatever. (which would take a thousand weeks?) something doesn’t add up. Whatever, call it 1e20 (ten weeks) and put some fat error bounds on that.
Don’t know how to deal with the exponential complexity. A proper nanothing could require 1e40 flops (square the exponent for double complexity), or it may factor nicely, requiring only 1e21 flops.
Let’s call it 1e25 flops with current techniques to design nanotech.
If AI is in 20 years, that’s 13 moores doublings or 1e4, then let’s say the AI can seize a network of as much computational power as they used, plus moores scaling.
So 1e21 todayflops, 1e20 of which is doable in a standard research project amount of time with a large distributed network.
So anywhere from days to 20 years, with my numbers giving 2 years, to brute force nanotech on 20-years-in-future computational power with today’s algorithms.
Factor of 1e6 speedups are reasonable in chess (another problem with similar properties) with a bunch of years of human research, so that puts my middle at 10 minutes.
The AI will probably do better than that, but that would be good enough to fuck us.
This was somewhat conservative, even. (nanotech involves 100000 times more computation than these guys used)
Let’s get this thing right the first time....
EDIT: an interesting property of exponential processes is that things go from “totally impossible” to “trivial” very quickly.
Note that by these estimates, humans should be able to have nano around 2030. Scary stuff.
I can feel some inferential distance here that isn’t successfully being bridged. It’s far from clear to me that the default assumption here should be that no NP-hard problems need to be solved and that the burden of proof is on those who claim otherwise.
I guess to me the notion of “solve an NP-hard problem” (for large N and hard cases, i.e., the problem really is NP hard) seems extremely exotic—all known intelligence, all known protein folding, and all known physical phenomena must be proceeding without it—so I feel a bit at a loss to relate to the question. It’s like bringing up PSPACE completeness—I feel a sense of ‘where did that come from?’ and find it hard to think of what to say, except for “Nothing that’s happened so far could’ve been PSPACE complete.”
Agreed if you mean “Nothing that’s happened so far could’ve been [computationally hard] to predict given the initial conditions.”
But the reverse problem—finding initial conditions that produce a desired output—could be very hard. Nature doesn’t care about this, but an AI plausibly might.
I’m not sure how protein folding fits into this picture, to be honest. (Are people just trying to figure out what happens to a given protein in physics, or trying to find a protein that will make something good happen?) But more generally, the statement “P=NP” is more or less equivalent to “The reverse problem I mention above is always easy.” Things become very different if this is true.
Here’s a paper claiming that the most widely studied model of protein folding in 1998 is NP-Complete. I don’t know enough about modern research into protein folding to comment how applicable that result still is.
My guess is you’re referring to Aaronson’s paper, which doesn’t seem relevant here. The universe doesn’t solve NP-hard problems in P time, but the universe took NP time to build the first useful proteins, didn’t it?
The solution space is large enough that even proteins sampling it’s points at a rate of trillions per second couldn’t really fold if they were just searching randomly through all possible configurations, that would be NP complete. They don’t actually do this of course. Instead they fold piece by piece as they are produced, with local interactions forming domains which tend to retain their approximate structure once they come together to form a whole protein. They don’t enter the lowest possible energy state therefore. Prion diseases are an example of what can happen when proteins enter a normally inaccessible local energy minimum, which in that case happens to have a snowballing effect on other proteins.
The result is that they follow grooves in the energy landscape towards an energy well which is robust enough to withstand all sorts of variation, including the horrific inaccuracies of our attempts at modeling. Our energy functions are just very crude approximations to the real one, which is dependent on quantum level effects and therefore intractable. Another issue is that proteins don’t fold in isolation—they interact with chaperone proteins and all sorts of other crap. So simulating their folding might require one to simulate a LOT of other things besides just the protein in question.
Even our ridiculously poor attempts at in silico folding are not completely useless though. They can even be improved with the help of the human brain (see Foldit). I think an A.I. should make good progress on the proteins that exist. Even if it can’t design arbitrary new ones from scratch, intelligent modification of existing ones would likely be enough to get almost anything done. Note also that an A.I. with that much power wouldn’t be limited to what already exists, technology is already in the works to produce arbitrary non-protein polymers using ribosome like systems, stuff like that would open up an unimaginably large space of solutions that existing biology doesn’t have access to.
Right, this gets called Levinthal’s Paradox.
Right. I’m actually not sure how relevant it all is to discussions of an A.I. trying to get arbitrary things done with proteins. Folding an existing protein may be a great deal easier than finding a protein which folds into an arbitrary shape. Probably not all shapes are allowed by the physics of the problem. Evolution can’t really be said to solve that problem either. It just produces small increments in fitness. Otherwise organisms’ proteomes would be a lot more efficient.
Although on second thought, an A.I. would probably just be able to design a protein with a massively low energy funnel, so that even if it couldn’t simulate folding perfectly, it could still get things done.
Regardless, an imperfect solution would probably suffice for world domination...
I believe I’ve seen that discussed before and the answer is just that in real life, proteins don’t fold into the lowest-energy conformation. It’s like saying that the global minimum energy for soap bubbles is NP-complete. Finding new useful proteins tends to occur via point mutations so that can’t be NP-complete either.
So, I can think of several different things that could all be the ‘protein folding problem.’
Figure out the trajectory an unfolded protein takes towards a folded protein, with a known starting state. (P)
Given a known folded protein, find local minima that unfolded proteins starting with random start states might get stuck in. (NP, probably)
Given a desired chemical reaction, find a folded protein that will catalyze the reaction. (Not sure, probably NP.)
Given a desired chemical reaction, find a folded protein that will catalyze the reaction that is the local minimum reached by most arbitrary unfolded positions for that protein (Optimal is definitely NP, but I suspect feasible is too.)
Others. (Here, the closest I get to molecular nanotech is ‘catalyze reactions,’ but I imagine the space for ‘build a protein that looks like X’ might actually be smaller.)
It looks to me like the problems here that have significant returns are NP.
It’s not at all clear to me what you mean by this. I mean, take the traveling salesman problem. It’s NP-Hard*, but you can get decent solutions by using genetic algorithms to breed solutions given feasible initial solutions. Most improvements to the route will be introduced by mutations, and yet the problem is still NP-hard.
That is, it’s not clear to me that you’re differentiating between the problem of finding an optimal solution being NP hard, it taking NP time to find a ‘decent’ solution, and an algorithm requiring NP time to finish running.
(The second is rarely true for things like the traveling salesman problem, but is often true for practical problems where you throw in tons of constraints.)
* A variant is NP-Complete, which is what I originally wrote.
Nothing that has physically happened on Earth in real life, such as proteins folding inside a cell, or the evolution of new enzymes, or hominid brains solving problems, or whatever, can have been NP-hard. Period. It could be a physical event that you choose to regard as a P-approximation to a theoretical problem whose optimal solution would be NP-hard, but so what, that wouldn’t have anything to do with what physically happened. It would take unknown, exotic physics to have anything NP-hard physically happen. Anything that could not plausibly have involved black holes rotating at half the speed of light to produce closed timelike curves, or whatever, cannot have plausibly involved NP-hard problems. NP-hard = “did not physically happen”. “Physically happened” = not NP-hard.
I’ve seen you say this a couple of times, and your interlocutors seem to understand you, even when they dispute your conclusion. But my brain keeps returning an error when I try to parse your claim.
Read literally, “NP-hard” is not a predicate that can be meaningfully applied to individual events. So, in that sense, trivially, nothing that happens (physically or otherwise, if “physically” is doing any work here) can be NP-hard. But you are evidently not making such a trivial claim.
So, what would it look like if the physical universe “solved an NP-hard problem”? Presumably it wouldn’t just mean that some actual salesman found a why to use existing airline routes to visit a bunch of pre-specified cities without revisiting any one of them. Presumably it wouldn’t just mean that someone built a computer that implements a brute-force exhaustive search for a solution to the traveling salesman problem given an arbitrary graph (a search that the computer will never finish before the heat death of the universe if the example is large). But I can’t think of any other interpretation to give to your claim.
ETA: this is a side point.
Here’s Scott Aaronson describing people (university professors in computer science and cognitive science at RPI) who claim that the physical universe efficiently solves NP-hard problems:
In other news, Bringsjord also claims to show by a modal argument, similar to the theistic modal argument (which he also endorses), that human brains are capable of hypercomputation: “it’s possible humans are capable of hypercomputation, so they are capable of hypercomputation.” For this reason he argues that superhumanly intelligent Turing machines/Von Neumann computers are impossible and belief in their possibility is fideistic.
This doesn’t refute what you are responding to. Saying the universe can’t solve a general NP problem in polynomial time is not the same thing as saying the universe cannot possibly solve specific instances of generally NP-complete problems, which is Tyrrell_McAllister’s point, as far as I can parse. In general, the traveling salesman is NP-complete, however there are lots of cases where heuristics get the job done in polynomial time, even if those heuristics would run-away if they were given the wrong case.
To use Aaronson’s soap bubbles, sometimes the soap bubble finds a Steiner tree, sometimes it doesn’t. When it DOES, it has solved one instance of an NP-complete problem fairly quickly.
Existence of God, not existence of god [or gods]; my sincere apologies for pedantry, it’s just that LW folk generally misunderstand theology and so it seems important to be especially careful when discussing it. The distinction between God and god is quite important and the words look so similar, it’s unfortunate. ETA: Retracting because Carl edited the comment. (Thanks Carl!)
In many contexts where a quite important distinction between two terms exists, it is helpful to say what it is, rather than simply assert that there is one.
Admittedly, I have no reason to believe this is one such context.
I’m pretty sure the distinction Will was making was as follows.
A “god” is a being with superhuman powers and so forth, possibly existing somehow “on another plane of existence”, but still in some sense the same kind of thing as us: another thing among all the things there are.
“God” is not in any way the same sort of thing as we are. God is not just “another thing” but, in theologian-speak, the Ground Of All Being, a sort of precondition for everything else.
However, it’s possible that he’s making a different distinction, wherein “God” means a unique divine being of infinite power who created everything and “god” means lesser beings like the Greek gods.
(If he’s making the first distinction, I would remark that (1) most theists’ notions of God do in fact fall under the description of “god”, and (2) the radically different notion of “God” is of very doubtful coherence. If he’s making the second, I’m unconvinced that the distinction is as important as he suggests it is.)
(Hastily written:) I agree with your (1), but think that’s all the more reason to make clear distinctions instead of further muddying the epistemic waters; it’s much as if creationists didn’t bother to distinguish between something like Pokemon evolution and actual evolutionary biology, because after all most evolutionists can’t tell the difference.
Mildly disagree with your (2): I can see how the coherence of the God idea is somewhat doubtful, but there aren’t actually any overwhelmingly strong arguments in terms of metaphysics that that is the case, and most atheists take a different route by more or less rejecting all of metaphysics and instead placing emphasis on epistemology. (Then there are weaksauce theists like Kant and William James to argue with but I don’t think that’s as challenging.) Although I’m sympathetic to skepticism of metaphysics we should keep in mind that the obvious attempts to banish it have failed (e.g. logical positivism), and we should also keep in mind that though LessWrong is (somewhat justifiably) allergic to the word “metaphysics”, metaphysics actually shows up here quite a bit in the guise of computationalism/simulationism and in some semi-epistemic rules like Eliezer’s GAZP. So to reject metaphysics entirely would be inconsistent; from there, charitably engaging the actual metaphysical arguments of philosophical theists would be necessary, and I see this done very rarely.
In the meantime I think assigning probabilities below, say, 1% to philosophical theism would be premature, especially when the motivations for doing so seem largely to be desires to reverse the stupidity of religious thinking, when philosophical theism stems from Socrates, Plato, and Aristotle and isn’t easily dismissed as the rationalizations of a Christian hegemony the way that most atheists seem to assume in practice.
(ETA: It occurs to me that David Chalmers, who is LW-friendly and the editor of Metametaphysics, would be a good person to ask about the tenability of philosophical theism, from a metaphilosophical perspective. I might send him an email / LW message.)
Did you ever end up doing this, and if so, would you mind sharing the response?
I agree with your parse error. It looks like EY has moved away from the claim made in the grandparent, though.
That seems a little strongly put—NP-hard scales very poorly, so no real process can take N up to large numbers. I can solve the traveling salesman problem in my head with only modest effort if there are only 4 stops. And it’s trivial if there are 2 or 3 stops.
Conceded.
Um… doesn’t it take exponential time in order to simulate quantum mechanics on a classical computer?
Yes (At least that’s the general consensus among complexity theorists, though it hasn’t been proved.) This doesn’t contradict anything Eliezer said in the grandparent. The following are all consensus-but-not-proved:
P⊂BQP⊂EXP
P⊂NP⊂EXP
BQP≠NP (Neither is confidently predicted to be a subset of the other, though BQP⊂NP is at least plausible, while NP⊆BQP is not.)
If you don’t measure any distinctions finer than P vs EXP, then you’re using a ridiculously coarse scale. There are lots of complexity classes strictly between P and EXP, defined by limiting resources other than time-on-a-classical-computer. Some of them are tractable under our physics and some aren’t.
Is that just that we don’t know any better algorithms, or is there a proof that exptime is needed?
I really don’t know; some Wikipedia browsing suggests that there’s a proof, but I’d rather have a statement from someone who actually knows.
I don’t understand why you think new physics is required to solve hard instances of NP-complete problems. We routinely solve the hard instances of NP-hard problems in practice on computers—just not on large instances of the problem. New physics might be required to solve those problems quickly, but if you are willing to wait exponentially long, you can solve the problems just fine.
If you want to argue that actual practical biological folding of proteins isn’t NP-hard, the argument can’t start from “it happens quickly”—you need to say something about how the time to fold scales with the length of the amino acid strings, and in particular in the limit for very large strings.
Similarly, I don’t see why biological optimization couldn’t have solved hard cases of NP-compete problems. If you wait long enough for evolution to do its thing, the result could be equivalent to an exhaustive search. No new physics required.
Eliezer already conceded that trivial instances of such problems can be solved. (We can assume that before he made that concession he thought it went without saying.)
The physics and engineering required to last sufficiently long may be challenging. I hear it gets harder to power computers once the stars have long since burned out. As far as I know the physics isn’t settled yet.
(In other words, I am suggesting that “just fine” is an something of an overstatement when it comes to solving seriously difficult problems by brute force.)
That counterargument is a bit too general, since it applies not only to NP problems, but even to P problems (such as deciding whether a number is the GCD of two other numbers), or even any arbitrary algorithm modified by a few lines of codes such that its result is unaffected, merely delayed until after the stars burned out, or whatever limit we postulate.
For NP problems and e.g. P problems both, given how we understand the universe, there is only a finite number of inputs in both cases which are tractable, and an infinite number of inputs which aren’t. Though the finite number is well different for both, as a fraction of all “possible”, or rather well-defined (let’s avoid that ambiguity cliff) inputs, it would be the same.
Cue “We all live in a Finite State Machine, Finite State Machine, Finite State Machine …”
The point can’t be confined to “trivial instances”. For any NP-complete problem on some reasonable computing platform that can solve small instances quickly, there will be instance sizes that are non-trivial (take appreciable time to solve) but do not require eons to solve. There is absolutely no mathematical reason for assuming that for “natural” NP-complete problems, interesting-sized instances can’t be solved on a timescale of months/years/centuries by natural processes.
The dichotomy between “trivial” and “impossible to solve in a useful time-frame” is a false one.
Presumably quantum suicide is a part of “whatever”.
It occurs to me that we can’t really say, since we only have access to the time of the program, which may or may not reflect the actual computational resources expended.
Imagine you were living in a game, and trying to judge the game’s hardware requirements. If you did that by looking at a clock in the game, you’d need to assume that the clock is synchronized to the actual system time. If you had a counter you increased, you wouldn’t be able to say from inside the program every which step you get to that counter++ instruction.
The problem being that we don’t have access to anything external, we aren’t watching the Turing Machine compute, we are inside the Turing Machine “watching” the effects of other parts of the program, such as a folding protein (observing whenever it’s our turn to be simulated). We don’t, however, see the Turing Machine compute, we only see the output. The raw computing power / requirements “behind the scenes”, even if such a behind the scenes is only a non-existent abstraction, is impossible to judge with certainty, similar to a map-territory divide. Since there is no access in principle, we cannot observe anything but the “output”, we have no way of verifying any assumptions about a correspondence between “game timer” and “system timer” we may make, or of devising any experiments.
Even the recently linked “test the computational limits” doesn’t break the barrier, since for all we know the program may stall, and the next “frame” it outputs may still seem consistent, with no stalling, when viewed from inside the program, which we are. We wouldn’t subjectively realize the stall. If such an experiment did find something, it would be akin to a bug, not to a measurement of computational resources expended.
Back to diapers.
That’s a valid point, but it does presuppose exotic new physics to make that substrate, in which “our” time passes arbitrarily slowly compared to the really real time, so that it can solve NP-hard problems between our clock ticks. We would, in effect be in a simulation. Evidence of NP-hard problems actually being solved in P could be taken as evidence that we are in one.
If we assume that protein folding occurs according to the laws of quantum mechanics, then it shouldn’t tell us anything about the computational complexity of our universe besides what quantum mechanics tells us, right?
Well, yea that’s what I’m leaning towards. The laws of physics themselves need not govern the machine (Turing or otherwise), they are effects we observe, us being other effects. The laws of physics and the observers both are part of the output.
Like playing an online roleplaying game and inferring what the program can actually do or what resources it takes, when all you can access is “how high can my character jump” and other in-game rules. The rules regarding the jumping, and any limits the program chose to confer to the jumping behavior are not indicative of the resource requirements and efficiency of the underlying system. Is calculating the jumping easy or hard for the computer? How would you know as a character? The output, again, is a bad judge, take this example:
Imagine using an old Intel 386 system which you rigged into running the latest FPS shooter. It may only output one frame every few hours, but as a sentient character inside that game you wouldn’t notice. Things would be “smooth” for you because the rules would be unchanged from your point of view.
We can only say that given our knowledge of the laws of physics, the TM running the universe doesn’t output anything which seems like an efficient NP-problem solver, whether the program contains one, or the correct hardware abstraction running it uses one, is anyone’s guess. (The “contains one” probably isn’t anyone’s guess because of Occam’s Razor considerations.)
If this is all confused (it may well be, was mostly a stray thought), I’d appreciate a refutation.
If I understand correctly you’re saying that what is efficiently computable within a universe is not necessarily the same as what is efficiently computable on a computer simulating that universe. That is a good point.
Exactly. Thanks for succinctly expressing my point better than I could.
The question is whether assuming a correspondence as a somewhat default case (implied by the “not necessarily”) is even a good default assumption.
Why would the rules inherent in what we see inside the universe be any more indicative of the rules of the computer simulating that universe than the rules inside a computer game are reflective of the instruction set of the CPU running it (they are not)?
I am aware that the reference class “computer running super mario brother / kirby’s dream land” implies for the rules to be different, but on what basis would we choose any reference class which implies a correspondence?
Also, I’m not advocating simulationism with this per se, the “outer” computer can be strictly an abstraction.
This does not follow. It may be that finding new useful proteins just takes a very long time and is very inefficient. The rest of your comment seems correct though.
Then evolution wouldn’t happen in real life.
Actually, even that understates the argument. If you can take a 20,000 base sequence and get something useful by point-perturbing at least one of the 20,000 bases in the sequence, then whatever just happened was 50 lightyears from being NP-hard—you only had to search through 19 variants on each of 20,000 cases.
Huh? How does this argument work? That doesn’t mean that evolution can’t happen in real life, it would be a reason to think that evolution is very slow (which it is!) or that evolution is missing a lot of interesting proteins (which seems plausible).
I’m not sure I follow your logic. Are you arguing that because log 20,000 <19? Yes, you can check every possible position in a base sequence this way, but there are still a lot more proteins than those 19. One doesn’t get something special from just changing a specific base. Moreover, even if something interesting does happen for changing a specific one, it might not happen if one changes some other one.
Definitely, since evolution keeps introducing new interesting proteins.
But it’s not slow on a scale of e^n for even modestly large n. If you can produce millions of proteins with hundreds to thousands of amino acids in a few billion years, then approximate search for useful proteins is not inefficient like finding the lowest-energy conformation is (maybe polynomial approximation, or the base is much better, or functional chunking lets you effectively reduce n greatly...).
Wait, the fact the evolution is often introducing a interesting new proteins is evidence that evolution is missing a lot of interesting proteins? How does that follow?
Switch the scenario around: if evolution never produced interesting new proteins (anymore, after time T), would that be evidence that there are no other interesting proteins than what evolution produced?
Yes.
That would be evidence that the supply of interesting proteins had been exhausted, just as computer performance at tic-tac-toe and checkers has stopped improving. I don’t see where you’re coming from here.
Because evolution can’t get stuck in the domain of attraction of a local optimum? It always finds any good points?
Edit to add: Intelligent humans can quickly refactor their programs out of poor regions of designspace. Evolution must grope within its neighborhood.
2nd Edit: How about this argument:
“Evolution has stopped producing interesting new ways of flying; therefore, there are probably no other interesting ways of accomplishing flight, since after all, if there were a good way of doing it, evolution would find it.”
Point mutations aren’t the only way for new things to be produced. You can also recombine large chunks and domains together from multiple previous genes.
Hell, there are even examples of genes evolving via a frame-shift that knocks the 3-base frame of a gene off by one producing a gobbeldygook protein that selection then acts upon...
Carl wasn’t commenting on whether it would be very strong evidence but whether it would be evidence.
Yes, we can be definitely confident that there are more interesting proteins in the vicinity because of continuing production. We have less evidence about more distant extrapolations, although they could exist too.
That makes a lot more sense.
It’s just that, from the context, you seemed to be making a claim about evolution’s ability to find all cool proteins, rather than just the ones within organisms’ local search neighborhood (which would thus be within evolution’s reach).
Hence why you appeared, from my reading, to be making the common mistake of attributing intelligence (and global search capabilities) to evolution, which it definitely does not have.
This insinuation was compounded by your comparison to human-intelligence-designed game algorithms, further making it sound like you attributed excessive search capability to evolution.
(And I’m a little scared, to be honest, that the linked comment got several upvotes.)
If you actually recognize the different search capabilities of evolution version more intelligent algorithms, I suggest you retract, or significantly revise, the linked comment.
20 standard amino acids, so 19 * 20,000 one-amino-acid-changed variants. If you can find something by searching 380,000 cases, it wasn’t an NP-hard problem.
EDIT: Actually, since I originally said 20,000 bases, it should just be 3 variants per base (4 standard DNA bases). I don’t know if there’s any significant 20,000-unit peptides.
Ok. Got it.
This does not follow. Instances of NP-hard problems can be quite easy. For example, consider the traveling salesman with 2 cities or SAT with 2 or 3 variables. These are easy enough to solve by hand. But the general problems are still NP-hard.
Note incidentally, that part of why proteins are difficult is that even when you do have all the variations, how they behave in the actual living thing is really complicated. You can’t easily grow thousands of mice each with each small protein difference and see what happens. So there are a lot of associated problems with brute forcing this sort of thing.
Removed for duplication with comment elsewhere that discussed padding.
Sure, but standard padding arguments can make NP-hard problems with much larger initial easy instances. It may be that there’s some general heuristic that real life, non-constructed problems never act this way, but that’s a distinct claim than what Eliezer seems to be making here.
The “NP” terminology is typically not a reference to hard problems are on an absolute scale. It’s a about how the difficulty of the problem in a specified class changes as its scale increases. So, there’s no issue with evolution solving particular instances of problems from classes of problem that are NP-hard—and talk about solving a whole class of such problems would be silly—all such classes consist of infinite sets of problems.
For various values of “solve”. NP-hard problems may not be analytically solvable, but numerical approximations can get pretty darn close, certainly enough for evolutionary advantage.
Is this your complete response? I guess I could expand this to “I expect all the problems an AI needs to solve on the way to an intelligence explosion to be easy in principle but hard in practice,” and I guess I could expand your other comments to “the problem sizes an AI will need to deal with are small enough that asymptotic statements about difficulty won’t come into play.” Both of these claims seem like they require justification.
It’s not meant as a response to everything, just noting that protein structure prediction can’t be NP-hard. More generally, I tend to take P!=NP as a background assumption; I can’t say I’ve worried too much about how the universe would look different if P=NP. I never thought superintelligences could solve NP-hard problems to begin with, since they’re made out of wavefunction and quantum mechanics can’t do that. My model of an intelligence explosion just doesn’t include anyone trying to do anything NP-hard at any point, unless it’s in the trivial sense of doing it for N=20 or something. Since I already expect things to local FOOM with P!=NP, adding P=NP doesn’t seem to change much, even if the polynomial itself is small. Though Scott Aaronson seems to think there’d be long-term fun-theoretic problems because it would make so many challenges uninteresting. :)
Suppose there is a single A.I. with a ‘Devote x % of resources to Smartening myself’ directive. Suppose further that the A.I is already operating with Daid Lewis ‘elite eligible’ ways of carving up the World along its joints- i.e. it is climbing the right hill. Presumably, the Smartening module faces a race hazard type problem in deciding whether it is smarter to devote resources to evaluating returns to smartness or to just release resources back to existing operations. I suppose it could internally breed its own heuristics for Karnaugh map type pattern recognition so as to avoid falling into an NP problem. However, if NP hard problems are like predators, there has to be a heuristic to stop the A.I avoiding them to the extent of roaming uninteresting space and breeding only ‘Speigelman monster’ or trivial or degenerate results. In other words the A.I’s ‘smarten yourself’ Module is now doing just enough to justify its upkeep but not so much as to endanger its own survival. At this point it is enough for there to be some exogenous shock or random discontinuity on the morphology of the fitness landscape for some sort of gender dimorphism and sexual selection to start taking place within the A.I. with speciation events and so on. However, this opens an exploit for parasites- i.e. humans- so FOOM cashes out as …of fuck, it’s the episode of South Park with the cat saying ‘O long Johnson’. Beenakker solution to Hempel’s dillemma was wrong- http://en.wikipedia.org/wiki/Hempel’s_dilemma- The boundary between physics and metaphysics is NOT the boundary between what can and what cannot be computed in the age of the universe’ because South Park has resolved every philosophical puzzle in the space of what?- a few hundred hours?