I am intrigued by the estimate for the difficulty of recapitulating evolution. Bostrom estimates 1E30 to 1E40 FLOPS-years. A conservative estimate for the value of a successful program to recapitulate evolution might be around $500B. That is enough to buy something like 10k very large supercomputers for a year, which gets you something like 1E20 FLOPS-years. So the gap is between 10 and 20 orders of magnitude. In 35 years, this gap would fall to 5 to 15 orders of magnitude (at the current rate of progress in hardware, which seems likely to slow).
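To make the arithmetic explicit, here is a rough back-of-the-envelope version in Python; the per-machine cost and FLOPS figures are round-number assumptions of mine, not Bostrom's, chosen only to reproduce the ballpark above.

```python
import math

# Back-of-envelope for the compute gap; price and FLOPS figures are assumptions.
budget_usd = 500e9                  # assumed value of a successful project
cost_per_machine_year_usd = 50e6    # assumed cost of one very large supercomputer for a year
flops_per_machine = 1e16            # assumed sustained FLOPS per machine (~10 PFLOPS)

machines = budget_usd / cost_per_machine_year_usd      # ~10k machines
flops_years_available = machines * flops_per_machine   # ~1e20 FLOPS-years

required_low, required_high = 1e30, 1e40               # Bostrom's range

gap_low = math.log10(required_low / flops_years_available)    # ~10 orders of magnitude
gap_high = math.log10(required_high / flops_years_available)  # ~20 orders of magnitude

# If hardware price-performance keeps doubling every ~2 years (optimistic),
# 35 years buys roughly 35/2 doublings, i.e. ~5 orders of magnitude.
improvement = (35 / 2) * math.log10(2)

print(f"gap today: {gap_low:.0f} to {gap_high:.0f} orders of magnitude")
print(f"gap in 35 years: {gap_low - improvement:.0f} to {gap_high - improvement:.0f}")
```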
One reason this possibility is important is that it seems to offer one of the strongest possible environments for a disruptive technological change.
This seems sensible as a best guess, but it is interesting to think about scenarios where it turns out to be surprisingly easy to simulate evolution. For example, if there were a 10% chance of this project being economically feasible within 20 years, that would be an extremely interesting fact, and one that might affect my views on the plausibility of AI arriving soon. (Not necessarily because such an evolutionary simulation per se is likely to occur, but mostly because it says something about the overall difficulty of building up an intelligence by brute-force search.)
But it is easy to see how this estimate might be many orders of magnitude too high (and also how it could be quite a bit too low, but it is interesting to look at the low tail in particular):
It may be that you can memoize much of the effort of fitness evaluation, evaluating the fitness of many similar organisms in parallel. This trick appears to be unavailable to evolution, but it is the key idea on which modern approaches to training neural nets rely. Gradient descent plus backpropagation gives you a large speedup for training neural nets, as much as linear in the number of parameters you are training. In the case of evolution, this alone could produce a speedup of 10 orders of magnitude.
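As a toy illustration of why gradient information buys you roughly a factor of the number of parameters over blind mutation, here is a minimal sketch on an arbitrary quadratic objective; it is not meant to model evolution, only to show how random search degrades with dimension while gradient descent does not.

```python
import numpy as np

def loss(x):
    # Toy objective: squared distance from the optimum at the origin.
    return float(np.dot(x, x))

def random_mutation_search(n, steps, sigma=0.01, seed=0):
    # Blind search: propose a random perturbation, keep it only if it helps.
    rng = np.random.default_rng(seed)
    x = np.ones(n)
    best = loss(x)
    for _ in range(steps):
        candidate = x + sigma * rng.standard_normal(n)
        if loss(candidate) < best:
            x, best = candidate, loss(candidate)
    return best

def gradient_descent(n, steps, lr=0.1):
    # The gradient of ||x||^2 is 2x; every parameter improves on every step.
    x = np.ones(n)
    for _ in range(steps):
        x = x - lr * 2 * x
    return loss(x)

for n in (10, 100, 1000):
    print(n, random_mutation_search(n, steps=1000), gradient_descent(n, steps=1000))
```

For the same budget of steps, gradient descent reaches the optimum regardless of dimension, while the mutation-only search falls further behind as the number of parameters grows.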
It may be possible to evaluate the fitness of an organism radically faster than nature reveals it. For example, it would not be too surprising if you could short-circuit most of the work of development. Moreover, nature doesn't get very high-fidelity estimates of fitness, and it wouldn't be at all surprising to me if it were possible to get a comparably good estimate of a human's fitness over the course of an hour in a carefully designed environment (a human generation takes decades, on the order of 1E5 hours, so this is a speedup of about 5 orders of magnitude over the default).
It seems plausible that mutation in nature does not search the space as effectively as human engineers could, even with a relatively weak understanding of intelligence or evolution. It would not be surprising to me if you could conduct the search a few orders of magnitude faster merely by using optimal mutation rates, using fitness estimates that incorporate historical performance, choosing a slightly better distribution of mutations, or whatever.
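As a minimal sketch of what "slightly smarter search" could mean, here is a toy (1+1) evolution strategy that adapts its mutation rate with the classic one-fifth success rule; the objective, constants, and population scheme are illustrative assumptions, not a proposal for how such a project would actually be run.

```python
import numpy as np

def fitness(genome):
    # Toy stand-in for an expensive fitness evaluation (higher is better).
    return -float(np.dot(genome, genome))

def one_plus_one_es(n=50, generations=2000, sigma=0.5, adapt=True, seed=0):
    # (1+1) evolution strategy with the one-fifth success rule:
    # widen the mutation step when mutations succeed often, shrink it when they rarely do.
    rng = np.random.default_rng(seed)
    parent = rng.standard_normal(n)
    best = fitness(parent)
    successes = 0
    for g in range(1, generations + 1):
        child = parent + sigma * rng.standard_normal(n)
        score = fitness(child)
        if score > best:
            parent, best = child, score
            successes += 1
        if adapt and g % 20 == 0:
            sigma *= 1.5 if successes / 20 > 0.2 else 0.8
            successes = 0
    return best

print("adaptive mutation rate:", one_plus_one_es(adapt=True))
print("fixed mutation rate:   ", one_plus_one_es(adapt=False))
```

On this toy objective the adaptive run typically gets much closer to the optimum than the fixed-rate run for the same number of fitness evaluations, which is the flavor of gain being gestured at.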
Overall, I would not be too surprised (certainly I would give it > 1%) if a clean theoretical understanding of evolution made this a tractable project today given sufficient motivation, which is quite a frightening prospect. The amount of effort that has gone into developing such an understanding, with an eye towards this kind of engineering project, also appears to be surprisingly small.
These are all good points, and I think you're understating the case on (at least) the third one. An evolutionary-algorithm (EA) search can afford to kill off most of its individuals for the greater good; biological evolution cannot tolerate a substantial hit to individual fitness for the sake of group fitness. An EA search can take reproduction for granted, tying it directly to performance on an intelligence test.
If AI were developed in this way, one upside might be that you would have a decent understanding of how smart it was at any given point (since that is how the AI was selected), so you would be unlikely to accidentally end up with a system that was much more capable than you expected. I'm not sure how probable that is in other cases, but people sometimes express concern about it.
If the simulation is running at computer speed, this would only be the case in its subjective time. By the time we notice that variation number 34 on backtracking and arc consistency actually speeds up evolution by the final 4 orders of magnitude, it may be too late to stop the process from creating simulants that are smarter than us and can achieve takeoff acceleration.
Keep in mind though that if this were to happen, then we'd be overwhelmingly likely to be living in a simulation already, for the reasons given in the Simulation Argument paper.
You could have a setup that automatically stopped once you reached a system with a certain level of tested intelligence. But perhaps I'm misunderstanding you.
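As a sketch of what such a tripwire might look like mechanically, the search loop below simply refuses to breed anything past a capability threshold; the `measured_intelligence` test, the cap value, and the selection rule are all hypothetical placeholders, and the safeguard is only as strong as the test itself.

```python
def evolve_with_cap(population, mutate, measured_intelligence, generations, cap=100.0):
    """Toy search loop that halts as soon as any candidate tests above the cap."""
    for _ in range(generations):
        candidates = [mutate(parent) for parent in population]
        scores = [measured_intelligence(c) for c in candidates]
        if max(scores) >= cap:
            # Halt before selecting or breeding anything past the threshold.
            return "halted at cap"
        # Otherwise keep the better-scoring half as the next generation's parents.
        ranked = sorted(zip(scores, range(len(candidates))), reverse=True)
        population = [candidates[i] for _, i in ranked[: max(1, len(candidates) // 2)]]
    return "finished"
```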
With respect to the simulation argument, would you mind reminding us of why we would be overwhelmingly likely to be in a simulation if we build AI which ‘achieves takeoff acceleration’, if it’s quick to summarize? I expect many people haven’t read the simulation argument paper.
Having a system that stops when you reach a level of tested intelligence sounds appealing, but I’d be afraid of the measure of intelligence at hand being too fuzzy.
So the system would not detect that it had achieved that level of intelligence, and it would bootstrap and take off in time to destroy the control system that was supposed to halt it. This would happen if we failed to solve any of many distinct problems that we don't know how to solve yet, like symbol grounding, analogical and metaphorical processing, and three more complex ones that I can think of but don't want to spend resources in this thread mentioning. The same goes for the simulation argument.
I agree with this. Brute-force search for AI did not seem like a relevant possibility to me prior to reading this chapter and this comment, and now it does.
One more thought/concern regarding the evolutionary approach:
Humans perform poorly when estimating the cost and duration of software projects, particularly as the size and complexity of the project grows. Recapitulating evolution is a large project, and so it wouldn’t be at all surprising if it ended up requiring more compute time and person-hours than expected, pushing out the timeline for success via this approach.
While humans do perform poorly estimating time to complete projects, I expect that only adds a factor of two or something, which is fairly small next to the many-orders-of-magnitude uncertainty around the cost.
I’m really curious as to where you’re getting the $500B number from. I felt like I didn’t understand this argument very well at all, and I’m wondering what sorts of results you’re imagining as a result of such a program.
It’s worth noting that 1E30-1E40 is only the cost of simulating the neurons; an estimate for the computational cost of simulating the fitness function is not given, although it is stated that the fitness function “is typically the most computationally expensive component”. So the evaluation of the fitness function (which presumably has to be complicated enough to accurately assess intelligence) isn’t even included in that estimate.
It’s also not clear to me, at least, that simulating neurons is capable of recapitulating the evolution of general intelligence. I don’t believe it is a property of individual neurons that causes the brain to be divided into two hemispheres. I don’t know anything about brains, but I’ve never heard of left neurons or right neurons. So is it the neurons that are supposed to be mutating, or some unstated variable that describes the organization of the various neurons? If the latter, then what is the computational cost associated with that superstructure?
I feel like “recapitulating evolution” is a poor term for this. It’s not clear that there’s a lot of overlap between this sort of massive genetic search and actual evolution. It’s not clear that computational cost is the limiting factor. Can we design a series of fitness functions capable of guiding a randomly evolving algorithm to some sort of general intelligence? For humans, it seems that the mixture of cooperation and competition with other equally intelligent humans resulted in some sort of intelligence arms race, but the evolutionary fitness function that led to humans, or to the human ancestors, isn’t really known. How do you select for an intelligent, human-like niche in your fitness function? What series of problems can you create that will allow general intelligence to triumph over specialized algorithms?
Will the simulated creatures be given time to learn before their fitness is evaluated? Will learning produce changes in neural structure? Is the genotype/phenotype distinction being preserved? I feel like it’s almost misleading to include numerical estimates for the computational cost of what is arguably the easiest part of this problem without addressing the far more difficult theoretical problem of devising a fitness landscape that has a reasonable chance to produce intelligence. I’m even more blown away by the idea that it would be possible to estimate a cash value to any degree of precision for such a program. I have literally no idea what the probability distribution of possible outcomes for such a program would be. I don’t even have a good estimate of the cost or the theory behind the inputs.
This seems like an information hazard, since it has the form: this estimate for process X, which may destroy the value of the future, seems too low; also, surprisingly few people are currently studying X.
If X is some genetically engineered variety of smallpox, it seems clear that mentioning those facts is hazardous.
If the world didn’t know about brain emulation, calling it an under-scrutinized area would be dangerous, relative to, say, just mentioning it to a few safety-savvy, x-risk-savvy neuroscientist friends who would go on to design safety protocols for it, as well as, if possible, slow down progress in the field. The same should be done in this case.
If the idea is obvious enough to AI researchers (evolutionary approaches are not uncommon—there are entire conferences dedicated to the sub-field), then avoiding discussion by Bostrom et al. doesn’t reduce the information hazard; it just silences the voices of the x-risk-savvy while evolutionary AI researchers march on, probably less aware of the risks of what they are doing than if the x-risk-savvy keep discussing it.
So, to the extent this idea is obvious / independently discoverable by AI researchers, this approach should not be taken in this case.