Intelligence Explosion analysis draft: Why designing digital intelligence gets easier over time
Again, I invite your feedback on this snippet from an intelligence explosion analysis Anna Salamon and I have been working on. This section is less complete than the others; missing text is indicated with brackets: [].
_____
Many predictions of human-level digital intelligence have been wrong.1 On the other hand, machines surpass human ability at new tasks with some regularity (Kurzweil 2005). For example, machines recently achieved superiority at visually identifying traffic signs at low resolution (Sermanet and LeCun 2011), diagnosing cardiovascular problems from some types of MRI scan images (Li et al. 2009), and playing Jeopardy! (Markoff 2011). Below, we consider several factors that, taken together, appear to increase the odds that we will develop digital intelligence as the century progresses.
More hardware. For at least four decades, computing power2 has increased exponentially, in accordance with Moore’s law.3 Experts disagree on how much longer Moore’s law will hold (e.g. Mack 2011; Lundstrom 2003), but if it holds for two more decades then we may have enough computing power to emulate human brains by 2029 (the shape of this extrapolation is sketched below).4 Even if Moore’s law fails to hold, our hardware should become much more powerful in the coming decades.5 More hardware doesn’t by itself give us digital intelligence, but it contributes to the development of digital intelligence in several ways:
Powerful hardware may improve performance simply by allowing existing “brute force” solutions to run faster (Moravec 1976). Where such solutions do not yet exist, researchers may have a stronger incentive to develop them quickly when there is abundant hardware to exploit. Cheap computing may enable much more extensive experimentation in algorithm design, tweaking parameters or using methods such as genetic algorithms. Indirectly, cheap computing may enable the production and processing of enormous datasets that improve AI performance (Halevy, Norvig, and Pereira 2009), or lead to an expansion of the information technology industry and the number of researchers in the field.6
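A minimal sketch of that Moore’s-law extrapolation may make its shape concrete. Every constant below is a placeholder assumption chosen for illustration (the starting capacity, the doubling time, and the per-level emulation requirements discussed in footnote 4), so the printed years are rough and will not exactly reproduce the 2019 and 2029 estimates cited there.

```python
# Rough sketch of a Moore's-law-style extrapolation. All constants are
# illustrative assumptions, not figures from the text.
import math

START_YEAR = 2011
START_FLOPS = 1e16         # assumed computing power available at START_YEAR
DOUBLING_TIME_YEARS = 1.5  # assumed doubling time for affordable computing

EMULATION_TARGETS = {      # assumed requirements, in FLOPS, by emulation level
    "spiking neural network": 1e18,
    "metabolites and neurotransmitters": 1e25,
}

def year_reached(target_flops):
    """Year at which the extrapolated curve first meets target_flops."""
    doublings_needed = math.log2(target_flops / START_FLOPS)
    return START_YEAR + doublings_needed * DOUBLING_TIME_YEARS

for level, flops in EMULATION_TARGETS.items():
    print(f"{level}: ~{year_reached(flops):.0f}")
```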
Massive datasets. The greatest leaps forward in speech recognition and translation software have come not from faster hardware or smarter hand-coded algorithms, but from access to massive datasets of human-transcribed and human-translated words (Halevy, Norvig, and Pereira 2009). [add sentence about how datasets are expected to increase massively, or have been increasing massively and trends are expected to continue] [Possibly a sentence about Watson or usefulness of data for AI]
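As a toy illustration of this point (invented for this draft, not Halevy, Norvig, and Pereira’s method or data), the sketch below keeps a trivial prediction rule fixed and only grows the training set; on a synthetic “language,” the same algorithm keeps improving as data accumulates.

```python
# Toy illustration: a fixed, very simple algorithm (predict the next word as
# the most frequently seen follower in training) improves as the training set
# grows, with no change to the algorithm. The "language" is synthetic.
import random
from collections import Counter, defaultdict

random.seed(0)

VOCAB = [f"w{i}" for i in range(50)]
PREFERRED = {w: random.choice(VOCAB) for w in VOCAB}

def sample_pair():
    w = random.choice(VOCAB)
    # 70% of the time the preferred follower appears, otherwise a random word.
    nxt = PREFERRED[w] if random.random() < 0.7 else random.choice(VOCAB)
    return w, nxt

def accuracy(n_train, n_test=2000):
    follows = defaultdict(Counter)
    for _ in range(n_train):
        w, nxt = sample_pair()
        follows[w][nxt] += 1
    correct = 0
    for _ in range(n_test):
        w, nxt = sample_pair()
        guess = follows[w].most_common(1)[0][0] if follows[w] else None
        correct += (guess == nxt)
    return correct / n_test

for n in [100, 1_000, 10_000, 100_000]:
    print(f"training pairs: {n:>7}  accuracy: {accuracy(n):.2f}")
```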
Better algorithms. Mathematical insights can reduce the computation time of a program by many orders of magnitude without additional hardware. For example, IBM’s Deep Blue played chess at the level of world champion Garry Kasparov in 1997 using about 1.5 trillion instructions per second (TIPS), but a program called Deep Junior did it in 2003 using only 0.015 TIPS. Thus, the computational efficiency of the chess algorithms increased by a factor of 100 in only six years, or about 3.33 orders of magnitude per decade (Richard and Shaw 2004). [add sentence about how this sort of improvement is not uncommon, with citations]
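The arithmetic behind those figures, written out as a short check (the TIPS numbers are the ones cited above):

```python
# Worked arithmetic for the Deep Blue / Deep Junior comparison cited above.
import math

deep_blue_tips = 1.5      # Deep Blue, 1997
deep_junior_tips = 0.015  # Deep Junior, 2003
years = 2003 - 1997

efficiency_gain = deep_blue_tips / deep_junior_tips   # 100x less computation
orders_of_magnitude = math.log10(efficiency_gain)     # 2.0 orders
orders_per_decade = orders_of_magnitude / years * 10  # ~3.33 per decade

print(f"{efficiency_gain:.0f}x less computation over {years} years")
print(f"= {orders_of_magnitude:.1f} orders of magnitude, "
      f"about {orders_per_decade:.2f} per decade")
```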
Progress in neuroscience. [neuroscientists have figured out brain algorithms X, Y, and Z that are related to intelligence.] New insights into how the brain achieves human-level intelligence can inform our attempts to build human-level intelligence with silicon (van der Velde 2010; Koene 2011).
Accelerated science. A growing world economy will mean that more researchers at well-funded universities are available to do research relevant to digital intelligence. The world’s scientific output (in publications) grew by a third from 2002 to 2007 alone, much of this driven by the rapid growth of scientific output in developing nations like China and India (Smith 2011). New tools can accelerate particular fields, just as fMRI accelerated neuroscience in the 1990s. Finally, the effectiveness of scientists themselves can potentially be increased with cognitive enhancement drugs (Sandberg and Bostrom 2009) and brain-computer interfaces that allow direct neural access to large databases (Groß 2009). Better collaboration tools like blogs and Google Scholar are already yielding results (Nielsen 2011).
Automated science. Early attempts at automated science — e.g., using data mining algorithms to make discoveries from existing data (Szalay and Gray 2006), or having a machine with no physics knowledge correctly infer natural laws from motion-tracking data (Schmidt and Lipson 2009) — were limited by the slowest part of the process: the human in the loop. Recently, the first “closed-loop” robot scientist successfully devised its own hypotheses (about yeast genomics), conducted experiments to test those hypotheses, assessed the results, and made novel scientific discoveries, all without human intervention (King et al. 2009). Current closed-loop robot scientists can only work on a narrow set of scientific problems, but future advances may allow for scalable, automated scientific discovery (Sparkes et al. 2010).
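The “closed loop” structure can be made concrete with a schematic sketch. This is not King et al.’s system; the domain, hypotheses, and “experiments” below are invented stand-ins. The point is the control flow: propose hypotheses, run experiments, evaluate the results, and repeat without a human in the loop.

```python
# Schematic "closed-loop" discovery cycle; invented stand-in, not King et al. 2009.
import random

random.seed(1)

TRUE_COEFFICIENT = 3  # hidden "law" the robot scientist is trying to find

def run_experiment(x):
    """Stand-in for a wet-lab experiment: a noisy observation of the system."""
    return TRUE_COEFFICIENT * x + random.gauss(0, 0.5)

def candidate_hypotheses():
    """Stand-in for hypothesis generation: candidate coefficients to consider."""
    return range(1, 6)

def error(hypothesis, observations):
    """Mean squared error of a hypothesis against the data gathered so far."""
    return sum((y - hypothesis * x) ** 2 for x, y in observations) / len(observations)

observations = []
for cycle in range(1, 6):          # the loop runs without human intervention
    x = cycle                      # choose the next experiment to run
    observations.append((x, run_experiment(x)))
    best = min(candidate_hypotheses(), key=lambda h: error(h, observations))
    print(f"cycle {cycle}: best hypothesis so far -> coefficient {best}")
```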
Embryo selection for better scientists. At age 8, Terence Tao scored 760 on the math SAT, one of only [2?3?] children ever to do this at such an age; he later went on to [have a lot of impact on math]. Studies of similar kids convince researchers that there is a large “aptitude” component to mathematical achievement, even at the high end.7 How rapidly would mathematics or AI progress if we could create hundreds of thousands of Terence Taos? This is a serious question because the creation of large numbers of exceptional scientists is an engineering project that we know in principle how to do. The plummeting costs of genetic sequencing [expected to go below AMOUNT per genome by SOONYEAR e.g. 2015] will soon make it feasible to compare the characteristics of an entire population of adults with those adults’ full genomes, and, thereby, to unravel the heritable components of intelligence, diligence, and other contributors to scientific achievement. To make large numbers of babies with scientific abilities near the top of the current human range8 would then require only the ability to combine known alleles onto a single genome; procedures that can do this have already been developed for mice. China, at least, appears interested in this prospect.9
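A minimal simulation of the logic in this paragraph, under loudly simplified assumptions: a purely additive genetic model with invented numbers, per-site effects estimated from a genotyped population (a crude stand-in for a genome-wide association study), and a hypothetical genome combining the apparently beneficial variant at every site. Real traits are far messier, but the sketch shows why combining known alleles could in principle push a trait beyond what is observed in any existing person.

```python
# Toy additive model of a polygenic trait. All numbers are invented assumptions.
import random

random.seed(2)

N_SITES = 100      # assumed number of trait-relevant genetic sites
N_PEOPLE = 5_000   # assumed size of the genotyped adult population

# True (unknown to the researcher) additive effect of the variant at each site.
true_effects = [random.gauss(0, 1) for _ in range(N_SITES)]

def random_genome():
    return [random.randint(0, 1) for _ in range(N_SITES)]

def trait(genome):
    """Additive model: sum of carried variants' effects plus non-genetic noise."""
    genetic = sum(g * e for g, e in zip(genome, true_effects))
    return genetic + random.gauss(0, 5)

population = [random_genome() for _ in range(N_PEOPLE)]
scores = [trait(g) for g in population]

# "Unravel the heritable components": estimate each site's effect by comparing
# average trait values of carriers vs. non-carriers (crude GWAS stand-in).
estimated_effects = []
for site in range(N_SITES):
    carriers = [s for g, s in zip(population, scores) if g[site] == 1]
    non_carriers = [s for g, s in zip(population, scores) if g[site] == 0]
    estimated_effects.append(
        sum(carriers) / len(carriers) - sum(non_carriers) / len(non_carriers)
    )

# "Combine known alleles onto a single genome": take the apparently beneficial
# variant at every site, then evaluate that genome under the true model.
designed_genome = [1 if e > 0 else 0 for e in estimated_effects]
print(f"best person observed in the population: {max(scores):.1f}")
print(f"genome combining estimated-good variants: {trait(designed_genome):.1f}")
```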
It isn’t clear which of these factors will ease progress toward digital intelligence, but it seems likely that — across a broad range of scenarios — some of these inputs will do so.
____
1 For example, Simon (1965, 96) predicted that “machines will be capable, within twenty years, of doing any work a man can do.”
2 The technical measure predicted by Moore’s law is the density of components on an integrated circuit, but this is closely tied to affordable computing power.
3 For important qualifications, see Nagy et al. (2010); Mack (2011).
4 This calculation depends on the “level of emulation” expected to be necessary for successful whole brain emulation (WBE). Sandberg and Bostrom (2008) report that attendees at a workshop on WBE tended to expect that emulation at the level of the brain’s spiking neural network, perhaps including membrane states and concentrations of metabolites and neurotransmitters, would be required for successful WBE. They estimate that if Moore’s law continues, we will have the computational capacity to emulate a human brain at the level of its spiking neural network by 2019, or at the level of metabolites and neurotransmitters by 2029.
5 Quantum computing may also emerge during this period. Early worries that quantum computing may not be feasible have been overcome, but it is hard to predict whether quantum computing will contribute significantly to the development of digital intelligence because progress in quantum computing depends heavily on unpredictable insights in quantum algorithms (Rieffel and Polak 2011).
6 Shulman and Sandberg (2010).
7 [Benbow etc. on study of exceptional talent; genetics of g; genetics of conscientiousness and openness, pref. w/ any data linking conscientiousness or openness to scientific achievement. Try to frame in a way that highlights hard work type variables, so as to alienate people less.]
8 [folks with very top scientific achievement likely had lucky circumstances as well as initial gifts (so that, say, new kids with Einstein’s genome would be expected to average perhaps .8 times as exceptional). However, one could probably identify genomes better than Einstein’s, both because these technologies would let genomes be combined that had unheard of, vastly statistically unlikely amounts of luck, and because e.g. there are likely genomes out there that are substantially better than Einstein (but on folks who had worse environmental luck).]
9 [find source]
_____
All references, including the ones used above:
Bainbridge 2006 managing nano-bio-info-cogno innovations
Baum Goertzel Goertzel 2011 how long until human-level ai
Bostrom 2003 ethical issues in advanced artificial intelligence
Legg 2008 machine super intelligence
Caplan 2008 the totalitarian threat
Sandberg & Bostrom 2011 machine intelligence survey
Chalmers 2010 singularity philosophical analysis
Turing 1950 machine intelligence
Good 1965 speculations concerning...
Von Neumann 1966 theory of self-reproducing automata
Solomonoff 1985 the time scale of artificial intelligence
Vinge 1993 coming technological singularity
Yudkowsky 2001 creating friendly ai
Yudkowsky 2008a negative and positive factor in global risk
Yudkowsky 2008b cognitive biases potentially affecting
Russell Norvig 2010 artificial intelligence a modern approach 3e
Nordmann 2007 If and then: a critique of speculative nanoethics
Moore and Healy the trouble with overconfidence
Tversky Kahneman 2002 extensional versus intuitive reasoning, the conjunction fallacy
Nickerson 1998 Confirmation Bias; A Ubiquitous Phenomenon in Many Guises
Dreyfus 1972 what computers can’t do
Rhodes 1995 making of the atomic bomb
Arrhenius 1896 On the Influence of Carbonic Acid in the Air Upon the Temperature
Crawford 1997 Arrhenius’ 1896 model of the greenhouse effect in context
Rasmussen 1975 WASH-1400 report
McGrayne 2011 theory that would not die
Lundstrom 2003 Enhanced: Moore’s law forever?
Tversky and Kahneman 1974 Judgment under uncertainty: Heuristics and biases
Horgan 1997 end of science
Sutton and Barto 1998 reinforcement learning
Hutter 2004 universal ai
Schmidhuber 2007 godel machines
Dewey 2011 learning what to value
Simon 1965 The Shape of Automation for Men and Management
Marcus 2008 kluge
Sandberg Bostrom 2008 whole brain emulation
Kurzweil 2005 singularity is near
Sermanet LeCun 2011 traffic sign recognition with multi-scale convolutional networks
Li et al. 2009 optimizing a medical image analysis system using
Markoff 2011 watson trivial it’s not
Smith 2011 Knowledge networks and nations
Sandberg Bostrom 2009 cognitive enhancement regulatory issues
Groß 2009 Blessing or Curse? Neurocognitive Enhancement by “Brain Engineering”
Williams 2011 prediction markets theory and applications
Nielsen 2011 reinventing discovery
Tetlock 2005 expert judgment
Green & Armstrong 2007 The Ombudsman: Value of Expertise for Forecasting
Weinberg et al. 2010 philosophers expert intuiters
Szalay and Gray 2006 science in an exponential world
Schmidt Lipson 2009 distilling free-form natural laws from experimental data
King et al. 2009 the automation of science
Sparkes et al. 2010 Towards Robot Scientists for autonomous scientific discovery
Stanovich 2010 rationality and the reflective mind
Lilienfeld, Ammirati, and Landfield 2009 giving debiasing away
Lipman 1983 Thinking Skills Fostered by Philosophy for Children
Fong et al 1986 The effects of statistical training on thinking about everyday problems
Schoemaker 1979 The role of statistical knowledge in gambling decisions
Larrick 2004 debiasing
Gordon 2007 reasoning about the future of nanotechnology
Landeta 2006 Current validity of the delphi method in social sciences
Maddison 2001 the world economy a millennial perspective
Niparko 2009 cochlear implants principles and practices
Bostrom 2002 existential risks
Joyce 2007 moral anti-realism stanford encyclopedia of philosophy
Portmore 2011 commonsense consequentialism
Martin 1971 brief proposal on immortality
Bostrom Cirkovic 2008 global catastrophic risks
National Academy of Sciences 2010 persistent forecasting of disruptive technologies
Donohoe and Needham 2009 Moving best practice forward, Delphi characteristics
Gordon 1994 the delphi method
Green, Armstrong, and Graefe 2007 Methods to Elicit Forecasts from Groups
Woudenberg 1991 an evaluation of delphi
Armstrong 2006 Findings from evidence-based forecasting
Armstrong 1985 Long-Range Forecasting: From Crystal Ball to Computer, 2nd edition
Anderson and Anderson-Parente 2011 A case study of long-term Delphi accuracy
Bixby 2002 Solving real-world linear programs: A decade and more of progress
Fox 2011 the limits of intelligence
Friedman 1953 The Methodology of Positive Economics
Schneider 2010 homo economicus, or more like Homer Simpson
Cartwright 2011 behavioral economics
Bacon and Van Dam 2010 recent progress in quantum algorithms
Rieffel Polak 2011 quantum computing a gentle introduction
Mack 2011 fifty years of moore’s law
Nagy et al. 2010 testing laws of technological progress
Shulman Sandberg 2010 implications of a software-limited singularity
Moravec 1976 The Role of raw power in intelligence
Alberth 2008 forecasting technology costs via the experience curve
Omohundro 2007 the nature of self-improving AI
Kurzban 2011 why everyone (else) is a hypocrite: evolution and the modular mind
Richard Shaw 2004 chips architectures and algorithms
Yudkowsky 2010 timeless decision theory
De Blanc Ontological Crises in Artificial Agents’ Value Systems
Halevy, Norvig, and Pereira 2009 the unreasonable effectiveness of data
Ramachandran 2011 the tell-tale brain
van der Velde 2010 Where Artificial Intelligence and Neuroscience Meet
Koene 2011 AGI and neuroscience: Open sourcing the brain (in AGI-11 proceedings)
Lichtenstein, Fischhoff, and Phillips 1982 calibration of probabilities the state of the art to 1980
Griffin and Tversky 1992 The weighing of evidence and the determinants of confidence
Yates, Lee, Sieck, Choi, Price 2002 Probability judgment across cultures
Murphy and Winkler 1984 probability forecasting in meteorology
Grove and Meehl 1996 Comparative Efficiency of Informal...
Grove et al. 2000 Clinical versus mechanical prediction: A meta-analysis
Kandel et al. 2000 principles of neural science, 4th edition
Shulman 2010 Omohundro’s “Basic AI Drives” and Catastrophic Risks
Friedman 1993 Problems of Coordination in Economic Activity
Cooke 1991 experts on uncertainty
Yampolskiy forthcoming Leakproofing the Singularity
Lampson 1973 a note on the confinement problem
Schaeffer 1997 one jump ahead
Dolan and Sharot 2011 Neuroscience of preference and choice