Abstract: If human-level AI is eventually created, it may have unprecedented positive or negative
consequences for humanity. It is therefore worth constructing the best possible forecasts
of policy-relevant aspects of AI development trajectories—even though, at this early
stage, the unknowns must remain very large.
We propose that one factor that can usefully constrain models of AI development is
the “intelligibility of intelligence”—the extent to which efficient algorithms for general
intelligence follow from simple general principles, as in physics, as opposed to necessarily
being a conglomerate of special-case solutions. Specifically, we argue that estimates
of the “intelligibility of intelligence” bear on:
- Whether human-level AI will come about through a conceptual breakthrough, rather than through either the gradual addition of hardware or the gradual accumulation of special-case hacks;
- Whether the invention of human-level AI will, therefore, come without much warning;
- Whether, if AI progress comes through neuroscience, neuroscientific knowledge will enable brain-inspired human-level intelligences (as researchers “see why the brain works”) before it enables whole brain emulation;
- Whether superhuman AIs, once created, will have a large advantage over humans in designing still more powerful AI algorithms;
- Whether AI intelligence growth may therefore be rather sudden past the human level; and
- Whether it may be humanly possible to understand intelligence well enough, and to design powerful AI architectures that are sufficiently transparent, to create demonstrably safe AIs far above the human level.
The intelligibility of intelligence thus provides a means of constraining long-term AI
forecasting by suggesting relationships between several unknowns in AI development
trajectories. Moreover, we can improve our estimates of the intelligibility of intelligence,
e.g., by examining the evolution of animal intelligence and the track record of AI research
to date.
Related paper: How Intelligible is Intelligence?