Superintelligence 5: Forms of Superintelligence
This is part of a weekly reading group on Nick Bostrom’s book, Superintelligence. For more information about the group, and an index of posts so far see the announcement post. For the schedule of future topics, see MIRI’s reading guide.
Welcome. This week we discuss the fifth section in the reading guide: Forms of superintelligence. This corresponds to Chapter 3, on different ways in which an intelligence can be super.
This post summarizes the section, and offers a few relevant notes, and ideas for further investigation. Some of my own thoughts and questions for discussion are in the comments.
There is no need to proceed in order through this post, or to look at everything. Feel free to jump straight to the discussion. Where applicable and I remember, page numbers indicate the rough part of the chapter that is most related (not necessarily that the chapter is being cited for the specific claim).
Reading: Chapter 3 (p52-61)
Summary
A speed superintelligence could do what a human does, but faster. This would make the outside world seem very slow to it. It might cope with this partially by being very tiny, or virtual. (p53)
A collective superintelligence is composed of smaller intellects, interacting in some way. It is especially good at tasks that can be broken into parts and completed in parallel. It can be improved by adding more smaller intellects, or by organizing them better. (p54)
A quality superintelligence can carry out intellectual tasks that humans just can’t in practice, without necessarily being better or faster at the things humans can do. This can be understood by analogy with the difference between other animals and humans, or the difference between humans with and without certain cognitive capabilities. (p56-7)
These different kinds of superintelligence are especially good at different kinds of tasks. We might say they have different ‘direct reach’. Ultimately they could all lead to one another, so can indirectly carry out the same tasks. We might say their ‘indirect reach’ is the same. (p58-9)
We don’t know how smart it is possible for a biological or a synthetic intelligence to be. Nonetheless we can be confident that synthetic entities can be much more intelligent than biological entities.
Digital intelligences would have better hardware: they would be made of components ten million times faster than neurons; the components could communicate about two million times faster than neurons can; they could use many more components while our brains are constrained to our skulls; it looks like better memory should be feasible; and they could be built to be more reliable, long-lasting, flexible, and well suited to their environment.
Digital intelligences would have better software: they could be cheaply and non-destructively ‘edited’; they could be duplicated arbitrarily; they could have well aligned goals as a result of this duplication; they could share memories (at least for some forms of AI); and they could have powerful dedicated software (like our vision system) for domains where we have to rely on slow general reasoning.
Notes
This chapter is about different kinds of superintelligent entities that could exist. I like to think about the closely related question, ‘what kinds of better can intelligence be?’ You can be a better baker if you can bake a cake faster, or bake more cakes, or bake better cakes. Similarly, a system can become more intelligent if it can do the same intelligent things faster, or if it does things that are qualitatively more intelligent. (Collective intelligence seems somewhat different, in that it appears to be a means to be faster or able to do better things, though it may have benefits in dimensions I’m not thinking of.) I think the chapter is getting at different ways intelligence can be better rather than ‘forms’ in general, which might vary on many other dimensions (e.g. emulation vs AI, goal directed vs. reflexive, nice vs. nasty).
Some of the hardware and software advantages mentioned would be pretty transformative on their own. If you haven’t before, consider taking a moment to think about what the world would be like if people could be cheaply and perfectly replicated, with their skills intact. Or if people could live arbitrarily long by replacing worn components.
The main differences between increasing intelligence of a system via speed and via collectiveness seem to be: (1) the ‘collective’ route requires that you can break up the task into parallelizable subtasks, (2) it generally has larger costs from communication between those subparts, and (3) it can’t produce a single unit as fast as a comparable ‘speed-based’ system. This suggests that anything a collective intelligence can do, a comparable speed intelligence can do at least as well. One counterexample to this I can think of is that often groups include people with a diversity of knowledge and approaches, and so the group can do a lot more productive thinking than a single person could. It seems wrong to count this as a virtue of collective intelligence in general however, since you could also have a single fast system with varied approaches at different times.
For each task, we can think of curves for how performance increases as we increase intelligence in these different ways. For instance, take the task of finding a fact on the internet quickly. It seems to me that a person who ran at 10x speed would get the figure 10x faster. Ten times as many people working in parallel would do it only a bit faster than one, depending on the variance of their individual performance, and whether they found some clever way to complement each other. It’s not obvious how to multiply qualitative intelligence by a particular factor, especially as there are different ways to improve the quality of a system. It also seems non-obvious to me how search speed would scale with a particular measure such as IQ.
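To make the comparison concrete, here is a toy simulation of the two routes, with completion times drawn from a lognormal distribution purely as an illustrative assumption (nothing here is measured):

```python
import random

# Toy comparison of the two scaling routes discussed above: one searcher sped
# up 10x versus ten ordinary searchers racing in parallel (the first to find
# the fact wins). The lognormal completion times are an arbitrary assumption.
random.seed(0)
TRIALS = 10_000

def one_search_time():
    return random.lognormvariate(0, 0.5)   # baseline time for one person, arbitrary units

baseline    = [one_search_time() for _ in range(TRIALS)]
speed_10x   = [one_search_time() / 10 for _ in range(TRIALS)]
parallel_10 = [min(one_search_time() for _ in range(10)) for _ in range(TRIALS)]

avg = lambda xs: sum(xs) / len(xs)
print(f"baseline: {avg(baseline):.2f}   10x speed: {avg(speed_10x):.2f}   "
      f"10 in parallel: {avg(parallel_10):.2f}")
```

Under these made-up numbers the sped-up searcher finishes a full ten times faster, while the parallel team mostly just gets the benefit of its luckiest member.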
How much more intelligent do human systems get as we add more humans? I can’t find much of an answer, but people have investigated the effect of things like team size, city size, and scientific collaboration on various measures of productivity.
The things we might think of as collective intelligences—e.g. companies, governments, academic fields—seem notable to me for being slow-moving, relative to their components. If someone were to steal some chewing gum from Target, Target can respond in the sense that an employee can try to stop them. And this is no slower than an individual human acting to stop their chewing gum from being taken. However it also doesn’t involve any extra problem-solving from the organization—to the extent that the organization’s intelligence goes into the issue, it has to have already done the thinking ahead of time. Target was probably much smarter than an individual human about setting up the procedures and the incentives to have a person there ready to respond quickly and effectively, but that might have happened over months or years.
In-depth investigations
If you are particularly interested in these topics, and want to do further research, these are a few plausible directions, some inspired by Luke Muehlhauser’s list, which contains many suggestions related to parts of Superintelligence. These projects could be attempted at various levels of depth.
Produce improved measures of (substrate-independent) general intelligence. Build on the ideas of Legg, Yudkowsky, Goertzel, Hernandez-Orallo & Dowe, etc. Differentiate intelligence quality from speed.
List some feasible but non-realized cognitive talents for humans, and explore what could be achieved if they were given to some humans.
List and examine some types of problems better solved by a speed superintelligence than by a collective superintelligence, and vice versa. Also, what are the returns on “more brains applied to the problem” (collective intelligence) for various problems? If there were merely a huge number of human-level agents added to the economy, how much would it speed up economic growth, technological progress, or other relevant metrics? If there were a large number of researchers added to the field of AI, how would it change progress?
How does intelligence quality improve performance on economically relevant tasks?
How to proceed
This has been a collection of notes on the chapter. The most important part of the reading group though is discussion, which is in the comments section. I pose some questions for you there, and I invite you to add your own. Please remember that this group contains a variety of levels of expertise: if a line of discussion seems too basic or too incomprehensible, look around for one that suits you better!
Next week, we will talk about ‘intelligence explosion kinetics’, a topic at the center of much contemporary debate over the arrival of machine intelligence. To prepare, read Chapter 4, The kinetics of an intelligence explosion (p62-77). The discussion will go live at 6pm Pacific time next Monday 20 October. Sign up to be notified here.
Bostrom flies by an issue that’s very important:
Back up. The population of Europe was under 200 million in 1700, less than a sixth of what it is today. The number of intellectuals was a tiny fraction of the number it is today. And the number of intellectuals in Athens in the 4th century BC was probably a few hundred. Yet we had Newton and Aristotle. Similarly, the greatest composers of the 18th and 19th century were trained in Vienna, one city. Today we may have 1000 or 10,000 times as many composers, with much better musical training than people could have in the days before recorded music, yet we do not have 1000 Mozarts or 1000 Beethovens.
Unless you believe human intelligence has been steadily declining, there is one Einstein per generation, regardless of population. The limiting factor is not the number of geniuses. The number of geniuses, and the amount of effort put into science, is nearly irrelevant to the amount of genius-level work accomplished and disseminated.
The limiting factor is organizational. Scientific activity can scale; recognition or propagation of it doesn’t. If you graphed scientific output over the years in terms of “important things discovered and adopted by the community” / (scientists * dollars per scientist), you’d see an astonishing exponential decay toward zero. I measured science and technology output per scientist using four different lists of significant advances, and found that significant advances per scientist declined by 3 to 4 orders of magnitude from 1800 to 2000. During that time, the number of scientific journals has increased by 3 to 4 orders of magnitude, and a reasonable guess is that so did the number of scientists. Total recognized “significant” scientific output is independent of the number of scientists working!
You can’t just add scientists and money and get anything like proportional output. The scientific community can’t absorb or even be aware of most of the information produced. Nor can it allocate funds or research areas efficiently.
So a critical question when thinking about super-intelligences is, How does the efficiency of intelligence scale with resources? Not linearly. To a first approximation, adding more scientists at this point accomplishes nothing.
On the other hand, merely recognizing and solving the organizational problems of science that we currently have would produce results similar to a fast singularity.
While failure to recognize & propagate new scientific discoveries probably explains some of our apparent deficit of current scientific geniuses, I think a bigger factor is just that earlier scientists ate the low-hanging fruit.
(I have no idea whether a similar effect would kick in for superintelligences and throttle them.)
An upvote is inadequate to express the degree of my agreement with this statement.
When I point out the low-hanging fruit effect to LWers, I do usually get a lot of agreement (and it is appreciated!) but I am starting to wish that someone would dig up some strong contrary evidence.
When the topic of apparent genius deficits and scientific stagnation comes up, people often present multiple explanations, like
- intrinsic difficulties in scaling scientific activity
- failure to identify/recognize contemporary scientific successes
- no more low-hanging fruit
- bureaucratization and institutional degradation
but tend to present only anecdotal evidence for each — myself included. And I’m not sure that can be helped; I don’t know of readily available evidence which powerfully discriminates between the different explanations.
PhilGoetz has data on scientific & technological progress, but I get the impression that much of it’s basically time series of counts of inventions & discoveries, which would establish only the whats and not the whys. Likewise, I think I could substantiate my January comment that cohort explains a substantial part of the variation in scientific eminence. But even if I scraped together the data, ran the big regression, and found that birth year accounted for (say) 30% of the variance in eminence, that wouldn’t refute any of the potential explanations for why cohort correlated with eminence.
A partisan of the scaling hypothesis might say, “Obviously, as science gets bigger over time, it gets less efficient; more recently born scientists just lost the birth year draw”.
Someone arguing that scientific stagnation is illusory might say, “Obviously, this is a side effect of overlooking more recent scientific geniuses; scientists are working as effectively as before but we don’t recognize that thanks to increasing specialization, or our own complacency, or the difficulty of picking out individual drops from the flood of brilliance, or the fact that we only recognize greatness decades after the fact”.
I would say, if I were the kind of person who threw the word “obviously” around willy-nilly, “How many times do you expect general relativity to be invented? Obviously, there are only so many simple but important problems to work on, and when we turn to much harder problems, we make slower and more incremental progress”.
Someone most concerned with institutional degradation might say, “Obviously, as science has become more bureaucratic and centralized, that’s rendered it more careerist, risk-averse & narrow-minded and less ambitious, so of course later generations of scientists would end up being less eminent, because they’re not tackling big scientific questions like they did before”.
And we don’t get anywhere because each explanation is broadly consistent with the observed facts, and each seems obvious to someone.
I would put forth three lines of argument that might help.
First, what we consider a significant development is judged relative to its context. So we naturally end up picking out the top-level entities and not the second-layer entities, let alone the third, fourth, fifth… Modern science may have the same number of top-level discoveries, but these are underpinned by many more layers of discovery than earlier ones were.
Second, let’s stop thinking about the jump from Aristotle to modern science for a minute. Let’s think about the jump from Novoselov and Geim’s discovery of graphene to today.
In their first paper, they made graphene, put it on a substrate, hooked a few wires up to it, and did low-temperature transport measurements. Worth a Nobel Prize. Outside of the insight that led to it, pretty simple. Not everyone could do it, but many could.
In the following years, a bunch of progressively trickier experiments were performed.
As of two years ago, our clearest path to a publishable research paper in this area was to make an enormous pristine sheet of graphene, position a layer of boron nitride on top of it, position another layer of graphene on top of that in such a way that it didn’t short to the first piece, place a bunch of wires in very specific locations on this sandwich, then destroy the substrate that it was all sitting on, all done so cleanly that it was smooth on every surface. This was insanely hard. It is also only a little trickier than normal for experiments in the field these days.
The low-hanging fruit has been taken, here. And it’s not simply that other people took it and we’re looking at sour grapes. I took some of that low-hanging fruit. There was a simple experiment to do, I did it, published it, and now I cannot do an experiment that simple again in this sub-field. My next experiment was substantially more complicated. The next experiment after that was far more complicated still.
Yes, we stand on the shoulders of giants, but we are in a progressively rarer atmosphere.
Third, and most critically, let’s look at the predictions of the ‘organizational inefficiency’ theory. If we were to scale back our scientific establishment to 1700s levels, do you think we’d maintain our current level of scientific progress? That seems to be the implication here, and it seems VERY dubious to me.
Funding agencies these days fund people who get PhDs.
To get a PhD, 90% of the time you need to generate a meaningful result of some kind within a limited time horizon. Scientists who go big and create nothing do not graduate and do not get funded later.
What some of them do learn to do is to manage several projects at the same time. They diversify, working on some big ideas which may fail, but insuring a steady stream of results by also working through some lesser issues with a higher probability of success.
It’s true that you can have a career stringing together nothing but small wins, but contrary to popular belief, the funding agencies (who rely on PhD scientific peer review committees) do fund many “high-risk, high-reward” projects.
In the private sector, for example, drug discovery projects have a vast failure rate but are funded nonetheless. Science funders do understand the biases toward career safety and are trying (imperfectly) to adjust for them.
I’d love to see that data & analysis! Did you post it somewhere? Can you email it to me at gmail?
I think there was a LW post years ago saying that the word “obviously” is only used to cover up the fact that something isn’t obvious, and I agree with that more every year.
The evidence against the low-hanging fruit idea is that it explains only fame distribution across time, while the “attention and accretion model”, which says that people gain fame in proportion to the fame they already have, and total fame in a field is constant, explains fame distribution at any given moment as well as across time. If you use “attention and accretion” to explain fame distribution in the present, you will end up also explaining its distribution across time, not leaving very much for low-hanging fruit to explain.
Of course it is possible that low-hanging fruit is a strong factor, being cancelled out by some opposing strong factor such as better knowledge and tools. In fact, I think an economic-style argument might say that people work on the highest-return problems until productivity drops below C, then work on tools until it rises just above C, then work on problems, etc. So we should expect rate of return on worked-on problems to be fairly constant over time.
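For concreteness, here is a minimal sketch of that threshold model (the problem pool, the payoffs, and the 5% tool improvements are all made-up assumptions). It behaves as described: after the easiest problems are gone, realized returns hover just above the cutoff C.

```python
import random

# Toy simulation of the threshold model sketched above: work on the best
# remaining problem while its return (boosted by current tools) clears C,
# otherwise invest in tools. All numbers are illustrative assumptions.
random.seed(0)

C = 3.0                                             # minimum acceptable rate of return
problems = sorted((random.uniform(0.5, 5.0) for _ in range(200)), reverse=True)
tool_quality = 1.0                                  # multiplier from better instruments and methods
realized_returns = []

for step in range(400):
    if problems and problems[0] * tool_quality >= C:
        # Work on the best remaining problem at the current tool level.
        realized_returns.append(problems.pop(0) * tool_quality)
    else:
        # Returns dipped below C: improve tools until research is worthwhile again.
        tool_quality *= 1.05

late = realized_returns[100:]                       # after the easiest problems are gone
print(f"early mean return: {sum(realized_returns[:50]) / 50:.2f}")
print(f"later mean return: {sum(late) / len(late):.2f} (hovers just above C = {C})")
```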
I’m talking about a hypothetical analysis there. I haven’t actually collected the data and put it through the grinder (at least not yet)!
Yeah, I’m trying to install mental klaxons that go off when I unreflectively write (or read) “obvious” or “obviously”.
That’s a fascinating result (although I’d wait for more details about the data & models involved before allocating the bulk of my probability mass to it). Does that mean our perception of fewer geniuses nowadays is merely because older geniuses grabbed most of the fame and left less of it for later geniuses? That’s how it sounds to me but I may be over-interpreting.
Do we perceive there are fewer geniuses nowadays? I think we tend to pick the one thing somebody did in each generation or decade that seems most impressive, and call whoever did it an Einstein, with no idea how hard or easy it really was.
For instance, some people called Watson and Crick the great geniuses of the generation after Einstein, for figuring out the structure of DNA. Yet Watson and Crick were racing people all over the world to find the structure, because they knew anybody with the right tools would be able to figure it out within a few months. It required only basic competence.
(What’s especially interesting about that case is that Watson and Crick both did things that showed genius after they were hailed as geniuses, given genius-level funding and freedom, and expected to do genius things. Were they geniuses all along (a low prior), did they develop genius in response to more-challenging conditions, or is funding and freedom more important than genius?)
Today, we’ve got genuine genius entrepreneurs like Sergey & Larry, Peter Thiel, and Elon Musk, yet the public thinks the great genius of that generation was Steve Jobs. Possibly because Apple spent many (dozen? hundred?) millions of dollars advertising Steve Jobs. Peter Thiel was never on a billboard.
My gut says there are fewer geniuses nowadays, although I don’t really trust it on this one.
As for guts that aren’t mine...Bruce G. Charlton. Gideon Rachman. Dean Keith Simonton, although he simultaneously argues that modern first-rate scientists, “[i]f anything”, need “more raw brains”. Cosma Shalizi, who I think is being serious there, not just florid.
I think there are certainly people who do that. There are people (not sure I can name any, but I’m sure they exist...Ray Kurzweil, maybe?) who are relentlessly upbeat about the march of scientific genius & progress, and people who just like jumping on hype bandwagons. There are also people with gloomier outlooks.
I don’t intuitively think of “genius entrepreneurs” as a natural category...
That advertising (and similar hype) influences whom people think of as geniuses is a good point.
This seems an important issue to me.
Those places were selected for having Newton and Aristotle though.
What leads you to be confident that these are the bottlenecks?
Interesting. Is your research up online?
You mean, we would have a lot more effective research, quickly? Or something more specific?
One important piece of data is the distribution of citations within fields. There have been many studies of this. What you find, generally, is that a field of study has a finite amount of attention available—if it has N researchers, they collectively perform cN paper-readings per year. The distribution of these paper-readings is a power law (a Zipf distribution), so that the number of researchers whose papers get read much grows much more slowly than N. No model based on the expected distribution of the merits of the papers or the scientists makes sense, particularly given how the Zipf distribution changes with the size of the field. The models that make sense are models that say that the odds of somebody reading a paper by person X are proportional to the odds that someone else cited X. That is, if you break down your model of citation distribution into a component to model randomly trawling the literature for citations, and a component to model quality of the papers, you find the random model explains nearly 100% of the data.
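A minimal preferential-attachment sketch of that kind of model (purely illustrative, not the actual studies referred to) shows how “attention breeds attention” alone produces a heavy-tailed citation distribution:

```python
import random
from collections import Counter

# Each new paper picks references in proportion to how often earlier papers
# have already been cited (plus a small baseline so new papers can be found
# at all). Parameters are arbitrary assumptions.
random.seed(0)
N_PAPERS = 20_000
REFS_PER_PAPER = 3

# `endpoints` holds one entry per citation received plus one per paper,
# so a uniform draw from it is proportional to (citations + 1).
endpoints = [0]
citations = Counter()

for new_paper in range(1, N_PAPERS):
    for _ in range(REFS_PER_PAPER):
        target = random.choice(endpoints)   # picked in proportion to existing attention
        citations[target] += 1
        endpoints.append(target)
    endpoints.append(new_paper)             # the new paper enters with baseline visibility

top = citations.most_common(5)
total = sum(citations.values())
print("share of all citations captured by the top 5 papers:",
      round(sum(c for _, c in top) / total, 3))
```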
No, but check your email.
If we achieved a linear relationship between input and output, we would have maybe 6 orders of magnitude more important scientific and technological advances per year. If we actually achieved “synergy”, that oft-theorized state where the accumulation of knowledge grows at a rate proportional to accumulated knowledge, we would have a fast take-off scenario, just without AI. dk/dt = k, dk/k = dt, ln(k) = t+C, k = Ce^t.
How much should the fact that we do not have a fast take-off of organizations make us more pessimistic about one with AIs being likely?
That’s the question. We should consider the overhead cost of knowledge, and the possibility that we will see a logarithmic increase in knowledge instead of a linear one (or, that we will see a linear one given an exponential explosion in resources).
Much depends on how you measure knowledge. If you count “bits of information”, that’s still growing exponentially. If you count “number of distinctions or predictions you can make in the world”, that probably isn’t.
There is a critical relationship between GDP and the efficiency of science. Until 1970, the money we put into science increased exponentially. Economic growth comes (I believe) exclusively from advances in science and technology. In 1970, we hit the ceiling; the fraction of GDP spent on science had grown exponentially until then, when it suddenly flattened, so that now resources spent on science grow only as fast as GDP grows. This should cause slower growth of GDP, causing a slower increase in scientific results, etc. IIRC there’s a threshold of scientific efficiency below which (theoretically) the area under the curve of scientific results, integrated out to infinity, is finite, and another threshold of efficiency above which (theoretically) the curve rises exponentially.
This alone doesn’t seem sufficient to explain the distribution of economic growth between countries. Most science, and most technology more than a generation old, is now public domain. But even if we go two generations back, US GDP/capita was ~$25K, which would still put it in the top quartile of modern countries. The countries at the bottom of the economic lists are often catching up, but not uniformly.
This sounds more like a conflation of the “availability” of S&T with the “presence” of S&T.
Technology being in the public domain does not mean the remote-savannah nomad knows how to use wikipedia, has been trained in the habit of looking for more efficient production methods, is being incentivized by markets or other factors to raising his productivity, or has at his disposal an internet-connected, modern computer, another business nearby that also optimizes production of one of his raw materials / business requirements, and all the tools and practical manuals and human resources and expertise to use them.
Long story short, there’s a huge difference between “Someone invented these automated farming tools and techniques, and I know they exist” and “I have the practical ability to obtain an automated farming vehicle, construct or obtain a facility complete with tools and materials for adjustment so I can raise livestock, contacts who also have resources like trucks (who in turn have contacts with means to sell them fuel), and contacts who can transform and distribute my products.”
The former is what you have when something is “public domain” and you take the time to propagate all the information about it. The latter, and all the infrastructure and step-by-step work required to get there, is what you need before the economic growth kicks in.
I believe the latter was being referred to by “advances in science and technology”.
Could you more clearly define the “presence” of S&T? Your examples like “automated farming tools are practical to obtain” sounds like a way of restating “both the availability of S&T and the economy are strong”, which does indeed imply that the economy will be strong, but “A ⇒ B” would have been a much more useful theory than “A && B ⇒ B”.
You’re making some argument that you think is implied by what you’ve said, but that I can’t see. I don’t see how the US of 2 generations ago having a high GDP is inconsistent with growth being a result of science and technology, unless you imagine science and technology are the same all over the world at a given time, which would be a strange thing to imagine.
(Side note: “Technology” here include organizational and management techniques.)
The use of science and technology isn’t the same all over the world at a given time, but the availability is remarkably close, don’t you think? What are the less developed countries left out on? ITAR-controlled products, trade secrets, and patents? For everything else they have access to the exact same journals.
Perhaps your side note is what’s critical: are there organizational and management techniques which are available in the United States but which we’ve successfully kept a secret internationally? Are multi-generational trade secrets the critical part of science and technology?
Or would other countries grow much faster if they just fully used public domain technology, but there’s some other factor X which is preventing them from using it? If the latter, then what is X, and wouldn’t it be a better candidate explanation for disparate economic growth?
This is a reasonable observation; yes, it is not obvious why every nation can’t jump straight to modern-nation productivity.
There are plenty of places in Africa where water purification is a great new technology, and plenty of places in China where closed sewage lines would be a great new technology. Why don’t they use them?
The stories I hear from very-low-tech countries usually emphasize cultural resistance. One guy installed concrete toilets in Africa, and people wouldn’t use them because concrete had negative connotations. People have tried plastic-water-bottle solar water purification in southeast Asia, and some concluded (according to Robin Hanson) that people wouldn’t put plastic bottles of water on their roof because they didn’t want the neighbors to know they didn’t have purified water. Another culture wouldn’t heat-sterilize water because their folk medicine was based on notions of what “hot” and “cold” do, and they believed sick people needed cold things, not hot things. There are many cases where people refused to believe there are invisible living things in water. (As Europeans also did at first.)
(Frequent hand-washing and checklists are technologies that could save many thousands of lives every year in US hospitals, but that are very difficult to get doctors to adopt.)
But lots of low-tech countries can’t afford anything that they can’t build themselves. How much of modern technology can be built with materials found on-site without any tools other than machetes, knives, and hammers? Mosquito netting is very valuable in some places, but impossible to manufacture in a low-tech way.
My short answer is that there are a variety of obstacles to applying any technology in a low-tech nation. But growth is only possible either by finding more resources, or by using existing resources more efficiently, and using resources more efficiently = technology.
If it were possible to have growth without technology—let’s say 1% growth every 10 years—then a society with medieval technology, and no technological change, would eventually become as productive per person as today’s modern countries. And that’s physically impossible, just using energy calculations alone. There may be other necessary conditions, but tech improvement is absolutely necessary.
Those are excellent answers; thank you.
Do you really think that things like Good Governance don’t have anything to do with economic growth? Science doesn’t help you much if a competitor pays a corrupt official to shut your business down.
What do you mean by this? We have plenty of composers and musicians today, and I’d bet that many modern prodigies can do the same kinds of technical tricks that Mozart could at a young age.
Good question, though doing technical tricks at a young age does not make one Mozart. I don’t mean that we don’t have 1000 composers as good as Mozart or Beethoven. I mean we don’t have 1000 composers recognized as being that good. We may very well have 10,000 composers better than Mozart, but we’re unable to recognize that many good composers.
This is conflated with questions of high versus pop art and accidents of history. Personally, I’m open to the idea that Mozart represents a temporary decline in musical taste—a period between baroque and romantic when people ate up the kind of pleasant, predictable pop music that Mozart churned out. He wrote some great stuff, but I think the bulk of what he wrote is soulless compared to equally-prominent baroque or romantic music.
I’d be really interested in reading more about this.
If you email philgoetz at gmail, I’ll send you a draft.
Er, or not. The number of publications per scientist has risen dramatically, but so has the number of authors per paper. I don’t know if these cancel each other out.
Compare the complexity of F=MA to string theory. The difficulty of science is going up by orders of magnitude as the low hanging fruit are eaten.
We have over 1000 genres of music. Sure, not everyone can be recognised as the best musician of a generation by definition, but I think we could arguably be producing 1000 times more good music than at the time of Mozart.
Don’t forget to compare their usefulness as well X-)
Good post.
First of all, knowledge is partially ordered. A bunch of lesser-known results were required before Einstein could bring together the mathematical tools and physics knowledge sufficient to create relativity. True enough, this finding may have come much later, if not for Einstein, but dozens of others built predecessor results that also required great insight.
Similarly, we should not decry the thousands of biologists who have been cataloging every single protein, its post-translational modifications and its protein-protein interactions in exhaustive detail. Some of this work requires a great deal of cleverness each time.
A portion of the phenomenon you are talking about can be addressed by referencing Kuhn: we have periods of normal science, with many people giving input and a building tension, until twenty pieces fall into place and (frequently interdisciplinary) thinkers visit a problem for the first time.
In other cases the critical breakthrough has to be facilitated by new tools that generate new breakthroughs. When these tools require advances in component technology, you have a large number of engineers, testers, and line workers feeding their talents into discoveries for which sometimes only a few get credit.
If these components require days or months of “burn-in” testing to judge their reliability, a superintelligence might have limited advantage over people in reducing the timeline.
Sometimes discovery relies on strings of experiments which by their nature require time and cannot be simulated. Our current knowledge of human biology requires that we follow patients for many years before we know all of the outcomes from a drug treatment.
Initially, at least, a superintelligent drug developer would still have to wait and see what happens when people are dosed with the drug over the course of many years.
If a cosmic event can only be observed once in a decade, a superintelligence would not have the data any sooner, short of inventing some faster-than-light physics we do not have today.
I’m confused about Bostrom’s definition of superintelligence for collectives. The following quotes suggest that it is not the same as the usual definition of superintelligence (greatly outperforming a human in virtually all domains), but instead means something like ‘greatly outperforming current collective intelligences’, which have been improving for a long time:
This seems strange, if so. It hasn’t been quite clear why we should care about the threshold of superintelligence in particular, but if it refers to different levels of capability for different kinds of entity, it seems hard for the concept to play an interesting role in our reasoning. Similarly if it is a moving and relative point.
If we want to claim that something special will happen when AI reaches a certain level of intelligence, it seems we should prima facie expect something similar to happen when organizations reach that level of intelligence. It has been unclear to me from the book so far whether Bostrom thinks organizations are currently superintelligent, by non-collective metrics of superintelligence, yet this seems an important point.
Present-day humanity is a collective intelligence that is clearly ‘superintelligent’ relative to individual humans; yet Bostrom expresses little to no interest in this power disparity, and he clearly doesn’t think his book is about the 2014 human race.
So I think his definitions of ‘superintelligence’ are rough, and Bostrom is primarily interested in the invincible inhuman singleton scenario: the possibility of humans building something other than humanity itself that can vastly outperform the entire human race in arbitrary tasks. He’s also mainly interested in sudden, short-term singletons (the prototype being seed AI). Things like AGI and ems mainly interest him because they might produce an invincible singleton of that sort.
Wal-Mart and South Korea have a lot more generality and optimization power than any living human, but they’re not likely to become invincibly superior to rival collectives anytime soon, in the manner of a paperclipper, and they’re also unlikely to explosively self-improve. That matters more to Bostrom than whether they technically get defined as ‘superintelligences’. I get the impression Bostrom ignores that kind of optimizer more because it doesn’t fit his prototype, and because the short-term risks and benefits prima facie seem much smaller, than because of any detailed analysis of the long-term effects of power-acquiring networks.
It’s important (from Bostrom’s perspective) that the invincible singleton scenario is defined relative to humans at the time it’s invented; if we build an AGI in 2100 that’s superintelligent relative to 2014 humans, but stupid relative to 2100 humans, then Bostrom doesn’t particularly care (unless that technology might lead to an AI that’s superintelligent relative to its contemporaries).
It’s also important for invincible singleton, at least in terms of selecting a prototype case, that it’s some optimizer extrinsic to humanity (or, in the case of ems and biologically super-enhanced humans—which I get the impression are edge cases in Bostrom’s conceptual scheme—the optimizer is at least extrinsic to some privileged subset of humanity). That’s why it’s outside the scope of the book Superintelligence to devote a lot of time to the risks of mundane totalitarianism, the promise of a world government, or the general class of cases where humanity just keeps gradually improving in intelligence but without any (intragenerational) conflicts or values clashes. Even though it’s hard to define ‘superintelligence’ in a way that excludes governments, corporations, humanity-as-a-whole, etc.
(I get the vague feeling in Superintelligence that Bostrom finds ‘merely human’ collective superintelligence relatively boring, except in so far as it affects the likely invincible inhuman singleton scenarios. It’s not obvious to me that Hansonian em-world scenarios deserve multiple chapters while ‘Networks and organizations’ deserve a fairly dismissive page-and-a-half mention; but if you’re interested in invincible singletons extrinsic to humanity, and especially in near-term AI pathways to such, it makes sense to see ems as more strategically relevant.)
Bostrom’s secondary interest is the effects of enhancing humans’ / machines’ / institutions’ general problem-solving abilities relative to ~2014 levels. So he does discuss things other than invincible singletons, and he does care about how human intelligence will change relative to today (much more so than he cares about superintelligence relative to, say, 900 BC). But I don’t think this is the main focus.
Thanks for your thoughts.
I’m curious about what makes organizations unlikely to explosively self-improve, if one thinks other entities that reach super-human intelligence are by default likely to do so. Is it just that organizations have been superintelligent for a while and have not exploded so far? But perhaps this is a better question for the kinetics of intelligence explosions week, when we discuss the reasons for thinking anything will explosively self-improve.
That’s a good question. There are trivial senses in which Wal-Mart could become ‘superintelligent’ if sufficiently powerful emulations or AIs joined the organization. So I gather we’re interested in:
1. Are there plausible ways for an organization to rapidly become an invincible singleton without recourse to AI or emulations? (And without just attaining de-facto-dominance by peacefully subsuming rivals, like a World Government.)
2. If certain kinds of organization rapidly became an invincible singleton via AI or emulation technologies, would their peculiarities importantly change the strategic picture?
Re 1, if a single organization invents (and successfully monopolizes) a technology that quickly gives it vastly more wealth (or vastly more destructive power) than the entire rest of the planet, it could attain a dominant advantage even if it isn’t technically exploding in ‘general intelligence’. (The advantage might not count as ‘general’ because it’s a single narrow superpower that just happens to be strong enough to trump every other agency. Or it might not count as ‘intelligence’ because it’s a resource advantage rather than an intrinsic capability.)
Closer to the spirit of ‘intelligence explosion’ would be an organization that comes up with a clever way to biologically enhance its humans (e.g., an amazing new nootropic) or enhance the speed with which humans share information or filter out bad ideas. All of these examples, like the ones in the previous paragraph, rely on there being a huge first-mover advantage—either it’s easy to hide the necessary insights from other organizations, or at some threshold point the insights have an enormous effect on an extremely small timescale, or other organizations for some reason don’t want to mimic the first one. (Perhaps the game-changing technology is extremely taboo, and the invincible singleton arises because only one organization is willing to break the taboo within the first few years of the tech’s availability.)
I think there are two fairly distinct questions: whether an organization is likely to rapidly become much more superintelligent than it is, and whether it is likely to do this without other organizations catching up. I mostly mean to ask about the first.
You mention several improvements an organization could make to their intelligence, however in an ‘intelligence explosion’ presumably there would be lots of improvements one after the other. I’m thinking of the kinds of things you mention, along with improving the nature of interactions and what individual humans do in the organization etc—there seem to be many possibilities. However I don’t mean to reason from the promisingness of any of these to the conclusion that there could be an organizational intelligence explosion. I rather mean to point out that the arguments for an AI intelligence explosion seem to apply just as well to other kinds of entity such as organizations, since they don’t seem to make any reference to being a software agent. So if you (reasonably, I think) don’t expect human organizations to undergo an ‘intelligence explosion’ soon, you need a story about how the argument does apply to AI but doesn’t apply to organizations. I don’t think such stories are that hard to come by, but it is good to think about.
What is this word ‘em’ you keep using?
Robin Hanson’s term for software-emulated humans.
Thanks for the very nice post.
Bostrom says that machines can clearly have much better working memory than ours, which can remember a puny 4-5 chunks of information (p60). I’m not sure why this is so clear, except that it seems likely that everything can be much better for machine intelligences given the hardware advantages already mentioned, and given the much broader range of possible machine intelligences than biological ones.
To the extent that working memory is just like having a sheet of paper to one side where you can write things, we more or less already have that, though I agree it could be better integrated. To the extent that working memory involves something more complicated, like the remembered ideas being actively juggled in some fashion in our minds, I see no clear (extra) reason that machines would do a lot better. I personally don’t have a good enough understanding of why our working memories are so small to begin with—clearly we have a lot more storage capacity in some sense, which is used for other memories.
WM raises issues of computational complexity which have so far been ignored. If working memory is the set of concepts that are currently being matched against each other, then the complexity of the matching is probably n^2. If it is the set of concepts all permutations of which are being matched against variables in rules, the complexity is n!. It’s easy to imagine cognitive architectures in which the computational capacity needed to handle 9 items in WM would be orders of magnitude higher than that needed to handle 5. I suspect that’s why our WM is so limited, particularly in light of the fact that WM appears to be highly correlated with intelligence (according to Michael Vassar).
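For a rough sense of those numbers (the matching architectures themselves are of course speculative), compare pairwise matching, subset exploration, and permutation matching for a few working-memory sizes:

```python
from math import comb, factorial

# Rough arithmetic behind the comment above: cost of matching every pair of
# working-memory items, every subset, and every ordering of them.
for n in (5, 7, 9):
    print(f"n={n}: pairs={comb(n, 2):>3}, subsets={2**n:>4}, orderings={factorial(n):>7}")
```

Going from 5 to 9 items leaves the pairwise count in the dozens but multiplies the number of orderings by roughly 3,000.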
We touched on how important WM is to intelligence last time, too, and there was some dispute. I think we need to find some results in the literature.
To me, this seems like an issue with the definition of ‘chunk of information’. Sure, maybe I can only remember a few at a time, but each chunk has a whole bunch of associations and connected information that I can access fairly easily. As such, my guess is that these chunks actually store a very large number of bits, and that’s why you can’t fit too many of them into short-term memory at once. Of course, you could do even better with better hardware etc., but this seems to just be an instance of the point “humans have finite memory, for every finite number there’s a bigger number, therefore we could make machines with more memory”.
And faster, by many orders of magnitude! Modern PCs already have RAM capacities on the order of tens of gigabytes. If we’re talking about simply written words (as opposed to arbitrary drawings, the storage space requirements for which are somewhat trickier), that’s not the equivalent of a page, or a book—it’s on the order of a library. And they can read or overwrite the whole thing within a few seconds.
Yes, in principle you can do the same thing by hand. In practice, writing out even one full RAM dump by hand would probably take longer than a current human lifetime.
That by itself would provide a speed-type superintelligence advantage, to the extent that flat (i.e. not associative to the same extent human memory is) memory is a limitation on our intelligence.
Concretely, we could imagine a model in which the brain automatically explored all ways of composing two concepts in working memory (as a kind of automatic architectural feature), or even did something more elaborate (e.g. explored all possible subsets). In this scenario, it would be very expensive to scale up the size of working memory while retaining the same characteristics, though it wouldn’t be an in principle obstruction.
If people were ten times faster, how much faster would economic growth be?
Haven’t read the book, so forgive me if this is covered.
It is easier to imagine this as the laws of physics being 10 times slower, then multiply everything by 10.
[begin slow-physics POV]
Manufacturing and agriculture slow significantly. Everything weighs a tenth of its current weight, but muscles and engines are 10 times weaker. Inertia and viscosity become important. The marginal cost per km of shipping goes up by about a factor of 10, but the fixed costs of shipping remain about the same. With the fall in trade, specialization decreases. Solar radiation is 10 times weaker, and plants grow 10 times slower. Renewable energy sources become 1/10th as appealing (non-renewables are better). Smelting, tempering, and other time-bound processes proceed at 1/10th the current rate. Capital-intensive manufacturing steps become 10 times as capital intensive. Real wages fall.
The service sector (which is a majority of GDP) is not nearly as slowed. Since the day is now 240 hours, sleep cycles and work schedules are no longer connected with the sun. The whole planet can coordinate on a work schedule, maybe with eight 30-hour sleep cycles per Earth rotation. Computers are as fast as in the mid 2000s. Email and other telecommunication is virtually unaffected (voice data increases by 10x, but voice data is small these days). People type slower, but we can easily invent less laborious keyboards. Commutes are difficult, as cars race on highways at 10 km/h but are only as maneuverable as cars moving at 100 km/h in our world. People move closer to their offices and telecommuting becomes much more commonplace.
All in all, it seems like a slowdown in physics would cause a slowdown in our economy, but we would rapidly adapt.
[end slow-physics POV]
The rapid change in comparative advantages would immediately expose much low-hanging fruit (moving to more labor-intensive manufacturing techniques, for example). Growth would increase rapidly as industry re-optimizes for the new high-speed labor force. Growth slows after this low-hanging fruit is plucked.
On this subject, I’d like to link Stanovich on intelligence amplification.
Why are these smart people making all of these mistakes?
-Intelligence testing fails to incorporate your chance of falling prey to cognitive bias.
-Intelligence testing does not test the quality of your information sources
-Intelligence testing does not test how well you control your response to negative stimulus.
It sounds like we’re having to add in a series of other improvements to get an acceptable übermensch, not just more speed or higher IQ.
I wonder if it is because the mistakes are so extreme that the original failures that led to them can’t just be ones of poor problem-solving. If an IQ test asks what can be logically inferred from some evidence, I’d expect smarter people to do better. If an IQ test asks whether astrology can be logically inferred from no evidence, I expect very low levels of intelligence to be required to figure this one out. So to the extent that people answer ‘yes’, it is for some non-intellectual reasons which are more or less uncorrelated with intelligence.
Do you have evidence that intelligence isn’t correlated with all those three? IIRC Kahneman’s research has pointed out that intelligence seems to protect against certain types of cognitive biases.
Very smart people, like everyone else, have emotional responses to situations, then fit a set of rationalizations to what their emotions tell them in the first place.
I would like to give better answers in the aggregate; maybe we can gather some more evidence. For now I’ll just give a few well-known examples of people who would’ve done very well on IQ tests:
Ayatollah Khomeini, Hermann Goering, Alan Turing, and Friedrich Nietzsche, the last two of whom sadly ended their own lives. Unfortunately, that is fairly common among the highest IQ scorers.
Intelligence is not the only personality trait to consider.
Difficult question. Do you mean also ten times faster to burn out? 10x more time to rest? Or, due to being a simulation, no rest at all, just a reboot?
Or a permanent reboot to a drug-boosted level of brain emulation on a ten times quicker substrate? (I am afraid of a drugged society here.)
And I am also afraid that a ten times quicker farmer could not have ten summers per year. :) So economic growth could be limited by some bottlenecks. Probably not much faster.
What about ten times faster philosophical growth?
Breaking this down a bit:
-Ten times faster does not help people driving vehicles that much, unless they can use the time to multi-task.
-For people who carry things around to complete their job, or manipulate objects, we have to decide whether they are physically able to do those actions faster, or are they just able to think faster.
Presumably, we’re holding physical running, walking and carrying speeds the same, and people are thinking a lot faster. Thus, people can plan their farming enterprise much more effectively, but they still need a lot of people to actually pick the crops.
In this scenario, we would end up with a lot of people who have an expert-level mastery of many professions. Since it does not take much time to learn anymore, a lot of people would have an MD, a JD, an MBA, and triple PhDs, know how to operate dozens of pieces of equipment, know ten languages, and be able to design and build their own house or office.
Apparently, all of that happens for some people before their 10th birthday. But what would this speed-up mean for their emotional development? Do they start working at 5 years old? Do we end up with very expert people who still cannot manage their temperament, thereby making them very effective threats?
One good question is the degree to which speed translates into quality: how many more people would be able to write a coherent 100-page document with this new ability? Not clear.
Some would use all of the additional time to play 10x more video games, watch 10x more soap operas or to delve deeper into Sufi mysticism. A segment of the population would use these gifts well, others would not.
Team leaders frequently would do much more of the work of their projects themselves, since after their interdisciplinary education they would not require specialists for many tasks. However, when necessary, these leaders could have fifty people reporting directly to them, rather than just five.
Ten times faster communication between people would allow them to correct many social mistakes prior to damage, to make very detailed business contracts, and sometimes to avoid armed conflict. Critical relationships where both parties have an incentive toward preserving the relationship would improve.
There would be more time to make sure that you were buying the right product, and more time to make sure that your sales pitch was as effective as possible. However, sometimes that would only mean 20% more sales resulting in a 15% better product purchased.
The queues in store check-out lines evaporate, and wait times on phone interactions with businesses and governments also fall. An array of telepresent services becomes possible.
At the same time, however, our ability to search for new relationships would also improve.
In the event of business competition, some strategic advantages would multiply, others would be neutralized. The business that can make purchases for 3% less has a much greater advantage and defeats equal competition much more quickly.
Identifying your adversaries before they identify you would become extremely important, because you could set up a multi-step plan to defeat them quite quickly, or serve ten people with just a slightly better product where you otherwise would’ve only gotten one.
I think there are two better explanations.
First, assuming that philosophical questions have answers, the tools needed to find those answers will be things like evolutionary psychology, artificial intelligence, statistics, linguistics, and cultural anthropology, not the topics addressed in undergraduate philosophy courses. Graduate courses emphasize logic, which is better, but 20th century philosophy showed mainly how logic fails when applied to philosophical questions. Philosophers (or, to paraphrase Aristotle, “meta-physicists”) should be meta-scientists, trained in all branches of science.
Second, as time goes on, we have to try harder and harder not to see the answers to the “eternal problems” lying in front of our noses, because we’re still hoping to find different answers.
What would explain all the questions to which we are unwilling to accept the answers falling in the domain of philosophy? Or are these merely the ones where we are not forced yet to accept them?
What are some possible but non-realized cognitive talents that an artificial intelligence could have, analogous to our talent for interpreting visual scenes? (p57)
Eliezer Yudkowsky has written about the idea of a “codic cortex”; that is, a specialized mental module for modelling the behavior of executable code.
And something like that would be really useful! For instance, there’s fundamentally no good reason to have any implementation bugs when writing code, or to not easily notice them when reading it. The techniques for how to prove code correctness are well known; but in practice, for a human programmer to actually use them is so expensive (in terms of productivity) that it’s usually more efficient to skip them, and find and fix bugs after the fact.
This is despite the fact that humans already have (in absolute terms) very good models of how the code they write works; if we didn’t, we couldn’t do nontrivial programming at all. But those models are sloppy, and there are some details that are easy for humans to miss. If we built the models instead from a formal and precise analysis of the code at hand, we’d get much better predictions out of them.
A lot of programming language/environment development has been concerned with having the compiler or runtime handle certain things (translating high-level structures to machine code, garbage collection, type safety, etc.) so that the programmer doesn’t need to worry about getting them right—both so they don’t end up getting them wrong, and so that having to painstakingly get them right doesn’t drain their productivity. But such approaches usually come with some performance cost, and in the end they’re crutches to deal with the fact that humans are no good at programming. None of them would be necessary for an intelligence that had a decent specialized module to handle code modelling.
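As a toy, hypothetical illustration of the kind of mechanical bookkeeping being described (not anything the original comment specifies): stating a loop invariant explicitly and having the machine check it on every iteration, which human programmers almost never bother to do by hand.

```python
# A minimal sketch of machine-checked reasoning about code: an explicit loop
# invariant verified on every iteration. Real correctness proofs use dedicated
# tools (Hoare logic, model checkers, proof assistants); the assert here just
# illustrates the kind of bookkeeping involved.

def sum_to(n):
    total = 0
    for i in range(1, n + 1):
        total += i
        # Invariant: after adding i, total equals i * (i + 1) / 2
        assert total == i * (i + 1) // 2
    return total

assert sum_to(10) == 55
```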
When programming, I frequently find myself kind of guessing at what code will work, pumping some input through the function I write and checking to see if the output is consistent with my expectations.
I do not always bother to figure out whether a loop should start at zero or one before I just try it.
Yes, this can cause problems, but that process seems to run counter to what is said here.
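A minimal sketch of that guess-and-check workflow; the function and the test values are made up for illustration, not taken from the comment.

```python
# Write the code on a guess, pump a known input through, compare with expectations.

def running_totals(values):
    """Cumulative sums; should the loop index start at 0 or 1? Guess, then test."""
    totals = []
    acc = 0
    for i in range(0, len(values)):  # first guess: start at 0
        acc += values[i]
        totals.append(acc)
    return totals

# Check the output against a hand-computed expectation; adjust the bounds if it fails.
assert running_totals([1, 2, 3]) == [1, 3, 6]
print("matches expectations")
```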
As you say, there are differences between individual human programmers in just how detailed their models about code behavior are (and indeed, for the same human based on mental state; lack of concentration can lead to sloppiness in that activity as anywhere else).
Even so, I maintain that if your software works at all, you have had a much-better-than-nothing model in your mind (and conversely, if you ever have implementation bugs, that model is not perfect). You might not precisely model the right way to start a loop, but you probably had a good reason to put a loop there—as opposed to a triple pointer dereference or something—and to make it dependent on certain specific data (even if the precise dependence isn’t clear in your mind without experimentation), as opposed to just trying out random variables you have lying around as targets for (approximate) loop iteration count.
This is in contrast to an unintelligent process like e.g. natural evolution, which would try entirely random things and simply look at how well they perform. You couldn’t reasonably program anything in that manner with human typing speeds; the number of attempts would make it infeasible.
Autonomous systems are advancing toward being able to create high-precision 3D renderings of indoor or outdoor locations in milliseconds, then recognize the objects in those locations. This technology has room to improve rapidly, and we may suddenly find that it dramatically exceeds our own abilities without a corresponding level of progress toward what people are calling AGI.
I recommend Goertzel “Kinds of minds”, Chapter 2 (pp 14 ff) in The Hidden Pattern, on this topic.
As pointed out in note 14, humans can solve all computable problems, because they can carry out the steps of running a Turing machine (very slowly), which we know/suspect can do everything computable. It would seem then that a quality superintelligence is just radically faster than a human at these problems. Is it different to a speed superintelligence?
If you continuously improve a system’s speed, then the speed with which each fixed task can be accomplished will be continuously reduced. However, if you continuously improve a system’s quality, then you may see discontinuous jumps in the time required to accomplish certain tasks. So if we think about these dimensions as possible improvements rather than types of superintelligence, it seems there is a distinction.
This is something which we see often. For example, I might improve an approximation algorithm by speeding it up, or by improving its approximation ratio (and we see both kinds of improvement, in theory and in practice). In the former case, every problem gets 10% faster with each 10% improvement. In the latter case, there are certain problems (such as “find a cut in this graph which is within 15% of the maximal possible size”) for which the running time jumps discontinuously overnight.
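A toy sketch of that asymmetry, with made-up numbers: a speedup shaves a constant fraction off every instance, while crossing an approximation-ratio threshold suddenly makes a whole class of instances feasible.

```python
# Speed improvement: continuous, uniform across problems.
def runtime_after_speedup(base_seconds, speedup=0.10):
    return base_seconds * (1 - speedup)

# Quality improvement: "find a cut within 15% of optimal" becomes feasible only
# once the algorithm's guaranteed ratio reaches 0.85; below that, no rerun helps.
def solvable(target_ratio, guaranteed_ratio):
    return guaranteed_ratio >= target_ratio

print(runtime_after_speedup(100.0))           # 90.0
print(solvable(0.85, guaranteed_ratio=0.80))  # False
print(solvable(0.85, guaranteed_ratio=0.87))  # True: a discontinuous jump
```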
You see a similar tradeoff in machine learning, where some changes improve the quality of solution you can achieve (e.g. reducing the classification error) and others let you achieve similar quality solutions faster.
This seems like a really important distinction from the perspective of evaluating the plausibility of a fast takeoff. One question I’d love to see more work on is exactly what is going on in normal machine learning progress. In particular, to what extent are we really seeing quality improvements, vs. speed improvements + an unwillingness to do fine-tuning for really expensive algorithms? The latter model is consistent with my knowledge of the field, but has very different implications for forecasts.
If we push ourselves a bit, I think we can establish the plausibility of a fast takeoff. We have to delve into the individual components of intelligence deeply, however.
Thinking about discontinuous jumps: So, improving a search algorithm from order n squared to order n log(n) is a discontinuous jump. It appears to be a jump in speed...
However, using an improved algorithm to search a space of possible designs, plans or theorems an order of magnitude faster could seem indistinguishable from a jump in quality.
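For concreteness, a small timing sketch (the task and sizes are arbitrary, chosen only for illustration): the same question answered by an O(n²) pairwise scan and by an O(n log n) sort-and-scan.

```python
import random
import time

def has_duplicate_quadratic(xs):
    # Compare every pair: O(n^2)
    return any(xs[i] == xs[j] for i in range(len(xs)) for j in range(i + 1, len(xs)))

def has_duplicate_sorted(xs):
    # Sort once, then scan neighbours: O(n log n)
    ys = sorted(xs)
    return any(a == b for a, b in zip(ys, ys[1:]))

xs = random.sample(range(10**7), 5000)
for f in (has_duplicate_quadratic, has_duplicate_sorted):
    start = time.perf_counter()
    f(xs)
    print(f.__name__, round(time.perf_counter() - start, 4), "seconds")
```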
Reducing error rates seems like an improvement in quality, yet it may be possible to reduce error rates, for example, by running more trials of an experiment. Here, speed seems to have produced quality.
Going the other way around, switching a clinical trial from a frequentist design to an adaptive Bayesian design seems like an improvement in quality, yet the frequentist trial can be made just as valid if we run more trials. An apparent improvement in quality is overcome by speed.
I think that statement is misleading, here. To solve a real-world problem on a TM, you do need to figure out an algorithm that solves your problem. If a Dark Lord showed up and handed me a (let’s say ridiculously fast compared to any computer realizable on what looks like our physics) UTM—and I then gave that UTM to a monkey—the monkey may have a fairly good idea of what it’d want (unlimited bananas! unlimited high-desirability sex partners!), but it wouldn’t have any idea of how to use the UTM to get it.
If I tried to use that UTM myself, my chances would probably be better—I can think of some interesting and fairly safe uses to put a powerful computer to—but it still wouldn’t easily allow me to change everything I’d want changed in this world, or even give me an easy way to come up with a really good strategy to doing so. In the end, my mental limits on how to decide on algorithms to deal with specific real-world issues would still be very relevant.
Humans cannot simulate a Turing machine because they are too inaccurate.
If humans merely fail at any particular mechanical operation with 5% probability, then of course you could implement your computations in some form that was resistant to such errors. Even if you had a more complicated error pattern, where you might e.g. fail in a byzantine way during intervals of power-law distributed length, or fail at each type of task with 5% probability (but would fail at the task every time if you repeated it), then it seems not-so-hard to implement Turing machines in a robust way.
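A minimal sketch of one such robustness trick, under the stated assumption of a 5% independent per-step failure rate: repeat each step and take a majority vote.

```python
import random
from collections import Counter

def noisy_step(correct_bit, p=0.05):
    # With probability p, the error-prone operator writes down the wrong symbol.
    return correct_bit if random.random() > p else 1 - correct_bit

def robust_step(correct_bit, repeats=5, p=0.05):
    # Perform the step several times and keep the majority answer.
    votes = Counter(noisy_step(correct_bit, p) for _ in range(repeats))
    return votes.most_common(1)[0][0]

trials = 100_000
errors = sum(robust_step(1) != 1 for _ in range(trials))
print(errors / trials)  # roughly 1e-3 per step instead of 5e-2
```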
In one of my classes I emulated a Turing machine.
Based on that experience, I am going to say that a massive team of people would have a hard time with the task.
If you want to understand the limits of human accuracy in this kind of task, you can look at how well people do double-entry bookkeeping. It’s a painfully slow and error-prone process.
Error rates are a fundamental element of intelligence, whether we are taking a standardized test or trying to succeed in a practical environment like administering health care or driving.
The theoretical point is interesting, but I am going to argue that error rates are fundamental to intelligence. I would like some help with the nuances.
There may be a distinction to be made between an agent who could do any intellectual task if they carried out the right procedure, and an agent who can figure out for themselves which procedure to perform. While most humans could implement a Turing machine of some kind if they were told how, and wanted to, it’s not obvious they could arrange this from their current state.
That’s a separate topic from error rate, which I still want help with, but also interesting.
Figuring out what procedure to perform is a kind of design task.
Designing includes:
- Defining goals and needs
- Defining a space of alternatives
- Searching a space of alternatives, hopefully with some big shortcuts
- Possibly optimizing
- Testing and iteration
Design is something that people fail at, over and over. They are successful enough of the time to build civilizations.
I feel that design is a fundamental element of quality and collective intelligence. I would love to sort through it in more detail.
Bostrom offers the skills of isolated hunter-gatherer bands as support for the claim that the achievements of humans are substantially due to our improved cognitive architecture over that of other sophisticated animals, rather than due to our participation in a giant collective intelligence (p57). However, as he notes in footnote 13, this is fairly hard to interpret, because isolated hunter-gatherer tribes are still part of substantially larger groups—at a minimum, including many earlier generations, who passed down information to them via language. If humans merely had unusually good abilities to accumulate knowledge from one another, it is then unclear why isolated hunter-gatherer bands should have any fewer capabilities than they do. Do I infer too little from this evidence? Is there not better evidence for this thesis, e.g. based on chimpanzee performance on intellectual tasks?
There’s interesting work on how culture accumulates as a function of distinct kinds of imitation. Richerson and Boyd have done much of the theorizing. Tim Tyler, who frequently posts on LessWrong, wrote a book on memetics which might give some hints. Finally, Alex Mesoudi has specifically studied levels of optimality in different copying strategies.
Are there forms of superintelligence Bostrom missed?
Define intelligence how you like; there are a few capabilities not on this list which will significantly influence the capabilities of autonomous systems, including how beneficial, threatening, or capable they are:
Sensory systems:
Both machines, and future collective intelligences of people and machines, are benefiting from an exponential explosion in the ability to sense. Everything is improving at a vast pace: telescopes, microscopes, detection in every wavelength, sound, chemistry, DNA/RNA sequencing, mass spectrometry, surveillance in thousands of locations, the internet of things; the list goes on and on.
If we want to understand the future of intelligence, we have to incorporate the explosion of sensory input into our thinking.
I am not ready to claim that spirituality is irrelevant to the discussion. The difference between factual knowledge, calculation speed and “wisdom” does seem relevant, as others have pointed out earlier in the thread.
However, we’ll have to re-frame questions about spiritual issues in some way to bring it in...
Suppose one aspires to become a Bodhisatva: a being who is capable of entering into an unfettered celestial existence, but who instead stays behind to help other people and sentient beings find their way. (If we are a bit less ambitious, perhaps we might, depending on our tradition, aspire to become a tzadik, a marja, or someone saintly, but not a supernatural savior.)
Among other things, the Bodhisatvas have removed themselves from ego and forms of physical desire. They are spiritually superior and ready to move outside of the cycle of reincarnation as lower beings. Yet, when they take the form of a human, they would eat as an instrumental goal toward accomplishing their end.
The Bodhisatva’s kind of spiritual superiority does seem to differ from what Bostrom and the rest of us call “Superintelligence.” It may, however, relate in some ways...
Just a thought experiment...
For example, a peculiar version of a Bodhisatva might be said to have a utility function, in fact a marvelous, selfless utility function. Also, it will succeed at bringing more people out of the darkness if it engages in recursive self-improvement of its capabilities.
One aspires to become a Buddha or a Bodhisatva, and the claim is made by some that they, others who they know, or historical figures have reached this condition. For the most part, however, this goal is aspirational in nature and works to improve people’s behavior toward one another, and toward other sentient beings, in this life.
Based on what we now know about neurophysiology, the human brain is going to have a lot of trouble reaching any form of true enlightenment on its own. Emotion and desire re-surface even in the most virtuous or contemplative among us. Seemingly, we are never going to be truly saintly for the rest of our lives, although perhaps we can manage to have some pretty virtuous days interspersed among our failures.
I am not ready to make a proposal, but if enlightenment or spiritual purification is our goal, we might have to resort to some kind of augmentation to make it to the next step in that direction. Perhaps a lot of meditation or ethical contemplation is one approach, perhaps we may think of others...
Strictly an augmentation of intelligence, however, is not necessarily going to bring about this outcome.
The ability for autonomous systems to manipulate objects in the real world may relate to other forms of intelligence, but understanding future threats and benefits requires that we break this set of skills out and consider it separately.
We already live in an age of autonomous vehicles. To understand the future, we have to forecast how this revolution will play in with other advances in intelligence.
The forward progress of robotics is different from the intelligences in this chapter. Robotics may benefit from AI planning and object recognition, but robots are hardware/software solutions.
Superintelligence will seem a lot more “super” if fundamental issues relating to object recognition, navigation indoors and object manipulation are solved first.
The analysis improves if we separate out and take a closer look at the differences between memory and reasoning.
How strongly does the fact that neurons fire ten million times less frequently than rates of modern microprocessors suggest that biological brains are radically less efficient than artificial minds could be? (p59)
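For rough orders of magnitude, assuming the figures usually cited in this context (peak neuron firing rates around 200 Hz versus processor clocks around 2 GHz):

```python
print(2e9 / 200)  # 1e7: the "ten million times" factor
```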
A critical question with neurons is how to account for the amount of internal state they contain.
A cell can be in a huge number of internal states. Simulating a single cell in a satisfactory way will be impossible for many years. What portion of this detail matters to cognition, however? If we have to consider every time a gene is expressed or protein gets phosphorylated as an information processing event, an awful lot of data processing is going on within neurons, and very quickly.
It appears that vastly simplifying all of this detail in simulation may work out pretty well, but there is a big argument between Markram and IBM’s neuromorphic people about this issue.
We really need to delve deep on this and get all of the latest thinking in one place.
I agree not only with this sentence, but with this entire post. Which of the many, many degrees of freedom of a neuron are “housekeeping” and don’t contribute to “information management and processing” (quotes mine, not SteveG’s) is far from obvious, and it seems likely to me that, even with a liberal allocation of the total degrees of freedom of a neuron to some sub-partitioned equivalence class of “mere” (see following remarks for my reason for quotes) housekeeping, there are likely to be many, many remaining nodes in the directed graph of that neuron’s phase space that participate in the instantiation and evolution of an informational state of the sort we are interested in (non-housekeeping).
And, this is not even to mention adjacent neuroglia, etc, that are in that neuron’s total phase space, actively participating in the relevant (more than substrate-maintenance) set of causal loops—as I argued in my post that WBE is not well-defined, a while back.
Back to what SteveG said about the currently unknown level of detail that matters (to the kind of information processing we are concerned with … more later about this very, very important point); for now: we must not be too temporally-centric, i.e. thinking that the dynamically evolving information processing topology that a neuron makes relevant contributions to, is bounded, temporally, with a window beginning with: dendritic and membrane level “inputs” (receptor occupation, prevailing ionic environment, etc), and ending with: one depolarization -- exocytosis and/or the reuptake and clean-up shortly thereafter.
The gene expression-suppression and the protein turnover within that neuron should, arguably, also be thought of as part of the total information processing action of the cell… leaving this out is not describing the information processing act completely. Rather, it is arbitrarily cutting off our “observation” right before and after a particular depolarization and its immediate sequelae.
The internal modifications of genes and proteins that are going to affect future information processing (no less than training of an ANN affects the future behavior of the ANN within that ANN’s information ecology) should be thought of, perhaps, as a persistent type of data structure in itself. LTP of the whole ecology of the brain may occur on many levels beyond canonical synaptic remodeling.
We don’t know yet which ones we can ignore—even after agreeing on some others that are likely substrate maintenance only.
Another way of putting this or an entwined issue is: What are the temporal bounds of an information processing “act”? In a typical Harvard architecture substrate design, natural candidates would be, say, the time window of a changed PSW (processor status word), or PC pointer, etc.
But at a different level of description, it could be the updating of a Dynaset, a concluded SIMD instruction on a memory block representing a video frame, or anything in between.
It depends, that is, on both the “application” and aspects of platform architecture.
I think it productive, at least, to stretch our horizons a bit (not least because of the time dilation of artificial systems relative to biological ones—but again, this very statement itself has unexamined assumptions about the window—spatial and temporal—of a processed / processable information “packet” in both systems, bio and synthetic) and remain open about assumptions about what must be actively and isomorphically simulated, and what may be treated like “sparse brain” at any given moment.
I have more to say about this, but it fans out into several issues that I should put in multiple posts.
One collection of issues deals with: is “intelligence” a process (or processes) actively in play; is it a capacity to spawn effective, active processes; or is it a state of being, like an occurrent knowing occupying a subject’s specious present, like one of Whitehead’s “occasions of experience”?
Should we get right down to, and at last stop finessing around, the elephant in the room: the question of whether consciousness is relevant to intelligence, and if so, when should we head-on start looking aggressively and rigorously at retiring the Turing Test, and supplanting it with one that enfolds consciousness and intelligence together, in their proper ratio? (This ratio is to be determined, of course, since we haven’t even allowed ourselves to formally address the issue with both our eyes—intelligence and consciousness—open. Maybe looking through both issues confers insight—like depth vision, to push the metaphor of using two eyes.)
Look, if interested, for my post late tomorrow, Sunday, about the three types of information (at least) in the brain. I will title it as such, for anyone looking for it.
Personally, I think this week is the best thus far, in its parity with my own interests and ongoing research topics. Especially the 4 “For In-depth Ideas” points at the top, posted by Katja. All 4 are exactly what I am most interested in, and working most actively on. But of course that is just me; everyone will have their own favorites.
It is my personal agony (to be melodramatic about it) that I had some external distractions this week, so I am getting a late start on what might have been my best week.
But I will add what I can, Sunday evening (at least about the three types of information, and hopefully other posts). I will come back here even after the “kinetics” topic begins, so those of you who are interested in Katja’s 4 in-depth issues might wish to look back here later next week, as well as Sunday night or Monday morning, if you are interested in those issues as much as I am.
I am also an enthusiast for plumbing the depths of the quality idea, as well as, again, point number one on Katja’s “In-depth Research” idea list for this week, which is essentially the issue of whether we can replace the Turing Test with—now my own characterization follows, not Katja’s, so “blame me” (or applaud if you agree) -- something much more satisfactory, with updated conceptual nuance representative of cognitive sciences and progressive AI as they are (esp the former) in 2015, not 1950.
By that I refer to theories, less preemptively suffocated by the legacy of logical positivism, which has been abandoned in the study of cognition and consciousness by mainstream cognitive science researchers; physicists doing competent research on consciousness; neuroscience and physics-literate philosophers; and even “hard-nosed” neurologists (both clinical and theoretical) who are doing down and detailed, bench level neuroscience.
As an aside, a brief look around confers the impression that some people on this web site still seem to think that being “critical thinkers” is somehow to be identified with holding (albeit perhaps semi-consciously) the scientific ontology of the 19th century, and subscribing to philosophy-of-science of the 1950′s.
Here’s the news, for those folks: the universe is made of information, not Rutherford-style atoms, or particles obeying Newtonian mechanics. Ask a physicist: naive realism is dead. So are many brands of hard “materialism” in philosophy and cognitive science.
Living in the 50′s is not being “critical”, it is being uninformed. Admitting that consciousness exists, and trying to ferret out its function, is not new-agey, it is realistic. Accepting reality is pretty much a necessary condition of being “less wrong.”
And I think it ought to be one of the core tasks we never stray too far from, in our study of, and our pursuit of the creation of, HLAI (and above.)
Okay, it’s late Saturday evening, and I was loosening my tie a bit… and, well, now I’ll get back to what contemporary bench-science neurologists have to say, to shock some of us (it surprised me) out of our default “obvious” paradigms, even our ideas about what the cortex does.
I’ll try to post a link or two in the next day or two, to illustrate the latter. I recently read one by neurologists (research and clinical) who study children born hydranencephalic (basically, just a spinal column and medulla, with an empty cavity full of cerebrospinal fluid in the rest of their cranium). You won’t believe what the team in this one paper presents about consciousness in these kids. Large database of patients over years of study. And these neurologists are at the top of their game. It will have you rethinking some ideas we all thought were obvious about what the cortex does. But let me introduce that paper properly, when I post the link, in a future message.
Before that, I want to talk about the three kinds of information in the brain -- maybe two, maybe four, but with important categorical differences (thermodynamic vs. semantic-referential, for starters) -- and what it means to those of us interested in minds and their platform-independent substrates, etc. I’ll try to have something about that up here Sunday night sometime.
No, information ontology isn’t a done deal.
Well, I ran several topics together in the same post, and that was perhaps careless planning. And, in any case I do not expect slavish agreement just because I make the claim.
And, neither should you, just by flatly denying it, with nary a word to clue me in about your reservations about what has, in the last 10 years, transitioned from a convenient metaphor in quantum physics, cosmology, and other disciplines, to a growing consensus about the actual truth of things. (Objections to this growing consensus, when they actually are made, seem to be mostly arguments from guffaw, resembling the famous “I refute you thus” joke about Berkeleyan idealism.)
By the way, I am not defending Berkeleyan idealism, still less the theistic underpinning that kept popping up in his thought (I am an atheist.)
Rather, as with most thinkers who cite the famous joke about someone kicking a solid object as a “proof” that Berkeley’s virtual phenomenalism was self-evidently foolish, the point of my usage of that joke is to show that it misses the point. Of course it seems, phenomenologically, like the world is made of “stuff”.
And information doesn’t seem to be “real stuff.” (The earth seems flat, too. So what?)
Had we time, you and I could debate the relative merits of an information-based, scientifically literate metaphysics, with whatever alternate notion of reality you subscribe to in its place, as your scientifically literate metaphysics.
But make no mistake, everyone subscribes to some kind of metaphysics, just as everyone has a working ontology—or candidate, provisional set of ontologies.
Even the most “anti-metaphysical” theorists are operating from a (perhaps unacknowledged) metaphysics and working ontology; it is just that they think theirs, because it is invisible to them, is beyond need of conceptual excavation and clarification, and beyond the reach of critical, rational examination—whereas other people’s metaphysics is actually a metaphysics (argh), and thus carries an elevated burden of proof relative to their ontology.
I am not saying you are like this, of course. I don’t know your views. As I say, it could be the subject of a whole forum like this one. So I’ll end by saying disagreement is inevitable, especially when I just drop in a remark as I did, about a topic that is actually somewhat tangential (though, as I will try to argue as the forum proceeds, not all that tangential.)
Yes, Bostrom explicitly says he is not concerned with the metaphysics of mind, in his book. Good for him. It’s his book, and he can write it any way he chooses.
And I understand his editorial choice. He is trained as a philosopher, and knows as well as anyone that there are probably millions of pages written about the mind body problem, with more added daily. It is easy to understand his decision to avoid getting stuck in the quicksand of arguing specifics about consciousness, how it can be physically realized.
This book obviously has a different mission. I have written for publication before, and I know one has to make strategic choices (with one’s agent and editor.)
Likewise, his book is also not about “object-level” work in AI—how to make it, achieve it, give it this or that form, give it “real mental states”, emotion, drives. Those of us trying to understand how to achieve those things, still have much to learn from Bostrom’s current book, but will not find intricate conceptual investigations of what will lead to the new science of sentience design.
Still, I would have preferred if he had found a way to “stipulate” Conscious AI, along with speed AI, quality AI, etc., as one of the flavors that might arise. Then we could address questions under 4 headings, 4 possible AI worlds (not necessarily mutually exclusive, just as the three from this week are not mutually exclusive).
The question of the “direct reach” of conscious AI, compared to the others, would have been very interesting.
It is a meta-level book about AI, deliberately ambiguous about consciousness. I think that makes the discussion harder, in many areas.
I like Bostrom. I’ve been reading his papers for 10 or 15 years.
But avoiding or proscribing the question of whether we have consciousness AND intelligence (vs. simply intelligent behavior sans consciousness), and thus pruning away, preemptively, issues that could depend on whether they interact, whether the former increases causal powers (or instability or stability) in the exercise of the latter, and so on, keeps lots of questions inherently ambiguous.
I’ll try to make good on that last claim, one way or another, during the next couple of weekly sessions.
A growing consensus isn’t a done deal.
It’s a matter of fact that information ontology isn’t the established consensus in the way that evolution is. You are entitled to opinions, but not to pass off opinions as fact. There is enough confusion about physics already.
You bring in the issue of objections to information ontology. The unstated argument seems to be that since there are no valid objections, there is nothing to stop it becoming the established consensus, so it is as good as established.
What would a universe in which information is not fundamental look like, as opposed to one where it is? I would expect a universe where information is not fundamental to look like one where information always requires some physical, material or energetic, medium or carrier—a sheet of paper, a radio wave, a train of pulses going down a T1 line. That appears to be the case.
I am not sure why you brought Bostrom in. For what it’s worth, I don’t think a Bostrom style mathematical universe is quite the same as a single universe information ontology.
I don’t know who you think is doing that, or why you brought it in. Do you think IO helps with the mind-body problem? I think you need to do more than subtract the stuffiness from matter. If we could easily see how a rich conception of consciousness could supervene on pure information, we would easily be able to see how computers could have qualia, which we can’t. We need more in our ontology, not less.
I have to confess that I might be the one person in this business who never really understood the concept of supervenience—either “weak supervenience” or “strong supervenience.” I’ve read Chalmers, Dennett, the journals on the concept… never really “snapped-in” for me. So when the term is used, I have to just recuse myself and let those who do understand it, finish their line of thought.
To me, supervenience seems like a fuzzy way to repackage epiphenomenalism, or to finesse some kind of antinomy (for them), like, “can’t live with eliminative materialism, can’t live with dualism, can’t live with type-type identity theory, and token-token identity theory is untestable and difficult even to give logical necessary and sufficient conditions for, so… let’s have a new word.”
So, (my unruly suspicion tells me) let’s say mental events (states, processes, whatever) “supervene” on physiological states (events, etc.)
As I say, so far, I have just had to suspend judgement and wonder if some day “supervene” will snap-in and be intuitively penetrable to me. I push all the definitions, and get to the same place—a “I don’t get it” place, but that doesn’t mean I believe the concept is itself defective. I just have to suspend judgement (like, for the last 25 years of study or so.)
I actually believe that, too… but with a unique take: I think we all operate with a logical ontology … not in the sense of modus ponens, but in the sense that a memory space can be “logical”, meaning in this context, detached from physical memory.
Further, the construction of this logical ontology is, I think, partly culturally influenced, partly influenced by the species’ sensorium and equipment, partly influenced / constructed by something like Jeff Hawkins’ prediction-expectation memory model… constructed, bequeathed culturally, and in several additional related ways that also tune the idealized, logical ontology.
Memetics influences (in conjunction with native—although changeable—abilities in those memes’ host vectors) the genesis, maintenance, and evolution of this “logical ontology”, also. This is feed-forward and feed-backward. Memetics influences the logical ontology, which crystallizes into additional memetic templates that are kept, tuning the logical ontology further.
Once “established” (and it constantly evolves), this “logical” ontology is the “target” that a new human (say, while growing up and growing old) creates a virtual, phenomenological analog simulation of, and as the person gains experience, the person’s virtual-reality simulation of the world converges on something that is in some way consistently isomorphically related to this “logical”, idealized ontology.
So (and there is lots of neurology research that drives much of this, though it may all sound rather speculative) for me, there are TWO ontologies, BOTH of them constructed, and those are in addition to the entangled “outside world” quantum substrate, which is by definition inherently both sub-ontological (properly understood) and not sensible. (It is sub-ontological because of its nature, but it is interrogatable, giving feedback that helps form boundary conditions for the idealized logical ontology, or ontologies, in different species.)
I’ll add that I think the “logical ontology” is also species dependent, unsurprisingly.
I think you and I got off on the wrong foot; maybe you found my tone too declaratory when it should have been phrased more subjunctively. I’ll take your point. But since you obviously have a philosophy competence, you will know what the following means: one can say my views somewhat resemble an updated quasi-Kantian model, supplemented with the idea that noumena are the inchoate quantum substrate.
Or perhaps to correct that, in my model there are two “noumenal” realms: one is the “logical ontology” I referred to, a logical data structure, and the other is the one below that, and below ALL ontologies, which is the quantum substrate, necessarily “subontological.”
But my theory (there is more than I have just shot through quickly right now) handles species-relative qualia and the species-relative logical ontologies across species.
Remaining issues include: how qualia are generated, and the same question for the sense of self. I have ideas how to solve these, and the indexical first-person problem, connected with the basis problem. Neurology studies of default mode network behavior and architecture, its malfunction, and metacognition, epilepsy, etc., help a lot.
Think this is speculative? You should read neurologists these days, especially the better, data-driven ones. (Perhaps you already know, and you will thus see where I derive some of my supporting research.)
Anyway, always, always, I am trying to solve all this in the general case—first, across biological conscious species (a bird has a different “logical” ontology than people, as well as a different phenomenological reality that, to varying degrees of precision, “represents”, maps to, or has a recurrent resonance with that species’ logical ontology) -- and then trying to solve it for any general mind in mind space that has to live in this universe.
It all sounds like hand waving, perhaps. But this is scarcely an abstract. There are many puzzle pieces to the theory, and every piece of it has lots of specific research. It is all progressively falling together into an integrated system. I need geffen graphs and white boards to explain it, since it’s a whole theory, so I can’t squeeze it into one post. Besides, this is Bostrom’s show.
I’ll write my own book when the time comes—not saying it is right, but it is a promising effort so far, and it seems to work better, the farther I push it.
When it is far enough along, I can test it on a vlog, and see if people can find problems. If so, I will revise, backtrack, and try again. I intend to spend the rest of my life doing this, so discovered errors are just part of revision and refinement.
But first I have to finish, then present it methodically and carefully, so it can be evaluated by others. No space here for that.
Thanks for your previous thoughts, and your caution against sounding too certain. I am really NOT that certain, of course, of anything. I was just thinking out loud, as they say.
This week is pretty much closed… cheers.
Supervenience is not a claim like epiphenomenalism; it is a set of constraints that represent some broad naturalistic conclusions.
The first part of the sentence compares brains with current computers, the second part with theoretical possible artificial minds. Modern computers are less efficient than brains in pretty much every other respect.
Really? Because a 2kg laptop seems better than a brain at recognizing songs, identifying authors of writing samples, diagnosing diseases from symptoms, playing Jeopardy...
EDIT No, seriously, I don’t understand in what respects modern computer hardware is less efficient than brain hardware. I see how computer software is deficient, and I see some ways that brain hardware seems better (simulating neurons, healing, possibly heat dissipation), but it’s not at all obvious to me in what facets superior brain hardware causes superior results.
And the brain is better at writing songs, text, medical research…
In general, humans still beat machines at most mental tasks, except for ones which require monotony/speed.
My primary point is that the ‘processing power’ of the brain is estimated at somewhere between 10^15 and 10^19 flops. The lower bound has already been surpassed by supercomputers, but these computers are huge, orders of magnitude bigger and more power-hungry than brains, so brains are more efficient. And yes, many algorithms run better on small numbers of fast processors, but the best known machine learning algorithms are massively parallel.
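A back-of-envelope version of that efficiency claim, with assumed figures (roughly 20 W for a human brain, and on the order of 10 MW for a 2014-era petaflop-class supercomputer):

```python
brain_watts = 20
supercomputer_watts = 10e6
print(supercomputer_watts / brain_watts)  # ~5e5: hundreds of thousands of times more power drawn
```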
How much have ‘collective intelligences’ been improved by communication channels speeding up, from letters and telegrams to instant messaging?
I was quite interested in the distinction that Bostrom made in passing between intelligence and wisdom. What does everyone think about it?
Wisdom is goal-oriented thinking. Intelligence is the type of calculations that should go into optimization towards goals but are often misdirected. You can optimize steadily but weakly towards a goal, or you can make calculations and make decisions with high power, but shooting in all directions.
Neither Bostrom nor our discussion properly distinguishes between fluid and crystallized intelligence.
Howard Gardner’s concept of multiple intelligences focuses on acquiring crystallized intelligence:
(Gardner 2006 Multiple Intelligences—New Horizons p6)
The word wisdom implies experience and older age. Wisdom does not need to be universal knowledge. Years of experience in some restricted domains, together with high inter- and intrapersonal intelligence, build wisdom.
For our discussion on engineering seed AIs it is important to know that such an AI needs wisdom in engineering. That is: high fluid intelligence, high creativity (a concept different from intelligence), and a lot of practical experience and theoretical knowledge about hardware and software engineering.
What did you find least persuasive in this week’s reading?
Mea Culpa for falling behind.
Bostrom mentions (p55) that the increase in human population and organizational complexity since the Pleistocene suggests that we have created “superintelligence relative to a Pleistocene baseline”. And we have radically changed the world, in an accelerating fashion, from that baseline. Yet it seems plausible that Pleistocene tribes could have “slipped through the cracks”, perhaps in the Amazon or Papua New Guinea, and continued living much the same style of existence since then. Since this seems less plausible for what we conventionally refer to as superintelligence, this suggests there is some sort of dis-analogy that is being overlooked here.
Did you change your mind about anything as a result of this week’s reading? Did you learn anything interesting or surprising?
His point about organizations being good at parallel, independently verifiable tasks was interesting. It seems that much of human progress has consisted of
A very clever person working out how to do something once.
Another clever person working out how to break it down into 1000 steps that can be outsourced to ordinary people, enabling mass production at reasonable cost.
now we are beginning to replace the second step with
Another clever person works out how to code it up, enabling mass production at very low cost
but perhaps AGI will make the second step redundant. This would not only reduce “time to market”, but also allow a whole new class of innovations to be realized on a significant scale.
The way I might understand it is that you can be good at baking a cake yourself, or you can be good at leveraging other people’s talents to bake cakes. Similarly, a collective superintelligence is smart by virtue of figuring out how to solve a hard problem using moderately smart things.
We should not underestimate slow superintelligences. Our judiciary is also slow, so some of the actions we could take are also very slow.
Humanity could also be overtaken by a slow (and alien) superintelligence.
It does not matter if you would quickly see that things are going the wrong way. You could still slowly lose, step by step, your rights and power to act… (like slowly losing pieces in a chess game).
If strong entities in our world will be (or already are?) driven by poorly designed goals, for example “maximize profit”, then they could really be very dangerous to humanity.
I really don’t want to spoil our discussion with politics; rather, I would like to see rational discussion about all the existential threats which could arise from superintelligent beings/entities.
We should not underestimate any form, or any method, of our possible doom.
With big data coming, our society is more and more ruled by algorithms. And the algorithms are getting smarter and smarter.
Algorithms are not independent of the entities which have enough money or enough political power to use them.
BTW, Bostrom wrote (sorry, not in a chapter we have discussed yet) about possible perverse instantiation, which could happen due to a goal not well designed by the programmer. I am afraid that in our society it will be a manager or politician who designs the goal. (We have to find a way for philosophers and mathematicians to be involved as well.)
In my opinion the first (if not singleton) superintelligence will most probably be (or already is) a ‘mixed form’: some group of well-organized people (don’t forget lawyers) with a big database and a supercomputer.
The next stages after an intelligence explosion could take any other form.
What’s the difference between intelligence being ‘higher quality’, and being more ‘general’?
A higher quality intelligence than us might, among other things, use better heuristics and more difficult analytical concepts than we can, recognize more complex relationships than we can, evaluate its expected utility in a more consistent and unbiased manner than we can, envision more deeply nested plans and contingencies than we can, possess more control over the manner in which it thinks than we can, and so on.
A more general intelligence than us might simply have more hardware dedicated to general computation, regardless of what it does with that general ability.
I am trying to turn this concept of Quality Intelligence into something more precise.
Here are some items from history which most people will think of as improvements in quality intelligence.
I am thinking about quality in the context of collective intelligence. I do not find the concept of AGI as the intelligence of a single human useful for predicting a recursively improving system, for reasons we can look at later.
- Development of symbolic language from pictographs
- Development of the number zero
- Development of set theory
- Invention of calculus
- Development of Newton’s method for approximating functions
- Invention of Bayes’ Rule
- Matrix theory
- Closed-form solutions to many kinds of partial differential equations
- Procedural programming languages
- Approximations to vast numbers of functions using Newton’s method on computers (Quality or Quantity?)

These are advances in reasoning and improve intelligence quality.
I am not sure whether to chalk up the following to advances in quality intelligence, or not:

- Formulation of gravity
- Development of the periodic table
- General relativity
- Demonstration of nuclear fission
- Development of the transistor
- Discovery of DNA
- Development of the microprocessor (Quality or quantity, or both?)
- Mechanisms of transcription and translation within the cell
Certainly, figuring all of these things out about the real world advanced our ability to solve practical problems. I am inclined to consider the distinction between them and the discoveries in logic, computer programming and applied math somewhat arbitrary.
In practice for “quality” we normally have a yardstick for performance which is being improved continuously (e.g. success probability, quality of a solution to an optimization problem, ability to win in a game), while for generality there is often no such yardstick. At best this seems like a difference of degrees rather than a difference in kind though.
I don’t know if there is some more convincing distinction. I can’t think of any arguments that depend on a distinction.
Perhaps a more general intelligence can do well in a wider variety of circumstances, whereas a higher quality intelligence can do better in those circumstances, by seeing better solutions etc rather than just being faster.
Goertzel 2006, p43
Bostrom unsatisfactorily defines quality superintelligence in a self-referencing circle, as being “vastly qualitatively smarter” (p56). It would have been better to name the ability to solve problems of vastly higher complexity.
Intelligence of higher generality covers more domains.
You can improve in intelligence by generalizing (‘My intelligence improved in generality’), or by further investing in what you’re good at (‘My intelligence improved without improving in generality’). It seems like we could mean two different things by ‘generalizing’.
Suppose four skills exist, A,B,C,D; and my skill level can either be low (0), mediocre (1), high (2), or very high (3). If I start off with A=0, B=1, C=2, D=2, then ‘generalizing’ might mean improving A or B more than I improve C or D. Alternatively, ‘generalizing’ might mean improving in more skills, rather than in just one. On the former conception, ‘raise A to 2’ increases my intelligence’s generality more than ‘raise C to 3 and D to 3’; on the latter conception, the reverse is true. There’s plug-the-gaps generalization, where you try to get rid of your weak points; but there’s also spread-the-love generalization, where you try to find self-improvements that will impact your problem-solving ability in as diverse a range of problems as possible.
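A toy sketch of the two notions, using the made-up skill levels above:

```python
skills = {"A": 0, "B": 1, "C": 2, "D": 2}

def gap_plugging(before, after):
    # First notion: generality rises when your weakest skill rises.
    return min(after.values()) - min(before.values())

def breadth(before, after):
    # Second notion: generality rises with the number of skills improved.
    return sum(1 for k in before if after[k] > before[k])

raise_A = {**skills, "A": 2}
raise_C_D = {**skills, "C": 3, "D": 3}

print(gap_plugging(skills, raise_A), breadth(skills, raise_A))      # 1, 1
print(gap_plugging(skills, raise_C_D), breadth(skills, raise_C_D))  # 0, 2
```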
‘Qualitative intelligence improvements’ seems like a grab-bag for ‘all the kinds of intelligence improvements that we don’t usually measure in any simple and direct way’. We routinely talk about, e.g., the speed, number, and computing power of computers, in terms of simple numerical values; we don’t routinely do the same for computers’ language-processing abilities, so that goes in the ‘qualitative’ bag, at least for the moment. Improving in qualitative intelligence could take almost any form; it seems like a less natural category than ‘generality’.
We can make progress if we break down “Quality Intelligence” into component parts. I started working on it, but before I go first does anyone care to take a try?
Katja mentioned some sources to look into. Also, the field of evolutionary psychology, as well as textbooks on cognitive neuroscience, are divided up by the different kinds of intelligence that seem (to the psychologists and neuroscientists studying them) distinct. Goertzel has interesting perspectives.
Bostrom argues that the existence of people who are generally functional but have specific deficits—e.g. in social cognition or in the ability to recognize or hum simple tunes (congenital amusia) - demonstrates that these cognitive skills are performed with specialized neural circuitry, not just using general intelligence (p57). Do you agree? What are other cognitive skills that are revealed to have dedicated neural circuitry in this way?
Three types of information in the brain (and perhaps other platforms), and (coming soon) why we should care
Before I make some remarks, I would recommend Leonard Susskind’s (for those who don’t know him already – though most folks in here probably do – he is a physicist at the Stanford Institute for Theoretical Physics) very accessible 55-minute YouTube presentation called “The World as Hologram.” It is not as corny as it might sound, but is a lecture on the indestructibility of information, black holes (which is a convenient lodestone for him to discuss the physics of information and his debate with Hawking), types of information, and so on. He makes the point that, “…when one rules out the impossible, then what is left, however improbable, is the best candidate for truth.”
One interesting side point that comes out is his take on why computers that are more powerful have to shed more “heat”. Here is the talk: http://youtu.be/2DIl3Hfh9tY
Okay, my own remarks. One of my two or three favorite ways to “bring people in” to the mind-body problem is with some of the ideas I am now presenting. This will be in skeleton form tonight and I will come back and flesh it out more in coming days. (I promised last night to get something up tonight on this topic, and in case anyone cares and came back, I didn’t want to have nothing. I actually have a large piece of theory I am building around some of this, but for now, just the three kinds of information, in abbreviated form.)
Type One information is the sort dealt with, referred to, and treated in thermodynamics and entropy discussions. This is dealt with analytically in the Second Law of Thermodynamics. Here is one small start, but most will know it: en.wikipedia.org/wiki/Second_law_of_thermodynamics
Heat, energy, information, the changing logical positions within state spaces of entities or systems of entities, all belong to what I am calling category one information in the brain. We can also call this “physical” information. The brain is pumped—not closed—with physical information, and emits physical information as well.
Note that there is no semantic, referential, externally cashed-out content defined for physical, thermodynamic information, qua physical information. It is (though possibly thermodynamically open) an otherwise closed universe of discourse, needing nothing logically or ontologically external to characterize it analytically.
Type Two information in the brain (please assign no significance to my ordering, just yet) is functional. It is a carrier, or mediator, of causal properties, in functionally larger physical ensembles, like canonical brain processes. The “information” I direct attention to here must be consistent with (i.e. not violate principles of) Category One informational flow, phase space transitions, etc., in the context of the system, but we cannot derive Category Two information content (causal loop xyz doing pqr) from dynamical Category One data descriptions themselves.
In particular, imagine that we deny the previous proposition. We would need either an isomorphism from Cat One to Cat Two, or at least an “onto” function from Cat One to Cat Two (hope I wrote that right, it’s late). Clearly, Cat One configurations map to Cat Two configurations many-to-many; the mapping is neither an isomorphism nor many-to-one. (And one-to-many transformations from Cat One sets to Cat Two sets would be intuitively unsatisfactory if we were trying to build an “identity” or a transform to derive C2 specifics from C1 specifics.)
It would resemble replacing type-type identity with token-token identity, jettisoning both sides of the Leibniz Law bi-conditional (“Identity of indiscernibles” and “Indiscernibility of Identicals” --- applied with suitable limits so as not to sneak anything in by misusing sortal ranges of predicates or making category errors in the predications.)
Well, this is a stub, and because of my sketchy presentation, this might be getting opaque, so let me move on to the next information type, just to get all three out.
Type Three information is semantic, or intentional content, information. If I am visualizing very vibrantly a theta symbol, the intentional content of my mental state is the theta symbol on whatever background I visualize it against. A physical state of, canonically, Type Two information – which is a candidate, in a particular case, to be the substrate-instantiation or substrate-realization of this bundle of Type Three information (probably at least three areas of my brain, frequency coupled and phase offset locked, until a break in my concentration occurs) – is also occurring.
A liberal and loose way of describing Type Three info (that will raise some eyebrows because it has baggage, so I use it only under duress: temporary poverty of time and the late hour, to help make the notion easy to spot) is that a Type Three information instance is a “representation” of some element, concept, or sensible experience of the “perceived” ontology (of necessity, a virtual, constructed ontology, in fact, but for this sentence, I take no position about the status of this “perceived”, ostensible virtual object or state of affairs.)
The key idea I would like to encourage people to think about is whether the three categories of information are (a) legitimate categories, and mainly (b) whether they are collapsible, inter-translatable, or are just convenient shorthand level-of-description changes. I hope the reader will see, on the contrary, that one or more of them are NOT reducible to a lower one, and that this has lessons about mind-substrate relationships that point out necessary conceptual revisions—and also opportunities for theoretical progress.
It seems to me that reducing Cat Two to Cat One is problematic, and reducing Cat 3 to Cat 2 is problematic, given the usual standards of “identity” used in logic (e.g. i. Leibniz Law; ii. modal logic’s notions of identity across possible worlds, and so on.)
Okay, I need to clean this up. It is just a stub. Those interested should come back and see it better written, and expanded to include replies to what I know are expected objections, questions, etc. C2 and C3 probably sound like the “same old thing”, the m-b problem about experience vs. neural correlate. Not quite. I am trying to get at something additional, here. Hard without diagrams.
Also, I have to present much of this without any context… like presenting a randomly selected lecture from some course, without building up the foundational layers. (That is why I am putting together a YouTube channel of my own, to go from scratch to something like this after about 6 hours of presentation… then on to a theory of which this is one puzzle piece.)
Of course, we are here to discuss Bostrom’s ideas, but this “three information type” idea, less clumsily expressed, does tie straightforwardly to the question of indirect reach, and “kinds of better” that different superintelligences can embrace.
Unfortunately I will have to establish that conceptual link when I come back and clean this up, since it is getting so late. Thanks to those who read this far...
Two of those types are what type of “better” an intelligence can be, and the rest are concerned with implementation details, so it’s a bit confusing to read. Though one could replace “collective intelligence” with “highly parallel intelligence” and end up with three types of better.
Do you have further interesting pointers to material relating to this week’s reading?
The Superorganism is humanity’s best attempt so far at understanding collective intelligence made out of pretty dumb little creatures (ants, bees, wasps). It should be given more value for its insights than it currently is within the LW community.
I hate to spoil the party, but the author has redefined superintelligence. Whilst the possibilities are there to go further, deeper, and broader in scope, real superintelligence is raising the paradigms and boundary thresholds of current intelligence. To be super is to be on another level of cognitively boosted consciousness. If the mind is in toto resonating at level 1, then superintelligence has to resonate above that. The closest humanity has come to that is external, and wrapped in ancient Hindu mythologies of their gods and goddesses. Anything less is simply where we are now.
One thing going on here is that later discussion will be focused on recursively self-improving autonomous systems. We would like to know, for instance, whether and when software will be able to program other useful software.