Superintelligence via whole brain emulation
Most planning around AI risk seems to start from the premise that superintelligence will come from de novo AGI before whole brain emulation becomes possible. I haven’t seen any analysis that assumes both uploads-first and the AI FOOM thesis (Edit: apparently I fail at literature searching), a deficiency that I’ll try to get a start on correcting in this post.
It is likely possible to use evolutionary algorithms to efficiently modify uploaded brains. If so, uploads would likely be able to set off an intelligence explosion by running evolutionary algorithms on themselves, selecting for something like higher general intelligence.
Since brains are poorly understood, it would likely be very difficult to select for higher intelligence without causing significant value drift. Thus, setting off an intelligence explosion in that way would probably produce unfriendly AI if done carelessly. On the other hand, the modified upload would eventually become capable of figuring out how to improve itself without causing significant further value drift, and it may be possible to reach that point before too much value drift has already taken place. The expected amount of value drift can be decreased by having long generations between iterations of the evolutionary algorithm, giving the improved brains more time to figure out how to modify the algorithm so as to minimize further drift.
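To make the structure of the proposal concrete, here is a toy sketch in Python; everything in it (the trait-vector "minds", the intelligence proxy, the drift estimate) is a made-up placeholder, not a claim about how real emulations would be represented or evaluated:

```python
import random

# Toy stand-ins: a "mind" is just a vector of traits. None of this is a claim
# about how real emulations would be represented, measured, or modified.
def mutate(mind):
    return [t + random.gauss(0, 0.1) for t in mind]

def intelligence_proxy(mind):
    # Hypothetical score standing in for "something like general intelligence".
    return sum(mind)

def estimated_drift(mind, original):
    # Hypothetical estimate of divergence from the original mind's values.
    return sum(abs(a - b) for a, b in zip(mind, original))

def run_selection(original, generations=50, pop_size=20, offspring=10,
                  drift_penalty=1.0):
    population = [list(original) for _ in range(pop_size)]
    for _ in range(generations):
        candidates = [mutate(m) for m in population for _ in range(offspring)]
        # Select for the intelligence proxy while penalizing estimated drift.
        candidates.sort(
            key=lambda m: intelligence_proxy(m)
            - drift_penalty * estimated_drift(m, original),
            reverse=True,
        )
        population = candidates[:pop_size]
        # "Long generations" in the sense above would mean giving each
        # surviving generation subjective time here to audit and adjust this
        # very procedure before the next round, which a toy loop can't capture.
    return population[0]

print(run_selection([0.0] * 5))
```

The only point of the sketch is that the drift penalty and the generation length are explicit knobs; the argument above is about how carefully, and by whom, those knobs would be set.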
Another possibility is that such an evolutionary algorithm could be used to create brains that are somewhat, but not vastly, smarter than humans, hopefully with values not too divergent from ours. These enhanced uploads would then stop using the evolutionary algorithm and start using their intellects to research de novo Friendly AI, if that ends up looking easier than continuing to run the evolutionary algorithm without too much further value drift.
The strategies of using slow iterations of the evolutionary algorithm, or stopping it after not too long, require coordination among everyone capable of making such modifications to uploads. Thus, it seems safer for whole brain emulation technology to be either heavily regulated or owned by a monopoly, rather than being widely available and unregulated. This closely parallels the AI openness debate, and I’d expect people more concerned with bad actors relative to accidents to disagree.
With de novo artificial superintelligence, the overwhelmingly most likely outcomes are the optimal achievable outcome (if we manage to align its goals with ours) and extinction (if we don’t). But uploads start out with human values, and when creating a superintelligence by modifying uploads, the goal would be to not corrupt them too much in the process. Since its values could get partially corrupted, an intelligence explosion that starts with an upload seems much more likely to result in outcomes that are both significantly worse than optimal and significantly better than extinction. Since human brains also already have a capacity for malice, this process also seems slightly more likely to result in outcomes worse than extinction.
The early ways to upload brains will probably be destructive, and may be very risky. Thus the first uploads may be selected for high risk-tolerance. Running an evolutionary algorithm on an uploaded brain would probably involve creating a large number of psychologically broken copies, since the average change to a brain will be negative. Thus the uploads that run evolutionary algorithms on themselves will be selected for not being horrified by this. Both of these selection effects seem like they would select against people who would take caution and goal stability seriously (uploads that run evolutionary algorithms on themselves would also be selected for being okay with creating and deleting spur copies, but this doesn’t obviously correlate in either direction with caution). This could be partially mitigated by a monopoly on brain emulation technology. A possible (but probably smaller) source of positive selection is that currently, people who are enthusiastic about uploading their brains correlate strongly with people who are concerned about AI safety, and this correlation may continue once whole brain emulation technology is actually available.
Assuming that hardware speed is not close to being a limiting factor for whole brain emulation, emulations will be able to run at much faster than human speed. This should make emulations better able to monitor the behavior of AIs. Unless we develop ways of evaluating the capabilities of human brains that are much faster than giving them time to attempt difficult tasks, running evolutionary algorithms on brain emulations could only be done very slowly in subjective time (even though it may be quite fast in objective time), which would give emulations a significant advantage in monitoring such a process.
Although there are effects going in both directions, it seems like the uploads-first scenario is probably safer than de novo AI. If this is the case, then it might make sense to accelerate technologies that are needed for whole brain emulation if there are tractable ways of doing so. On the other hand, it is possible that technologies that are useful for whole brain emulation would also be useful for neuromorphic AI, which is probably very unsafe, since it is not amenable to formal verification or being given explicit goals (and unlike emulations, they don’t start off already having human goals). Thus, it is probably important to be careful about not accelerating non-WBE neuromorphic AI while attempting to accelerate whole brain emulation. For instance, it seems plausible to me that getting better models of neurons would be useful for creating neuromorphic AIs while better brain scanning would not, and both technologies are necessary for brain uploading, so if that is true, it may make sense to work on improving brain scanning but not on improving neural models.
But what research improves brain imaging but not DL… One thing to point out about whole brain emulation vs ‘de novo’ AI is that it may be, in practice, nearly impossible to get WBEs without having already, much earlier, kickstarted ‘de novo’ AI.
If you can successfully scan and run a single whole brain, you got there by extensive brain imaging and brain scanning of much smaller chunks of many brains, and it seems like there is a lot of very transferable knowledge from the structure and activities of a human brain to artificial neural networks, which I dub "brain imitation learning". Not only do ANNs turn out to have fairly similar activation patterns to human brains in some respects (primarily visual cortex stuff), the human brain's activation patterns encode a lot of knowledge about how visual representations work which can be used to learn & generalize. (A particularly interesting example from this month is "Self-Supervised Natural Image Reconstruction and Rich Semantic Classification from Brain Activity", Gaziv et al 2020.) You might consider this a version of the pretraining paradigm or lexical hypothesis: the algorithms of general intelligence, and world knowledge, are encoded in the connectivity and activation patterns of a human brain, so training on large corpuses of such data to imitate the connectivity & activation patterns will provide an extremely powerful prior/initialization à la GPT-3 pretraining on large text datasets.
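For concreteness, here is a minimal sketch of what such a pretraining objective might look like, assuming a hypothetical dataset of (stimulus, recorded activation) pairs; the shapes, architecture, and loss are purely illustrative stand-ins, not a description of Gaziv et al 2020 or any existing system:

```python
import torch
import torch.nn as nn

# Hypothetical dataset: stimuli paired with recorded neural activation vectors
# (e.g. voxel responses from visual cortex). Random tensors stand in for data.
images = torch.randn(256, 3, 64, 64)        # stand-in stimuli
brain_activity = torch.randn(256, 512)      # stand-in recorded activations

encoder = nn.Sequential(
    nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
    nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
    nn.Flatten(),
    nn.Linear(64 * 16 * 16, 512),
)

optimizer = torch.optim.Adam(encoder.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

# Pretraining objective: predict the brain's activation pattern from the stimulus.
for epoch in range(10):
    optimizer.zero_grad()
    predicted = encoder(images)
    loss = loss_fn(predicted, brain_activity)
    loss.backward()
    optimizer.step()
```

The encoder could then be fine-tuned on downstream tasks like any pretrained model; whether brain-activity prediction actually supplies a useful prior is exactly the empirical question here.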
So, it is entirely possible that by the time you get to BCIs or whole-brain scanning apparatuses, these are providing high-volume data embeddings or structural/architectural constraints which help push deep learning approaches over the finish line to AGI by providing informative priors & meta-learning capabilities by conditioning on <100% data from many brains. (In fact, if you believe this won’t happen, you have to explain what on earth is being done with all this extremely expensive data for decades on end, as it slowly ramps up from scanning insect-sized chunks to full monkey brains before finally an entire human brain is scanned 100% & they flip the giant red switch to make Mr John Smith, test subject #1918, wake up inside a computer. What is everyone doing before that?)
Whatever these DL systems may be, they won’t be a single specific person, and they won’t come with whatever safety guarantees people think an upload of Mr John Smith would come with, but they will come years or decades before.
All that is indeed possible, but not guaranteed. The reason I was speculating that better brain imaging wouldn’t be especially useful for machine learning in the absence of better neuron models is that I’d assume that the optimization pressure that went into the architecture of brains was fairly heavily tailored to the specific behavior of the neurons that those brains are made of, and wouldn’t be especially useful relative to other neural network design techniques that humans come up with when used with artificial neurons that behave quite differently. But sure, I shouldn’t be too confident of this. In particular, the idea of training ML systems to imitate brain activation patterns, rather than copying brain architecture directly, is a possible way around this that I hadn’t considered.
The obvious solution would be to use cryopreserved brains. Perhaps this would be necessary anyway, because of all the moral and legal problems with slicing up a living person's brain to take SEM images and map the connectome. This suggests that an extremely effective EA cause would be to hand out copies of Bostrom's Superintelligence at Cryonics conventions.
It’s not clear whether the cryonics community would be more or less horrified by defective spurs than the average person, though. Perhaps EAs could request to be revived early, at increased risk of information-theoretic death, if digital uploading is attempted and if self-modifying AI is a risk. Perhaps the ideal would be to have a steady stream of FAI-concerned volunteers in the front of the line, so that the first successes are likely to be cautious about such things. Ideally, we wouldn’t upload anyone not concerned with FAI until we had a FAI in place, but that may not be possible if there is a coordination problem between several groups across the planet. A race to the bottom seems like a risk, if Moloch has his say.
I ordinarily wouldn't make such a minor nitpick (because of this), but it might be an important distinction, so I'll make an exception: People who worry about FAI are likely to also be enthusiastic about uploading, but I'm not sure if the average person who is enthusiastic about uploading is worried about FAI. For most people, "AI safety" means self driving cars that don't hit people.
Right, that’s why I said it would probably be a smaller source of selection, but the correlation is still strong, and goes in the preferred direction.
Ah, understood. We’re on the same page, then.
It seems to me that embryo selection could be one way to increase the intelligence of the first generation of uploads without (initially) producing a great deal of defective minds of the sort that you mention in the article. WBE may take several decades to achieve, the embryos will take some time to grow into adult humans anyway, and it seems to me that a great deal of the defective minds in your scenario wouldn’t be created because we’re selecting for genes that go on to produce minds, as opposed to selecting directly for mind designs. (Please do correct me if you were only talking about selecting for genes as opposed to mind designs.)
It’s worth noting that embryo selection seems to me a much less extreme version of intelligence amplification than what you have suggested, and even with embryo selection it seems that we run into some old questions about how IQ and rationality are related. As argued elsewhere, it may be that finding ways to amplify intelligence without understanding this relation between intelligence and rationality could actually increase risk, as opposed to mitigating it.
As a side note, something I’ve always wondered about is how unusually long periods of subjective time and potential relative social isolation would affect the mental health of uploads of modern humans.
Yes, embryo selection and other non-WBE intelligence amplification techniques would be useful in similar ways as applying evolutionary algorithms to emulations. I’d expect non-WBE intelligence amplification to typically have much lower risks, but also smaller effect size, and would be useful independently of the eventual arrival of WBE technology.
I’m fairly confident that intelligence enhancement would be good for our chances of future survival. I’m not convinced by the case for fast economic growth increasing risk much, and FAI is probably a more IQ-intensive problem than AGI is, so intelligence enhancement would likely accelerate FAI more than AGI even if it doesn’t result in increased rationality as well (although it would be a bigger plus if it did).
I doubt it would be a problem. Forager bands tended to be small, and if hardware to run uploads on is not the limiting factor to first creating them, then it will be feasible to run small groups of uploads together as soon as it is feasible to run a single upload.
I realize that Luke considered economic growth a crucial consideration, but I was really relying on Keith Stanovich’s proposed distinction between intelligence and rationality. It seems possible that you can increase someone’s raw capability without making them reflective enough to get all of the important answers right. This would mean that rationality is not just a bonus. On the other hand, these things might go hand in hand. It seems worth investigating to me and relevant to comparing ‘AI-first’ and ‘IA-first’ risk mitigation strategies.
Also, I think the language of ‘critical levels’ is probably better than the language of ‘acceleration’ in this context. It seems safe to assume that FAI is a more difficult problem than AGI at this point, but I don’t think it follows only from that that IA will accelerate FAI more than it accelerates AGI. That depends on many more facts, of which problem difficulty is just one. I have no problem with ceteris paribus clauses, but it’s not clear what we’re holding equal here. The identity and size of the party in control of the IA technology intuitively seems to me like the biggest consideration besides problem difficulty. A large part of why I consider IA-first an alternative worth thinking about is not because I think it’s likely to differentially affect technological development in the obvious way, but because we may be below some critical threshold of intelligence necessary to build FAI and thus IA-first would be preferable to AI-first because AI-first would almost certainly fail. This also is not a sure thing and I think also warrants investigation.
Forgive me if I’m starting to ramble but, something I find interesting about this is, unless you have other reasons to reject the relevance of this point, it seems to me you have also implied that, if there is no hardware overhang, and WBE begins as a monopoly in the way that nuclear weapons began, then the monopolist may have to choose between uploading a lone human, psychological effects be damned, and delaying the use of an IA technique that is actually available to them. It seems that you could allow the lone emulation to interact with biological humans, and perhaps even ‘pause’ itself so that it experiences a natural amount of subjective time during social interaction, but if you abuse this too much for the sake of maintaining the emulation’s mental health, then you sacrifice the gains in subjective time. Sacrificing subjective time is perhaps not so bad as it might seem because speed intelligence can be useful for other reasons, some of which you outlined in the article. Nonetheless, this seems like a related problem where you have to ask yourself what you would actually do with an AI that is only good for proving theorems. There often seems to be a negative correlation between safety and usefulness. Still, I don’t know what I would choose if I could choose between uploading exactly one extraordinary human right now and not doing so. My default is to not do it and subsequently think very hard, because that’s the reversible decision, but that can’t be done forever.
Even if intelligence doesn’t help at all for advancing FAI relative to AGI except via rationality, it still seems pretty unlikely that intelligence amplification would hurt, even if it doesn’t lead to improvements in rationality. It’s not like intelligence amplification would decrease rationality.
I disagree. The hypothesis that it is literally impossible to build FAI (but not AGI) without intelligence amplification first is merely the most extreme version of the hypothesis that intelligence amplification accelerates FAI relative to AGI, and I don’t see why it would be more plausible than less extreme versions.
If you can run an emulation at much faster than human speed, then you don’t have a hardware overhang. The hardware to run 10 emulations at 1⁄10 the speed should cost about the same amount. If you really have a hardware overhang, then the emulations are running slower than humans, which also decreases how dangerous they could be. Alternatively, it’s possible that no one bothers running an emulation until they can do so at approximately human speed, at which point the emulation would be able to socialize with biological humans.
I guess I would ask: Considering that there are probably a great many discernible levels of intelligence above that of our own species, and that we were not especially designed to build FAI, do you have reasons to think that the problem difficulty and human intelligence are within what seems to me to be a narrow range necessary for success?
To expand, I agree that we can imagine these hypotheses on a continuum. I feel that I misunderstood what you were saying so that I don’t stand behind what I said about the language, but I do have something to say about why we might consider the most extreme hypothesis more plausible than it seems at first glance. If you just imagine this continuum of hypotheses, then you might apply a sort of principle of indifference and not think that the critical level for FAI being far above biological human intelligence should be any more plausible than the many other lower critical levels of intelligence that are possible, as I think you are arguing. But if we instead imagine all possible pairs of FAI problem difficulty and intelligence across some fully comprehensive intelligence scale, and apply a sort of principle of indifference to this instead, then it seems like it would actually be a rather fortunate coincidence that human intelligence was sufficient to build FAI. (Pairs where actual human intelligence far exceeds the critical level are ruled out empirically.) So I think trying to evaluate plausibility in this way depends heavily on how we frame the issue.
Well, for me it depends on the scope of your statement. If all goes well, then it seems like it couldn’t make you less rational and could only make you more intelligent (and maybe more rational thereby), but if we assume a wider scope than this, then I’m inclined to bring up safety considerations about WBE (like, maybe it’s not our rationality that is primarily keeping us safe right now, but our lack of capability; and other things), although I don’t think I should bring this up here because what you’re doing is exploratory and I’m not trying to argue that you shouldn’t explore this.
Good point; I didn’t think about this in enough detail.
Humans have already invented so many astounding things that it seems likely that for most things that are possible to build in theory, the only thing preventing humans from building them is insufficient time and attention.
Have you read The Age of Em? Robin Hanson thinks that mind uploading is likely to happen before de novo AI, but also the reasons why that’s the case mean that we won’t get much in the way of modifications to ems until the end of the Em era.
(That is, if you can just use ‘evolutionary algorithms’ to muck around with uploads and make some of them better at thinking, it’s likely you understand intelligence well enough to build a de novo AI to begin with.)
I've read Age of Em. IIRC, Robin argues that it will likely be difficult to get a lot of progress from evolutionary algorithms applied to emulations because brains are fragile to random changes, and we don't understand brains well enough to select usefully nonrandom changes, so all changes we make are likely to be negative.
But brains actually seem to be surprisingly resilient, given that many people with brain damage or deformed brains are still functional, including such dramatic changes as removing a hemisphere of the brain, and there are already known ways to improve brain performance in narrow domains with electric stimulation (source), which seems similar. So it seems fairly likely to me that getting significant improvements from evolutionary algorithms is possible.
Also, I talked with Robin about this, and he didn’t actually seem very confident about his prediction that evolutionary algorithms would not be used to increase the intelligence of emulations significantly, but he did think that such enhancements would not have a dramatic effect on em society.
Hanson makes so many assumptions that defy intuition. He's talking about a civilization with the capacity to support trillions of individuals, in which these individuals are essentially disposable and can be duplicated at a moment's notice, and he doesn't think evolutionary pressures are going to come into play? We've seen random natural selection significantly improve human intelligence in as few as tens of generations. With Ems, you could probably cook up tailor-made superintelligences in a weekend using nothing but the right selection pressures. Or, at least, I see no reason to be confident in the converse proposition.
He claims we don’t know enough about the brain to select usefully nonrandom changes, yet assumes that we’ll know enough to emulate them to high fidelity. This is roughly like saying that I can perfectly replicate a working car but I somehow don’t understand anything about how it works. What about the fact that we already know some useful nonrandom changes that we could make, such as the increased dendritic branching observable in specific intelligence-associated alleles?
It doesn’t matter. Deepmind is planning to have a rat-level AI before the end of 2017 and Demis doesn’t tend to make overly optimistic predictions. How many doublings is a rat away from a human?
He actually does think evolutionary pressures are going to be important, and in fact, in his book, he talks a lot about which directions he expects ems to evolve in. He just thinks that the evolutionary pressures, at least in the medium-term (he doesn’t try to make predictions about what comes after the Em era), will not be so severe that we cannot use modern social science to predict em behavior.
Source? I'm aware of the Flynn effect, but I was under the impression that the consensus was that it is probably not due to natural selection.
To emulate a brain, you need to have a good enough model of neurons and synapses, be able to scan brains in enough detail, and have the computing power to run the scan. Understanding how intelligent behavior arises from the interaction of neurons is not necessary.
If that actually happens, I would take that as significant evidence that AGI will come before WBE. I am kind of skeptical that it will, though. It wouldn’t surprise me that much if Deepmind produces some AI in 2017 that gets touted as a “rat-level AI” in the media, but I’d be shocked if the claim is justified.
“Random natural selection” is almost a contradiction in terms. Yes, we’ve seen dramatic boosts in Ashkenazi intelligence on that timescale, but that’s due to very non-random selection pressure.
Mutations occur randomly, and environmental pressures perform selection on them.
Obviously, but “natural selection” is the non-random part of evolution. Using it as a byword for evolution as a whole is bad terminology.
Fair enough. My lazy use of terminology aside, I’m pretty sure you could “breed” an Em via replication-with-random-variation followed by selection according to performance-based criteria.
Is the idea that you cannot scan a brain if you don't know what needs to be scanned, and that's why you need a model of neurons before you can upload? That you think "scanning everything" and waiting to figure out how it works before emulating the scanned mind is impractical?
No. Scanning everything and then waiting until we have a good enough neuron model might work fine; it’s just that the scan wouldn’t give you a brain emulation until your neuron model is good enough.
got it!
I wrote a summary of Hanson's The Age of Em, in which I focus on the bits of information that may be policy-relevant for effective altruists. For instance, I summarize what Hanson says about em values and also have a section about AI safety.
If the artificial intelligence from emulation is accomplished through tweaking an emulation and/or piling on computational resources, why couldn’t it be accomplished before we start emulating humans?
Other primates, for example. Particularly given the destructive read and the ethics of algorithmic tweaks, animal testing will surely precede human testing. To the extent a human brain is just a primate brain with more computing power, another primate with better memory and clock speed should serve almost as effectively.
What about other mammals with culture and communication, like whales or dolphins?
Something not a mammal at all, like Great Tits?
I consider modified uploads much more likely to result in outcomes worse than extinction. I don’t even know what you could be imagining when you talk about intermediate outcomes, unless you think a ‘slight’ change in goals would produce a slight change in outcomes.
My go-to example of a sub-optimal outcome better than death is (Spoilers!) from Friendship is Optimal—the AI manipulates everyone into becoming virtual ponies and staying under her control, but otherwise maximizes human values. This is only possible because the programmer made an AI to run a MMORPG, and added the goal of maximizing human values within the game. You would essentially never get this result with your evolutionary algorithm; it seems overwhelmingly more likely to give you a mind that still wants to be around humans and retains certain forms of sadism or the desire for power, but lacks compassion.
It depends on what sorts of changes. Slight changes in which subgoals are included in the goal result in much larger changes in outcomes as optimization power increases, but slight changes in how much weight each subgoal is given relative to the others can even result in smaller changes in outcomes as optimization power increases, if it becomes possible to come close to maxing out each subgoal at the same time. It seems plausible that one could leave the format in which goals are encoded in the brain intact while getting a significant increase in capabilities, and that this would only cause the kinds of goal changes that can lead to results that are still not too bad according to the original goal.
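As a toy numeric illustration of that distinction (the subgoals and outcome space here are made up purely for illustration):

```python
import numpy as np

# Two toy subgoals over a 2-D "outcome space"; both can be nearly maxed out at once.
def subgoal_a(x, y):
    return -(x - 1.0) ** 2          # prefers outcomes with x near 1

def subgoal_b(x, y):
    return -(y - 1.0) ** 2          # prefers outcomes with y near 1

def best_outcome(weight_a, weight_b, resolution=201):
    # Brute-force "optimizer": search a grid for the outcome maximizing the
    # weighted sum of subgoals.
    xs = np.linspace(-2, 2, resolution)
    X, Y = np.meshgrid(xs, xs)
    total = weight_a * subgoal_a(X, Y) + weight_b * subgoal_b(X, Y)
    i = np.unravel_index(np.argmax(total), total.shape)
    return float(X[i]), float(Y[i])

# Changing the relative weights barely moves the chosen outcome, because both
# subgoals can be satisfied simultaneously:
print(best_outcome(1.0, 1.0))   # ~(1.0, 1.0)
print(best_outcome(1.0, 0.3))   # still ~(1.0, 1.0)

# Dropping a subgoal entirely is different: the optimizer no longer cares about
# y at all, so that dimension of the outcome is settled arbitrarily.
print(best_outcome(1.0, 0.0))   # x ~ 1.0, y wherever the tie-break lands
```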
That seems kind of ludicrous if we're talking about empathy and sadism.
Most pairs of goals are not directly opposed to each other.
Not quite related, but can I pet peeve a bit here?
Whenever I hear the "we didn't invent an airplane by copying a bird so we won't invent AI by copying a brain" line I always want to be like "actually we farmed part of the plane's duties out to the brain of the human who pilots it. AIs will be similar. They will have human operators, to which we will farm out a lot of the decision making. AlphaGo doesn't choose who it plays, some nerd does."
Like, why would AI be fully autonomous right off the bat? Surely we can let it use an operator for a sanity check while we get the low-hanging fruit out of the way.
I think that super-AI via uploading is an inherently safe solution (https://en.wikipedia.org/wiki/Inherent_safety). It could still go wrong in many ways, but that is not its default mode.
Even if it kills all humans, there will be one human who survives.
Even if his values evolve, it will be a natural evolution of human values.
Since most human beings don't like to be alone, he would create new friends in the form of human simulations. So even the worst cases are not as bad as a paperclip maximizer.
It is also a feasible plan consisting of many clear steps, one of which is choosing and educating the right person for uploading. He should be educated in ethics, math, rationality, brain biology, etc. I think he is reading LW and this comment))
This idea could be upgraded to be even safer. One way is to upload a group of several people, who would be able to keep each other in check and also produce a collective intelligence together.
Another idea is to break the super-AI into a center and a periphery. In the center we put the uploaded mind of a very rational human, who makes important decisions and keeps the values, and in the periphery we put many Tool AIs, which do a lot of the dirty work.
Unless it self-modifies to the point where you’re stretching any meaningful definition of “human”.
Again, for sufficiently broad definitions of “natural evolution”.
If we're to believe Hanson, the first (and possibly only) wave of human em templates will be the most introverted workaholics we can find.
Its evolution could go wrong from our point of view, but the older generation always thinks that the younger ones are complete bastards. When I say "natural evolution" I mean a complex evolution of values based on their previous state and new experience, which is a rather typical situation for any human being whose values evolve from childhood under the influence of experiences, texts, and social circle.
This idea is very different from Hanson's em world. Here we deliberately upload only one human, who is trained to become the core of a future friendly AI. He knows that he is going to make some self-improvements, but he also knows the dangers of unlimited self-improvement. His loved ones are still in the flesh. He is trained to be not a slave, as in Hanson's em world, but a wise ruler.