Other agents are dangerous to me to the extent that (1) they don’t share my values/goals, and (2) they are powerful enough that in pursuing their own goals, they have little need to take game theoretic consideration of my values. ANN based AI will be similar to other humans in (1), and regarding (2) they are likely to be more powerful than humans since they’ll be running on faster, more capable hardware than human brains, and probably have better algorithms as well.
Schmidhuber’s best case scenario for superintelligence is that they take no interest in humanity, colonize space and leave us to survive on Earth. What’s your best case scenario? Does it seem not much worse to you than the best case scenario for FAI (i.e., if humanity could coordinate to solve the cosmic tragedy of the commons problem and wait until we know how to safely build an AGI that shares some compromise, e.g., weighted average, of all human values)?
Other agents are dangerous to me to the extent that (1) they don’t share my values/goals, and (2) they are powerful enough that in pursuing their own goals, they have little need to take game theoretic consideration of my values. ANN based AI will be similar to other humans in (1), and regarding (2) they are likely to be more powerful than humans since they’ll be running on faster, more capable hardware than human brains, and probably have better algorithms as well.
Your points 1 and 2 are true but only in degrees. Humans vary significantly in terms of altruism (1) and power (2). Hitler—from what I’ve read—is a good example of a powerful, non-altruistic human. Martin Luther King and Gandhi are examples of highly altruistic humans (the first patterned directly after Jesus, the second patterned after Jesus and Buddha). Now, it could be the case that these two were more selfish than they appear at first, because they were motivated by reward in the afterlife. Well, perhaps to a degree, but that line of argument mostly fails as a complete explanation (and even if true, could also potentially become a strategy).
Finally, brain inspired ANNs != human brains. We can take inspiration from the best examples of human capabilities and qualities while avoiding the worst, and then extrapolate to superhuman dimensions.
Altruism can be formalized by group decision/utility functions, where the agent’s utility function implements some approximation of the ideal aggregate of some vector of N individual functions (à la mechanism design, and Clarke tax style policies in particular).
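To make the Clarke tax reference concrete, here is a minimal sketch of the Clarke pivot rule for a choice among a few discrete options. The function name `clarke_pivot`, the option labels, and the numbers are illustrative assumptions and not something proposed in this thread. The rule picks the option with the highest sum of reported values, then charges each agent the loss that their presence imposes on everyone else.

```python
from typing import Dict, List, Tuple


def clarke_pivot(valuations: List[Dict[str, float]]) -> Tuple[str, List[float]]:
    """Pick the option with the highest summed reported value and compute
    each agent's Clarke (pivot) tax.

    valuations[i][option] is agent i's reported value for that option.
    All names here are hypothetical; this is a toy illustration.
    """
    options = list(valuations[0].keys())

    def best(agents: List[Dict[str, float]]) -> str:
        # Option maximizing the sum of reported values over the given agents.
        return max(options, key=lambda o: sum(v[o] for v in agents))

    chosen = best(valuations)
    taxes = []
    for i in range(len(valuations)):
        others = valuations[:i] + valuations[i + 1:]
        # Welfare the others would get if agent i had not taken part...
        without_i = sum(v[best(others)] for v in others)
        # ...minus the welfare the others get under the actual choice.
        with_i = sum(v[chosen] for v in others)
        taxes.append(without_i - with_i)
    return chosen, taxes


# Illustrative numbers only: agent 0 swings the outcome, so only agent 0 pays.
reports = [
    {"park": 30.0, "road": 0.0},
    {"park": 0.0, "road": 20.0},
    {"park": 0.0, "road": 5.0},
]
chosen, taxes = clarke_pivot(reports)
print(chosen, taxes)
```

Only the pivotal agent pays, which is what makes truthful reporting a dominant strategy under this rule. An agent whose utility function approximates the aggregate would be optimizing the summed values directly rather than its own entry.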
What’s your best case scenario?
We explore AGI mind space and eventually create millions and then billions of super-wise/smart/benevolent AIs. This leads to a new political system—perhaps based on fast cryptoprotocols and new approximations of ideal group decision policies from mechanism design. Operating systems as we know them are replaced with AIs which eventually become something like mental twins, friends, trusted advisers, and political representatives. The main long term objective of the new AI governance is universal resurrection—implemented perhaps in 100 years or so by turning the moon into a large computing facility. Well before that, existing humans begin uploading into the metaverse.
The average person alive today becomes a basically immortal sim but possesses only upper human intelligence. Those who invested wisely and get in at the right time become entire civilizations unto themselves (gods) - billions or trillions of times more powerful. The power/wealth gap grows without bound. It’s like Jesus said: “To him who has is given more, and from him who has nothing is taken all.”
However, allocating all of future wealth based on however much wealth someone had on the eve of the singularity is probably sub-optimal. The best case would probably also involve some sort of social welfare allocation policy, where the AIs spend a bunch of time evaluating and judging humans to determine a share of some huge wealth allocation. All the dead people who are recreated as sims will need wealth/resources, so decisions need to be made concerning how much wealth each person gets in the afterlife. There are very strong arguments for the need for wealth/money as an intrinsic component of any practical distributed group decision mechanism.
Perhaps the strongest argument against UFAI likelihood is sim-anthropic: the benevolent posthuman civs/gods (re)create far more historical observers than the UFAIs, as part of universal resurrection. Of course, this still depends on us doing everything in our power to create FAI.
Thanks for the clear explanation of your views. What do you see as the main obstacles to achieving this?
Martin Luther King and Gandhi are examples of highly altruistic humans
I’m really worried that mere altruism isn’t enough. If the other agent is more powerful, any subtle differences in values or philosophical views between myself and the other agent could be disastrous, as they optimize the universe according to their values/views which may turn out to be highly suboptimal for me. Consider the difference between average and total utilitarianism, or different views on whether we should assume the universe must be computable, what prior/measure to put on the multiverse, or how to deal with anthropics, e.g. simulation argument.
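As a toy illustration of how far apart two seemingly close value systems can land, the sketch below compares total and average utilitarianism on two hypothetical worlds. The welfare numbers and the names `small_rich` and `large_modest` are made up for the example.

```python
from typing import Callable, Dict, List


def total_utility(welfare: List[float]) -> float:
    # Total utilitarianism: sum welfare over everyone who exists.
    return sum(welfare)


def average_utility(welfare: List[float]) -> float:
    # Average utilitarianism: mean welfare per person.
    return sum(welfare) / len(welfare)


def preferred(rule: Callable[[List[float]], float],
              worlds: Dict[str, List[float]]) -> str:
    # Name of the world that the given rule ranks highest.
    return max(worlds, key=lambda name: rule(worlds[name]))


# Hypothetical worlds: few people who are very well off vs. many who are modestly well off.
worlds = {
    "small_rich": [10.0] * 10,     # total 100, average 10
    "large_modest": [2.0] * 100,   # total 200, average 2
}

print(preferred(total_utility, worlds))    # large_modest
print(preferred(average_utility, worlds))  # small_rich
```

A powerful optimizer holding either rule would push toward a very different universe than one holding the other, even though both agents count as altruistic.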
But I don’t want them to blindly accept my current values/views either, since they may be wrong. Humans seem to have some sort of general problem solving / error correcting algorithm which we call “doing philosophy”, and maybe we can teach that to ANN-based AI more easily than we could program it by hand, so in that sense maybe ANN-based AI actually could be less “nasty” than other approaches.
To me, achieving a near optimal outcome is difficult but not impossible, given enough time, but I don’t see how to get the time. The current leaders in ANN-based AI don’t seem to appreciate the magnitude of the threat, or the difficulty of solving the problem. (Besides Schmidhuber, who apparently does see the threat but is ok with it? Now that Bostrom’s book has been out for a year and presumably most people who are ever going to read it have already read it, I’m not sure what’s going to change their minds.) Perhaps ANN-based AI could be considered more “nasty” in this sense because it seems easier to be complacent about it, thinking that when the time comes, we’ll just teach them our values, whereas trying to design a de novo AGI brings up a bunch of issues like exactly what utility function to give it, or what decision theory or prior, which perhaps makes it easier to see the larger problem.
(The other main obstacle I see is the strong economic and psychological incentives to achieve AGI ASAP, but that’s the same whether we’re talking about ANN-based AI or other kinds of AI.)
Thanks for the clear explanation of your views. What do you see as the main obstacles to achieving this?
My optimistic scenario above assumes not only that we solve the technical problems but also that the current political infrastructure doesn’t get in the way—and in fact just allows itself to be dissolved.
In reality, of course, I don’t think it will be that simple.
There are technical problems like value learning, and then there are socio-political problems. AGI is likely to cause systemic unemployment and thus a large recession which will force politics to get involved. The ideal scenario may be a shift to increased progressive/corporate tax combined with UBI or something equivalent. In the worst cases we have full scale depression and political instability.
Related to that will be the legal decisions concerning rights for AGI (or lack thereof). AGI rights seem natural, but they will also be difficult to enforce. AGI will be hard to define, and a poor definition can easily lead to strange perverse incentives.
Then there are the folks who don’t believe in machine consciousness, or uploading, and basically will view all this as a terrible disaster. It’s probably good that we’ve raised AI risk awareness amongst academics and elites, but AI may now have mainstream branding issues.
One question/concern I have been monitoring for a while now is the response from conservative Christianity. It’s not looking good. Google “Singularity image of the beast” to get an idea.
In terms of risk, it could be that most of the risk lies in an individual or group using AGI to take over the world, not from failure of value learning itself. Many corporations are essentially dictatorships or nearly so—there is no reason for a selfish CEO to encode anyone else’s values into the AGI they create. Human risk rather than technical.
I’m really worried that mere altruism isn’t enough. If the other agent is more powerful, any subtle differences in values or philosophical views between myself and the other agent could be disastrous, as they optimize the universe according to their values/views which may turn out to be highly suboptimal for me.
You already live in a world filled with a huge sea of agents which have values different than your own. We create new generations of agents all the time, and eventually infuse them with power and responsibility. We don’t need to achieve ‘perfect’ value alignment (and that is probably incoherent regardless). We need only to align value distributions.
That being said, I do believe that the AGI we create will be far more aligned with our values than our children are.
The real fear is perhaps that of being left behind. The only solution to that really is to use AGI to accelerate the development of uploading.
The current leaders in ANN-based AI don’t seem to appreciate the magnitude of the threat, or the difficulty of solving the problem.
From what I see, they have a wide spectrum of opinions. Schmidhuber is also unusual in that—for whatever reasons—he’s pretty open about his views on the long term future, whereas many other researchers are more reserved.
Also, most of the top ANN researchers do not see a clear near term path to AGI—or else they would be implementing it already. They are focused on extending out from current solutions. Value learning comes later, in terms of natural engineering dependencies.
thinking that when the time comes, we’ll just teach them our values,
Well yes—in the ANN approach that is the most likely solution. And actually it’s the most likely solution regardless, because designing a human complexity utility/value function by hand is just not workable.
One question/concern I have been monitoring for a while now is the response from conservative Christianity. It’s not looking good. Google “Singularity image of the beast” to get an idea.
What kind of problems do you think this will lead to, down the line?
You already live in a world filled with a huge sea of agents which have values different than your own. We create new generations of agents all the time, and eventually infuse them with power and responsibility.
This is true, but:
I’m not comparing ANN-based AGI to the status quo, but to a future with some sort of near-optimal FAI.
The new agents we currently create aren’t much more powerful than ourselves, and cannot take over the universe and foreclose the possibility of a better outcome.
Humans or humanity as a whole seem capable of making moral and philosophical progress, and this capability is likely to persist in future generations. I’m not sure the same will be true of ANN-based AGIs.
That being said, I do believe that the AGI we create will be far more aligned with our values than our children are.
I look forward to your post explaining this, but again my fear is that since to a large extent I don’t know what my own values are (especially when it comes to post-Singularity problems like how to reorganize the universe on a large scale, i.e., whether we should run it according to Eliezer’s Fun Theory, or convert it to hedonium, or what sort of hedonium exactly, or to spend most of the resources available to me on some sort of attempt to break out of any potential simulations we might be in, or run simulations of my own), straightforward approaches at value learning won’t work when it comes to people like me, and there won’t be time to work out how to teach the AGI to solve these and other philosophical problems.
The real fear is perhaps that of being left behind. The only solution to that really is to use AGI to accelerate the development of uploading.
Because we care about preserving our personal identities whereas many AGIs probably won’t, AGIs will be faced with fewer constraints when it comes to improving themselves or designing new generations of AGIs, and along with a time advantage that is likely quite large in subjective time, this probably means that AGIs will always have a large advantage in intelligence until they reach the maximum feasible level in this universe and human uploads slowly catch up. Are you not worried that during this time, the AGIs will take over the universe and reorganize it according to their imperfect understanding of our values, which will look disastrous when we become superintelligences ourselves and figure out what we really want?
One question/concern I have been monitoring for a while now is the response from conservative Christianity. It’s not looking good. Google “Singularity image of the beast” to get an idea.
What kind of problems do you think this will lead to, down the line?
Hopefully none—but the conservative Protestant faction seems to have considerable political power in the US, which could lead to policy blunders. Due to that one stupid book (Revelation), the xian biblical worldview is almost programmed to lash out at any future system which offers actual immortality. The controversy over stem cells and cloning is perhaps just the beginning.
On the other hand, out of all religions, liberal xtianity is perhaps closest to transhumanism, and could be its greatest ally.
As an example, consider this quote:
It is a serious thing to live in a society of possible gods and goddesses, to remember that the dullest and most uninteresting person you talk to may one day be a creature which, if you saw it now, you would be strongly tempted to worship.
This sounds like something a transhumanist might say, but it’s actually from C.S. Lewis:
The command Be ye perfect is not idealistic gas. Nor is it a command to do the impossible. He is going to make us into creatures that can obey that command. He said (in the Bible) that we were “gods” and He is going to make good His words. If we let Him—for we can prevent Him, if we choose—He will make the feeblest and filthiest of us into a god or goddess, dazzling, radiant, immortal creature, pulsating all through with such energy and joy and wisdom and love as we cannot now imagine, a bright stainless mirror which reflects back to God perfectly (though, of course, on a smaller scale) His own boundless power and delight and goodness. The process will be long and in parts very painful; but that is what we are in for. Nothing less. He meant what He said.
Divinization or apotheosis is one of the main belief currents underlying xtianity, emphasized to varying degrees across sub-variations and across time.
..
[We already create lots of new agents with different beliefs …]
This is true, but:
I’m not comparing ANN-based AGI to the status quo, but to a future with some sort of near-optimal FAI.
The practical real world FAI that we can create is going to be a civilization that evolves from what we have now—a complex system of agents and hierarchies of agents. ANN-based AGI is a new component, but there is more to a civilization than just the brain hardware.
The new agents we currently create aren’t much more powerful than ourselves, and cannot take over the universe and foreclose the possibility of a better outcome.
Humanity today is enormously more powerful than our ancestors from say a few thousand years ago. AGI just continues the exponential time-acceleration trend, it doesn’t necessarily change the trend.
From the perspective of humanity of a thousand years ago, friendliness mainly boils down to a single factor: will the future posthuman civ resurrect them into a heaven sim?
Humans or humanity as a whole seem capable of making moral and philosophical progress, and this capability is likely to persist in future generations. I’m not sure the same will be true of ANN-based AGIs.
Why not?
One of the main implications of the brain being a ULM is that friendliness is not just a hardware issue. There is a hardware component in terms of the value learning subsystem, but once you solve that, it is mostly a software issue. It’s a culture/worldview/education issue. The memetic software of humanity is the same software that we will instill into AGI.
That being said, I do believe that the AGI we create will be far more aligned with our values than our children are.
I look forward to your post explaining this, but again my fear is that since to a large extent I don’t know what my own values are (especially when it comes to post-Singularity problems like how to reorganize the universe on a large scale . .
I don’t see how that is a problem. You may not know yourself completely, but have some estimation or distribution over your values. As long as you continue to exist into the future, and as long as you have a significant share in the future decision structure (ie wealth or voting rights), then that should suffice—you will have time to figure out your long term values.
Are you not worried that during this time, the AGIs will take over the universe and reorganize it according to their imperfect understanding of our values, which will look disastrous when we become superintelligences ourselves and figure out what we really want?
This is a potential worry, but it can probably be prevented.
The brain is reasonably efficient in terms of intelligence per unit energy. Brains evolved from the bottom up, and biological cells are near optimal nanocomputers (near optimal in terms of both storage density in DNA and energy cost per irreversible bit op in DNA copying and protein computations). The energetic cost of computation in brains and modern computers alike is dominated by wire energy dissipation in terms of bits/J/mm. Moore’s law is approaching its end, which will result in hardware that is on par with or a little better than the brain. With huge investments into software cleverness, we can close the gap and achieve AGI. In 5 years or so, let’s say that 1 AGI runs amortized on 1 GPU (neuromorphics doesn’t change this picture dramatically). That means an AGI will only require 100 watts of energy and say $1,000/year. That is about a 100x productivity increase, but in a pinch humans can survive on only $10,000 a year.
Today the foundry industry produces about 10 million mid-high end GPUs per year. There are about 100 million human births per year, and around 4 million per year in the US. Of course if we consider only humans with IQ > 135, then there are only 1 million high IQ humans born per year. This puts some constraints on the likely transition time, and it is likely measured in years.
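The back-of-envelope numbers above can be checked with a short sketch. It takes the comment's round figures as given (10 million GPUs and 100 million births per year, one AGI per GPU) and adds two assumptions of its own: IQ is modeled as a normal distribution with mean 100 and standard deviation 15, and the production scale-up factors are purely illustrative.

```python
from statistics import NormalDist

# Round figures taken from the comment above, not independent data.
GPUS_PER_YEAR = 10_000_000
BIRTHS_PER_YEAR = 100_000_000
IQ_THRESHOLD = 135


def high_iq_births(births: int = BIRTHS_PER_YEAR,
                   threshold: float = IQ_THRESHOLD) -> float:
    # Assumption: IQ ~ Normal(mean=100, sd=15), the usual convention.
    fraction_above = 1.0 - NormalDist(mu=100, sigma=15).cdf(threshold)
    return births * fraction_above


def agis_per_birth(gpus: int, births: int) -> float:
    # Assumes one AGI per GPU, as in the scenario above.
    return gpus / births


print(f"births above IQ {IQ_THRESHOLD}: about {high_iq_births():,.0f} per year")
for scale in (1, 10, 100):  # hypothetical production scale-ups
    ratio = agis_per_birth(GPUS_PER_YEAR * scale, BIRTHS_PER_YEAR)
    print(f"GPU output x{scale}: {ratio:g} new AGIs per human birth")
```

Under these assumptions the share of births above IQ 135 comes out near 1%, which is consistent with the roughly 1 million per year quoted above, and current GPU output corresponds to about one new AGI for every ten human births.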
We don’t need to instill values so perfectly that we can rely on our AGI to solve all of our problems until the end of time—we just need AGI to be similar enough to us that it can function as at least a replacement for future human generations and fulfill the game theoretic pact across time of FAI/god/resurrection.
liberal xtianity is perhaps closest to transhumanism, and could be its greatest ally
There’s some truth in the first half of that, but I’m not so sure about the second. Expecting that God will at some point transform us into something beyond present-day humanity is a very different thing from planning to make that transformation ourselves. That whole “playing God” accusation probably gets worse, rather than better, if you’re actually expecting God to do the thing in question on his own terms and his own schedule.
For a far-from-perfect analogy, consider the interaction between creationism and climate change. You might say: Those who fear that human activity might lead to disastrous changes in the climate, including serious harm to humanity, should find their greatest allies in those who believe that in the past God brought about a disastrous change in the earth’s climate and wrought serious harm to humanity. But, no, of course it doesn’t actually work that way; what actually happens is that creationists say “human activity can’t harm the climate much; God promised no more worldwide floods” or “the alleged human influence on climate is on a long timescale, and God will be wrapping everything up soon anyway”.
Expecting that God will at some point transform us into something beyond present-day humanity is a very different thing from planning to make that transformation ourselves.
Not necessarily. There is this whole idea that we are god or some aspect of god—as Jesus famously said “Is it not written in your law, I said, Ye are gods?”. There is also the interesting concept in early xtianity that Christ became a sort of distributed mind—that the church is literally the risen Christ. Teilhard de Chardin gave a modern spin on that old idea. See also the assimilation saying. Paul thought something similar when he said things like “It is no longer I who live, but Christ who lives in me”. So there is this strong tradition that Christ is something that can inhabit people. In that tradition (which really is the most authentic) god builds the kingdom through humans. Equating the ‘kingdom’ with a positive singularity is a no brainer.
Yes the literalist faction will always wait for some external event, and to them Christ is a singular physical being, but that isn’t the high IQ faction of xtianity.
For a far-from-perfect analogy, consider the interaction between creationism and climate change.
Creationists are biblical literalists—any hope for an ally is in the more sophisticated liberal variants.
Different configurations of artificial neurons (e.g., RNNs vs CNNs) are better at learning different things. If you build an AGI and don’t test whether it can learn to do philosophy, it may not be able to learn to do philosophy very well. In the rush to build AGIs in order to reap the economic benefits, people probably won’t have time to test for this.
The memetic software of humanity is the same software that we will instill into AGI.
I’m guessing that AGIs will have a very different distribution of capabilities from humans (e.g., they’ll have much more working memory, and be able to do complex calculations instantaneously and with very low error, but be bad at certain things that we neglect to optimize for when building them) so they’ll probably develop a different set of memetic software that’s more optimal for them.
As long as you continue to exist into the future, and as long as you have a significant share in the future decision structure (ie wealth or voting rights), then that should suffice—you will have time to figure out your long term values.
I guess that could potentially work while AGIs are maxed out at human level or slightly beyond and costing $1000/year, but I’m not very optimistic that any social structure we come up with could preserve our share of the universe as the AGIs improve themselves and become more powerful. For example, if an AGI or a group of AGIs figures out a way to colonize the universe using resources under their sole control, why would they give the rest of us a share?
Today the foundry industry produces about 10 million mid-high end GPUs per year.
Surely there are lots of foundries (Intel’s for example) that could be retooled to build GPUs if it became profitable to do so?
This puts some constraints on the likely transition time, and it is likely measured in years.
The hope is that we use this time to develop the necessary social structures to prevent AGIs from taking over the universe (without giving us a significant share of it)?
If you build an AGI and don’t test whether it can learn to do philosophy, it may not be able to learn to do philosophy very well.
AGI to me is synonymous with a universal learning machine, and in particular with a ULM that learns at human capability. Philosophy is highly unlikely to require any specialized structures—because humans do philosophy with the same general cortical circuitry that’s used for everything else.
In the rush to build AGIs in order to reap the economic benefits, people probably won’t have time to test for this.
This is a potential problem, but the solution comes naturally if you—do the unthinkable for LWers—and think of AGI as persons/citizens. States invest heavily into educating new citizens beyond just economic productivity, as new people have rights and control privileges, so it’s important to ensure a certain level of value alignment with the state/society at large.
In particular—and this is key—we do not allow religions or corporations to raise people with arbitrary values.
I’m not very optimistic that any social structure we come up with could preserve our share of the universe as the AGIs improve themselves and become more powerful.
Yeah—but we only need to manage the transition until human uploading. Uploading has enormous economic value—it is the killer derived app for AGI tech, and brain inspired AGI in particular. It seems far now mainly because AGI still seems far, but given AGI, change will happen quickly: first there will be a large wealth transfer to those who developed AGI and/or predicted it, and consequently uploading will become up-prioritized.
Surely there are lots of foundries (Intel’s for example) that could be retooled to build GPUs if it became profitable to do so?
Yeah—it could be pumped up to 10x current output fairly easily, and perhaps even 100x given a few years.
The hope is that we use this time to develop the necessary social structures to prevent AGIs from taking over the universe (without giving us a significant share of it)?
I expect that individual companies will develop their own training/educational protocols. Government will need some significant prodding to get involved quickly, otherwise they will move very slowly. So the first corps or groups to develop AGI could have a great deal of influence.
One variable of interest—which I am uncertain of—is the timetable involved in forcing a key decision through the court system. For example—say company X creates AGI. Somebody then sues them on behalf of their AGIs for child neglect or rights violation or whatever—how long does it take the court to decide if and what types of software could be considered citizens? The difference between 1 year and say 10 could be quite significant.
At the moment it looks like the most straightforward route to having high leverage over the future is to be involved in the creation of AGI.
AGI to me is synonymous with a universal learning machine, and in particular with a ULM that learns at human capability. Philosophy is highly unlikely to require any specialized structures—because humans do philosophy with the same general cortical circuitry that’s used for everything else.
I also have some hope that philosophy ability essentially comes “for free” with general intelligence, but I’m not sure I want to bet the future of the universe on it. Also, an AGI may be capable of learning to do philosophy, but not be motivated to do it, or not be motivated to follow the implications of its own philosophical reasoning. A lot of humans for example don’t seem to have much interest in philosophy, but instead in things like maximizing wealth and status.
This is a potential problem, but the solution comes naturally if you—do the unthinkable for LWers—and think of AGI as persons/citizens.
Do you have detailed ideas of how that would work? For example if in 2030, we can make a copy of an AGI for $1000 (cost of a GPU) and that cost keeps decreasing, do we give each of them an equal vote? How do we enforce AGI rights and responsibilities if eventually anyone could buy a GPU card, download some open source software and make a new AGI?
Yeah—but we only need to manage the transition until human uploading. Uploading has enormous economic value—it is the killer derived app for AGI tech, and brain inspired AGI in particular.
I argued in a previous comment that it’s unlikely that uploads will be able to match AGIs in intelligence until AGIs reach the maximum feasible level allowed by physics and uploads catch up, but I don’t think you responded to that argument. If I’m correct in this, it doesn’t seem like the development of uploading tech will make any difference. Why do you think it’s a crucial threshold?
how long does it take the court to decide if and what types of software could be considered citizens? The difference between 1 year and say 10 could be quite significant.
Even 10 years seems too optimistic to me. I think a better bet, if we want to take this approach, would be to convince governments to pass laws ahead of time, or prepare them to pass the necessary laws quickly once we get AGIs. But again, what laws would you want these to be, in detail?
Why do you believe this? Do you think that brain inspired ANN based AI is intrinsically more ‘nasty’ or dangerous than human brains? Why?
Other agents are dangerous to me to the extent that (1) they don’t share my values/goals, and (2) they are powerful enough that in pursuing their own goals, they have little need to take game theoretic consideration of my values. ANN based AI will be similar to other humans in (1), and regarding (2) they are likely to be more powerful than humans since they’ll be running on faster, more capable hardware than human brains, and probably have better algorithms as well.
Schmidhuber’s best case scenario for superintelligence is that they take no interest in humanity, colonize space and leave us to survive on Earth. What’s your best case scenario? Does it seem not much worse to you than the best case scenario for FAI (i.e., if humanity could coordinate to solve the cosmic tragedy of the commons problem and wait until we know how to safely build an AGI that shares some compromise, e.g., weighted average, of all human values)?
Your points 1 and 2 are true but only in degrees. Humans vary significantly in terms of altruism (1) and power (2). Hitler—from what I’ve read—is a good example of a powerful, non-altruistic human. Martin Luther King and Ghandi are examples of highly altruistic humans (the first patterned directly after Jesus, the second patterned after Jesus and Bhudda). Now, it could be the case that these two were more selfish than they appear at first, because they were motivated by reward in the afterlife. Well perhaps to a degree, but that line of argument mostly fails as a complete explanation (and even if true, could also potentially become a strategy).
Finally, brain inspired ANNs != human brains. We can take inspiration from the best examples of human capabilities and qualities while avoiding the worst, and then extrapolate to superhuman dimensions.
Altruism can be formalized by group decision/utility functions, where the agent’s utility function implements some approximation of the ideal aggregate of some vector of N individual functions (ala mechanism design, and clarke tax style policies in particular).
We explore AGI mind space and eventually create millions and then billions of super-wise/smart/benevolent AI’s. This leads to a new political system—perhaps based on fast cryptoprotocols and new approximations of ideal group decision policies from mechanism design. Operating systems as we know them are replaced with AIs which eventually become something like mental twins, friends, trusted advisers, and political representatives. The main long term objective of the new AI governance is universal resurrection—implemented perhaps in a 100 years or so by turning the moon into a large computing facility. Well before that, existing humans begin uploading into the metaverse.
The average person alive today becomes a basically immortal sim but possesses only upper human intelligence. Those who invested wisely and get in at the right time become entire civilizations unto themselves (gods) - billions or trillions of times more powerful. The power/wealth gap grows without bound. It’s like Jesus said: “To him who has is given more, and from him who has nothing is taken all.”
However, allocating all of future wealth based on however much wealth someone had on the eve of the singularity is probably sub-optimal. The best case would probably also involve some sort of social welfare allocation policy, where the AIs spend a bunch of time evaluating and judging humans to determine a share of some huge wealth allocation. All the dead people who are recreated as sims will need wealth/resources, so decisions need to be made concerning how much wealth each person gets in the afterlife. There are very strong arguments for the need for wealth/money as an intrinsic component of any practical distributed group decision mechanism.
Perhaps the strongest argument against UFAI likelihood is sim-anthropic: the benevolent posthuman civs/gods (re)create far more historical observers than the UFAIs, as part of universal resurrection. Of course, this still depends on us doing everything in our power to create FAI.
Thanks for the clear explanation of your views. What do you see as the main obstacles to achieving this?
I’m really worried that mere altruism isn’t enough. If the other agent is more powerful, any subtle differences in values or philosophical views between myself and the other agent could be disastrous, as they optimize the universe according to their values/views which may turn out to be highly suboptimal for me. Consider the difference between average and total utilitarianism, or different views on whether we should assume the universe must be computable, what prior/measure to put on the multiverse, or how to deal with anthropics, e.g. simulation argument.
But I don’t want them to blindly accept my current values/views either, since they may be wrong. Humans seem to have some sort of general problem solving / error correcting algorithm which we call “doing philosophy”, and maybe we can teach that to ANN-based AI more easily than we could program it by hand, so in that sense maybe ANN-based AI actually could be less “nasty” than other approaches.
To me, achieving a near optimal outcome is difficult but not impossible, given enough time, but I don’t see how to get the time. The current leaders in ANN-based AI don’t seem to appreciate the magnitude of the threat, or the difficulty of solving the problem. (Besides Schmidhuber, who apparently does see the threat but is ok with it? Now that Bostrom’s book has been out for a year and presumably most people who are ever going to read it have already read it, I’m not sure what’s going to change their minds.) Perhaps ANN-based AI could be considered more “nasty” in this sense because it seems easier to be complacent about it, thinking that when the time comes, we’ll just teach them our values, whereas trying to design a de novo AGI brings up a bunch of issues like exactly what utility function to give it, or what decision theory or prior, that perhaps make it easier to see the larger problem.
(The other main obstacle I see is the strong economic and psychological incentives to achieve AGI ASAP, but that’s the same whether we’re talking about ANN-based AI or other kinds of AI.)
My optimistic scenario above assumes not only that we solve the technical problems but also that the current political infrastructure doesn’t get in the way—and in fact just allows itself to be dissolved.
In reality, of course, I don’t think it will be that simple.
There are technical problems like value learning, and then there are socio-political problems. AGI is likely to cause systemic unemployment and thus a large recession which will force politics to get involved. The ideal scenario may be a shift to increased progressive/corporate tax combined with UBI or something equivalent. In the worst cases we have full scale depression and political instability.
Related to that will be the legal decisions concerning rights for AGI (or lack thereof). AGI rights seem natural, but they will also be difficult to enforce. AGI will be hard to define, and a poor definition can easily lead to strange perverse incentives.
Then there are the folks who don’t believe in machine consciousness, or uploading, and basically will view all this as a terrible disaster. It’s probably good that we’ve raised AI risk awareness amongst academics and elites, but AI may now have mainstream branding issues.
One question/concern I have been monitoring for a while now is the response from conservative Christianity. It’s not looking good. Google “Singularity image of the beast” to get an idea.
In terms of risk, it could be that most of the risk lies in an individual or group using AGI to take over the world, not in a failure of value learning itself. Many corporations are essentially dictatorships or nearly so; there is no reason for a selfish CEO to encode anyone else’s values into the AGI they create. The risk is human rather than technical.
You already live in a world filled with a huge sea of agents which have values different from your own. We create new generations of agents all the time, and eventually infuse them with power and responsibility. We don’t need to achieve ‘perfect’ value alignment (and that is probably not a coherent goal anyway). We need only to align value distributions.
That being said, I do believe that the AGI we create will be far more aligned with our values than our children are.
The real fear is perhaps that of being left behind. The only solution to that really is to use AGI to accelerate the development of uploading.
From what I see, they have a wide spectrum of opinions. Schmidhuber is also unusual in that—for whatever reasons—he’s pretty open about his views on the long term future, whereas many other researchers are more reserved.
Also, most of the top ANN researchers do not see a clear near term path to AGI—or else they would be implementing it already. They are focused on extending out from current solutions. Value learning comes later, in terms of natural engineering dependencies.
Well yes, in the ANN approach that is the most likely solution. And actually it’s the most likely solution regardless, because designing a human-complexity utility/value function by hand is just not workable.
What kind of problems do you think this will lead to, down the line?
This is true, but:
I’m not comparing ANN-based AGI to the status quo, but to a future with some sort of near-optimal FAI.
The new agents we currently create aren’t much more powerful than ourselves, and cannot take over the universe and foreclose the possibility of a better outcome.
Humans or humanity as a whole seem capable of making moral and philosophical progress, and this capability is likely to persist in future generations. I’m not sure the same will be true of ANN-based AGIs.
I look forward to your post explaining this, but again my fear is that since to a large extent I don’t know what my own values are (especially when it comes to post-Singularity problems like how to reorganize the universe on a large scale, i.e., whether we should run it according to Eliezer’s Fun Theory, or convert it to hedonium, or what sort of hedonium exactly, or to spend most of the resources available to me on some sort of attempt to break out of any potential simulations we might be in, or run simulations of my own), straightforward approaches at value learning won’t work when it comes to people like me, and there won’t be time to work out how to teach the AGI to solve these and other philosophical problems.
Because we care about preserving our personal identities whereas many AGIs probably won’t, AGIs will be faced with fewer constraints when it comes to improving themselves or designing new generations of AGIs, and along with a time advantage that is likely quite large in subjective time, this probably means that AGIs will always have a large advantage in intelligence until they reach the maximum feasible level in this universe and human uploads slowly catch up. Are you not worried that during this time, the AGIs will take over the universe and reorganize it according to their imperfect understanding of our values, which will look disastrous when we become superintelligences ourselves and figure out what we really want?
Hopefully none, but the conservative Protestant faction seems to have considerable political power in the US, which could lead to policy blunders. Due to that one stupid book (Revelation), the xian biblical worldview is almost programmed to lash out at any future system which offers actual immortality. The controversy over stem cells and cloning is perhaps just the beginning.
On the other hand, out of all religions, liberal xtianity is perhaps closest to transhumanism, and could be its greatest ally.
As an example, consider this quote:
This sounds like something a transhumanist might say, but it’s actually from C.S. Lewis:
Divinization or apotheosis is one of the main belief currents underlying xtianity, emphasized to varying degrees across sub-variations and across time.
..
The practical real world FAI that we can create is going to be a civilization that evolves from what we have now—a complex system of agents and hierarchies of agents. ANN-based AGI is a new component, but there is more to a civilization than just the brain hardware.
Humanity today is enormously more powerful than our ancestors from say a few thousand years ago. AGI just continues the exponential time-acceleration trend, it doesn’t necessarily change the trend.
From the perspective of humanity of a thousand years ago, friendliness mainly boils down to a single factor: will the future posthuman civ resurrect them into a heaven sim?
Why not?
One of the main implications of the brain being a ULM is that friendliness is not just a hardware issue. There is a hardware component in terms of the value learning subsystem, but once you solve that, it is mostly a software issue. It’s a culture/worldview/education issue. The memetic software of humanity is the same software that we will instill into AGI.
I don’t see how that is a problem. You may not know yourself completely, but have some estimation or distribution over your values. As long as you continue to exist into the future, and as long as you have a significant share in the future decision structure (ie wealth or voting rights), then that should suffice—you will have time to figure out your long term values.
This is a potential worry, but it can probably be prevented.
The brain is reasonably efficient in terms of intelligence per unit energy. Brains evolved from the bottom up, and biological cells are near optimal nanocomputers (near optimal in terms of both storage density in DNA, and near optimal in terms of energy cost per irreversible bit op in DNA copying and protein computations). The energetic cost of computation in brains and modern computers alike is dominated by wire energy dissipation in terms of bits/J/mm. Moore’s law is approaching its end, which will result in hardware that is on par with or a little better than the brain. With huge investments into software cleverness, we can close the gap and achieve AGI. In 5 years or so, let’s say that 1 AGI runs amortized on 1 GPU (neuromorphics doesn’t change this picture dramatically). That means an AGI will only require 100 watts of power and say $1,000/year. That is about a 100x productivity increase, but in a pinch humans can survive on only $10,000 a year.
Today the foundry industry produces about 10 million mid-high end GPUs per year. There are about 100 million human births per year, and around 4 million per year in the US. Of course if we consider only humans with IQ > 135, then there are only 1 million high IQ humans born per year. This puts some constraints on the likely transition time, and it is likely measured in years.
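As a rough sanity check on the figures above, the arithmetic can be sketched as follows. This assumes IQ is normally distributed with mean 100 and standard deviation 15 (a common convention, not something stated here); the helper names are hypothetical, and the GPU and birth counts are simply the numbers quoted in this comment.

```python
from statistics import NormalDist

HOURS_PER_YEAR = 24 * 365


def annual_energy_kwh(watts):
    """Energy used per year by a device drawing `watts` continuously."""
    return watts * HOURS_PER_YEAR / 1000


def births_above_iq(births_per_year, threshold, mean=100.0, sd=15.0):
    """Expected births per year above an IQ threshold.

    Assumes a normal IQ distribution (mean 100, SD 15 by default).
    """
    tail = 1.0 - NormalDist(mean, sd).cdf(threshold)
    return births_per_year * tail


# Figures quoted in the comment above.
gpus_per_year = 10_000_000
births_per_year = 100_000_000

agi_kwh = annual_energy_kwh(100)                 # one 100 W AGI, run all year
high_iq = births_above_iq(births_per_year, 135)  # roughly 1 million

print(f"Energy per AGI: {agi_kwh:.0f} kWh/year")
print(f"High-IQ births: {high_iq:,.0f} per year")
print(f"GPUs per birth: {gpus_per_year / births_per_year:.2f}")
print(f"GPUs per high-IQ birth: {gpus_per_year / high_iq:.1f}")
```

Under these assumptions, current GPU output is about a tenth of annual human births but roughly ten times the annual number of births above IQ 135, which is the comparison driving the "measured in years" estimate.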
We don’t need to instill values so perfectly that we can rely on our AGI to solve all of our problems until the end of time—we just need AGI to be similar enough to us that it can function as at least a replacement for future human generations and fulfill the game theoretic pact across time of FAI/god/resurrection.
There’s some truth in the first half of that, but I’m not so sure about the second. Expecting that God will at some point transform us into something beyond present-day humanity is a very different thing from planning to make that transformation ourselves. That whole “playing God” accusation probably gets worse, rather than better, if you’re actually expecting God to do the thing in question on his own terms and his own schedule.
For a far-from-perfect analogy, consider the interaction between creationism and climate change. You might say: Those who fear that human activity might lead to disastrous changes in the climate, including serious harm to humanity, should find their greatest allies in those who believe that in the past God brought about a disastrous change in the earth’s climate and wrought serious harm to humanity. But, no, of course it doesn’t actually work that way; what actually happens is that creationists say “human activity can’t harm the climate much; God promised no more worldwide floods” or “the alleged human influence on climate is on a long timescale, and God will be wrapping everything up soon anyway”.
Not necessarily. There is this whole idea that we are god or some aspect of god. As Jesus famously said, “Is it not written in your law, I said, Ye are gods?” There is also the interesting concept in early xtianity that christ became a sort of distributed mind: that the church is literally the risen christ. Teilhard de Chardin gave a modern spin on that old idea. See also the assimilation saying. Paul thought something similar when he said things like “It is no longer I who live, but Christ who lives in me”. So there is this strong tradition that Christ is something that can inhabit people. In that tradition (which really is the most authentic) god builds the kingdom through humans. Equating the ‘kingdom’ with a positive singularity is a no brainer.
Yes the literalist faction will always wait for some external event, and to them Christ is a singular physical being, but that isn’t the high IQ faction of xtianity.
Creationists are biblical literalists—any hope for an ally is in the more sophisticated liberal variants.
Different configurations of artificial neurons (e.g., RNNs vs CNNs) are better at learning different things. If you build an AGI and don’t test whether it can learn to do philosophy, it may not be able to learn to do philosophy very well. In the rush to build AGIs in order to reap the economic benefits, people probably won’t have time to test for this.
I’m guessing that AGIs will have a very different distribution of capabilities from humans (e.g., they’ll have much more working memory, and be able to do complex calculations instantaneously and with very low error, but bad at certain things that we neglect to optimize for when building them) so they’ll probably develop a different set of memetic software that’s more optimal for them.
I guess that could potentially work while AGIs are maxed out at human level or slightly beyond and costing $1000/year, but I’m not very optimistic that any social structure we come up with could preserve our share of the universe as the AGIs improve themselves and become more powerful. For example, if an AGI or a group of AGIs figures out a way to colonize the universe using resources under their sole control, why would they give the rest of us a share?
Surely there are lots of foundries (Intel’s for example) that could be retooled to build GPUs if it became profitable to do so?
The hope is that we use this time to develop the necessary social structures to prevent AGIs from taking over the universe (without giving us a significant share of it)?
AGI to me is synonymous with a universal learning machine, and in particular with a ULM that learns at human capability. Philosophy is highly unlikely to require any specialized structures—because humans do philosophy with the same general cortical circuitry that’s used for everything else.
This is a potential problem, but the solution comes naturally if you—do the unthinkable for LWers—and think of AGI as persons/citizens. States invest heavily into educating new citizens beyond just economic productivity, as new people have rights and control privileges, so it’s important to ensure a certain level of value alignment with the state/society at large.
In particular—and this is key—we do not allow religions or corporations to raise people with arbitrary values.
Yeah, but we only need to manage the transition until human uploading. Uploading has enormous economic value: it is the killer derived app for AGI tech, and brain-inspired AGI in particular. It seems far off now mainly because AGI still seems far off, but given AGI, change will happen quickly: first there will be a large wealth transfer to those who developed AGI and/or predicted it, and consequently uploading will be given higher priority.
Yeah—it could be pumped up to 10x current output fairly easily, and perhaps even 100x given a few years.
I expect that individual companies will develop their own training/educational protocols. Government will need some significant prodding to get involved quickly, otherwise they will move very slowly. So the first corps or groups to develop AGI could have a great deal of influence.
One variable of interest, which I am uncertain of, is the timetable involved in forcing a key decision through the court system. For example, say company X creates AGI. Somebody then sues them on behalf of their AGIs for child neglect or rights violation or whatever. How long does it take the court to decide whether, and what types of, software could be considered citizens? The difference between 1 year and say 10 could be quite significant.
At the moment it looks like the most straightforward route to having high leverage over the future is to be involved in the creation of AGI.
I also have some hope that philosophy ability essentially comes “for free” with general intelligence, but I’m not sure I want to bet the future of the universe on it. Also, an AGI may be capable of learning to do philosophy but not be motivated to do it, or not be motivated to follow the implications of its own philosophical reasoning. A lot of humans, for example, don’t seem to have much interest in philosophy, but instead in things like maximizing wealth and status.
Do you have detailed ideas of how that would work? For example if in 2030, we can make a copy of an AGI for $1000 (cost of a GPU) and that cost keeps decreasing, do we give each of them an equal vote? How do we enforce AGI rights and responsibilities if eventually anyone could buy a GPU card, download some open source software and make a new AGI?
I argued in a previous comment that it’s unlikely that uploads will be able to match AGIs in intelligence until AGIs reach the maximum feasible level allowed by physics and uploads catch up, but I don’t think you responded to that argument. If I’m correct in this, it doesn’t seem like the development of uploading tech will make any difference. Why do you think it’s a crucial threshold?
Even 10 years seems too optimistic to me. I think a better bet, if we want to take this approach, would be to convince governments to pass laws ahead of time, or prepare them to pass the necessary laws quickly once we get AGIs. But again, what laws would you want these to be, in detail?