This is precisely what we need to engineer! Unless your claim is that there is no Nash equilibrium in which humanity survives, which seems like a fairly hopeless standpoint to assume. If you are correct, we all die. If you are wrong, we abandon our only hope of survival.
What I am saying is that if you roll a bunch of random superintelligences, superintelligences that don’t care in the slightest about humanity in their utility function, then I don’t think selection of a Nash equilibrium is enough to get a nice future. It certainly isn’t enough if humans are doing the selection and we don’t know what the AIs want or what technologies they will have. Will one superintelligence be sufficiently transparent to another superintelligence that they will be able to provide logical proofs of their future behaviour to each other? Where does the arms race of stealth and detection end up? What about
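The "logical proofs of future behaviour" idea has a toy formalisation in the game theory literature: a program equilibrium, where each agent submits a program that can read the other program's source code before choosing a move. A minimal sketch, assuming the crudest possible verification (checking that the opponent's source is byte-identical to your own); all names here are my own illustration, not anything from the discussion:

```python
# Toy "program equilibrium" sketch: agents see each other's source code,
# so cooperation can be made conditional on a (crude) proof of behaviour.

def clique_bot(my_source: str, opponent_source: str) -> str:
    """Cooperate iff the opponent's source is identical to mine.
    Source-code identity is the simplest possible 'proof' of behaviour."""
    return "C" if opponent_source == my_source else "D"

def defect_bot(my_source: str, opponent_source: str) -> str:
    """Unconditionally defect."""
    return "D"

SOURCE = "clique_bot"  # stand-in for the function's actual source text

def play(prog_a, src_a, prog_b, src_b):
    """Run one round: each program sees its own and the other's source."""
    return prog_a(src_a, src_b), prog_b(src_b, src_a)

# Two copies of the same program cooperate with each other...
print(play(clique_bot, SOURCE, clique_bot, SOURCE))        # ('C', 'C')

# ...but defect against anything else, so neither gains by deviating.
print(play(clique_bot, SOURCE, defect_bot, "defect_bot"))  # ('D', 'D')
```

Real mutual transparency between superintelligences would of course need something far stronger than source-identity checks, which is exactly where the stealth-versus-detection arms race comes in.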
If at least some of the AIs have been deliberately designed to care about us, then we might get a nice future.
From the article you link to:
After the initial euphoria of the 1970s, a collapse in world metal prices, combined with relatively easy access to minerals in the developing world, dampened interest in seabed mining.
On the other hand, people do drill for oil in the ocean. It sounds to me like deep seabed mining is unprofitable or not that profitable, given current tech and metal prices.
I suspect such a Nash equilibrium involves multiple AIs competing with strong norms against violence and a focus on positive-sum trades.
If you have a tribe of humans, and the tribe has norms, then everyone is expected to be able to understand those norms. The norms have to be fairly straightforward for humans. “Don’t do X except for [100 subtle special cases]” gets simplified to “don’t do X”. This happens even when everyone would be better off with the special cases. When you have big corporations with legal teams, the agreements can be more complicated. When you have superintelligences, the agreements can be far more complicated. Humans and human organisations are reluctant to agree to a complicated deal that only benefits them slightly, because of the overhead cost of reading and thinking about the deal.
What’s more, the Nash equilibria that humanity has been in have changed with technology and society. If a Nash equilibrium is all that protects humanity, and an AI comes up with a way to kill off all humans and distribute their resources equally, without any other AI being able to figure out who killed the humans, then that AI will kill all humans. Nash equilibria are fragile to the details of situation and technology. If one AI can build a spacecraft and escape to a distant galaxy, one that will be over the cosmic event horizon before the other AIs can do anything, that changes the equilibrium. In a Dyson swarm, one AI deliberately letting debris fly about might be able to Kessler-syndrome the whole swarm, a form of mutually assured destruction, but debris-deflection tech might improve and change the Nash equilibrium again.
My point is, I’m not sure that aligned AI (in the narrow technical sense of coherently extrapolated values) is even a well-defined term. Nor do I think it is an outcome of the singularity we can easily engineer, since it requires us both to engineer such an AI and to make sure that it is the dominant AI in the post-singularity world.
We need an AI that in some sense wants the world to be a nice place to live. If we were able to give a fully formal, exact definition of this, we would be much further along in AI alignment. Saying that you want an image that is “beautiful and contains trees” is not a formal specification of the RGB values of each pixel. However, there are images that are beautiful and contain trees. Likewise, saying you want an “aligned AI” is not a formal description of every byte of source code, but there are still patterns of source code that are aligned AIs.
Scenario 1. Suppose someone figured out alignment and shared the result widely. Making your AI aligned is straightforward. Almost all the serious AI experts agree that AI risks are real and alignment is a good idea. All the serious AI research teams are racing to build an aligned AI.
Scenario 2. Aligned AI is a bit harder than unaligned AI. However, all the world’s competent AI experts realise that aligned AI would benefit everyone, and that it is harder to align an AI when you are in a race. They come together into a single worldwide project to build aligned AI. They take their time to do things right. Any competing group is tiny and hopeless, partly because the main project makes an effort to reach out to and work with anyone competent in the field.