First, there is an underappreciation of adversarial processes.
I think you’re right that people often expect those adversarial processes not to be adequate to withstand the sort of pressure an AI could put on them. Again, it seems like an open problem: perhaps financial markets will be sophisticated and efficient enough that an AI quickly gives up on making lots of money there, or perhaps AI systems with novel capabilities will be able to make as much money as the entire finance industry does now. It seems strategically quite relevant which of the two will be the case!
I also think you’re understating the degree of class collusion that’s possible. Humans get into conflict with other humans, but humanity as a whole beat the Neanderthals, even though there were likely lots of small adversarial moments where humans could knock their human rivals down a peg by standing up for individual Neanderthals (or whatever). We might end up in a situation where lots of different AI systems are competing for control over the Earth’s resources, but unless those AI systems all happen to care about the atmosphere being breathable, this is likely to end poorly for humans.
It seems likely to me that it will be instrumentally convergent (i.e. lots of AI systems, even with strong disagreements on other topics, will spontaneously agree on this topic) for AI systems to disenfranchise humans and enfranchise themselves, or more generally remove humans as a security concern. If you think, for example, that competing firms in the US can manage to pay their taxes and cooperate on national defense despite being in the middle of an adversarial process, then why not expect similar things for ‘AI nations’?
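To make the instrumental-convergence claim concrete, here’s a minimal toy model in Python (purely a sketch; the payoff rule and every number in it are my own illustrative assumptions, not anything established in this discussion). Each simulated agent gets a random terminal utility over possible final world-states, and a larger resource share lets it steer the world toward whichever state it happens to prefer:

```python
import random

# Toy model of instrumental convergence. Every number here is an
# illustrative assumption. Each agent has a random terminal utility over
# n_states possible final world-states; a larger resource share raises
# the chance the agent steers the world to its most-preferred state.

def expected_utility(utility, resource_share):
    """With probability `resource_share` the agent achieves its
    most-preferred state; otherwise the world ends up in a uniformly
    random state."""
    best = max(utility)
    average = sum(utility) / len(utility)
    return resource_share * best + (1 - resource_share) * average

def fraction_preferring_seizure(trials=10_000, n_states=20, seed=0):
    rng = random.Random(seed)
    prefers_seizing = 0
    for _ in range(trials):
        # A fresh agent with an arbitrary terminal goal.
        utility = [rng.random() for _ in range(n_states)]
        # Option A: leave humans enfranchised, keep a small resource share.
        status_quo = expected_utility(utility, resource_share=0.1)
        # Option B: disenfranchise humans, seize a large resource share.
        seize = expected_utility(utility, resource_share=0.9)
        if seize > status_quo:
            prefers_seizing += 1
    return prefers_seizing / trials

if __name__ == "__main__":
    print(f"{fraction_preferring_seizure():.1%} of random utility "
          "functions prefer the resource-seizing option")
```

Under these assumptions the answer comes out at essentially 100%, and that’s the point: the sampled agents disagree about which final state is best, yet they unanimously prefer the option that gives them more control over outcomes, which is exactly what “instrumentally convergent” means.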
I’d also say that humans are very misaligned with entities much less powerful than themselves (slaves, animals, women, Black people, and more), and that misalignment, not alignment, is the historical norm.
Also, the Covid and monkeypox pandemics showed that we are relatively inadequate at handling pandemics; the biggest reason we stayed safe was the properties of the viruses themselves. On social engineering, I’d say the easiest way to socially engineer humans without them noticing is through Twitter, Facebook, and Google, since those platforms can bias search results and feeds toward what the AI wants the human to think, and let confirmation bias do the rest.
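Here’s a hedged sketch of that mechanism in the same toy-model spirit (the update rule, the parameters, and the 50/50 ground truth are all assumptions of mine): a reader updates their belief on a stream of feed items, the platform over-samples one side, and confirmation bias shrinks any update that cuts against the reader’s current view:

```python
import random

# Toy model of a biased feed plus confirmation bias. The update rule and
# all parameters are illustrative assumptions. The reader tracks the
# log-odds of believing some claim; each feed item nudges that belief.

def final_log_odds(feed_bias=0.7, discount=0.5, n_items=200, seed=0):
    """feed_bias: probability the platform surfaces a pro-claim item,
    even though the underlying evidence is actually 50/50.
    discount: how strongly the reader down-weights items that contradict
    their current belief (the confirmation-bias term)."""
    rng = random.Random(seed)
    log_odds = 0.0   # start undecided
    step = 0.2       # evidential weight of a single feed item
    for _ in range(n_items):
        delta = step if rng.random() < feed_bias else -step
        if delta * log_odds < 0:   # item contradicts current belief
            delta *= discount      # ...so the reader partly dismisses it
        log_odds += delta
    return log_odds

if __name__ == "__main__":
    print("unbiased feed:", round(final_log_odds(feed_bias=0.5), 2))
    print("biased feed:  ", round(final_log_odds(feed_bias=0.7), 2))
```

Even a modest sampling bias compounds here: once the belief tips one way, the confirmation-bias discount makes each disconfirming item count for less, so the platform never has to show the reader anything false, only an unrepresentative sample.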
I don’t think the historical-misalignment argument is a particularly good one in the case of humans, because a lot of the reasons for such domination in that special case have to do with a terminal value for it, not because it actually works to the instrumental benefit of the subjugators. There are plenty of economists who will tell you that America is better off for white males not having slaves and for letting women get jobs. I also personally dislike giving this kind of example to normies, because they then accuse me of anthropomorphizing. Better to look at what an AI system values and what it can do, and just say “hm, the AI values this nonhuman state of affairs more than the human state, and oh look, it can make that happen”.
True enough, and I’d agree here that I might be anthropomorphizing too much.
So the animal and slave examples (factory farms, or plausibly hunting and habitat destruction) are useful cases of instrumental convergence, where getting healthy diets and making money are the instrumental values that result in catastrophe for animals and slaves.
Also, slavery was profitable, at least in my opinion, so much so that it effectively funded the majority of America’s early wealth; the cotton gin, in particular, allowed massive wealth to be extracted from slave labor.
Here’s a link: https://faculty.weber.edu/kmackay/economics%20of%20slavery.asp#:~:text=Slavery%20seemed%20enormously%20profitable.,stimulate%20the%20nation’s%20early%20industrialization.
Another link, albeit more polemical than the last one: https://www.vox.com/identities/2019/8/16/20806069/slavery-economy-capitalism-violence-cotton-edward-baptist