Believable near-term AI disaster
johnswentworth’s post about AGI vs Humanity (https://www.lesswrong.com/posts/KTbGuLTnycA6wKBza/) caught my attention in ways that most discussion of AI takeover and its prevention does not. Most of that discussion is very abstract, and leaves unstated some pretty deep assumptions about the path and results of smarter-than-human, imperfectly-aligned AI. This preserves generality, but interferes with engagement (for me, at least). I think the fundamental truth remains that we can’t actually predict the path to unaligned AI, nor the utility function(s) of the AI(s) that take over, so please treat this as “one possible story”, not the totality of bad things we should worry about.
40-odd years ago, Douglas Adams wrote:
There is a theory which states that if ever anyone discovers exactly what the Universe is for and why it is here, it will instantly disappear and be replaced by something even more bizarre and inexplicable. There is another theory which states that this has already happened.
I suspect it will be similar with AI takeover—by the time anyone discovers exactly what the AI wants and how it’s going about achieving it, it will be replaced by something even weirder. More importantly, it will be discovered in retrospect, not prospect. Expect to read (or be involved in publishing) articles that sound like “oh! THAT’s how we lost control!”, and counter-articles explaining that “we” never had control in the first place.
My story is that conspiracy theories about the Illuminati, shadow governments, and other incredibly powerful but non-obvious control structures for human civilization become true with the advancing capabilities of AI. We don’t (and perhaps can’t) fully know the AI’s utility function, but we know that power and influence are instrumentally valuable for most goals.
This feels like AI-assisted cultural shift and AI-influenced resource usage, much more than it feels like “takeover”. Unless the AI somehow inherits human egotism, it won’t particularly care whether humans declare fealty or act subservient—the AI isn’t seeking office, only seeking to direct resources. The most important resource on earth today is human action. We are incredibly versatile and powerful machines, and the absolute cheapest way to manipulate physical resources. And we’re easily motivated by things the AI has deep influence over, especially our perceptions of other humans.
“Late-stage capitalism” is a term used to remind us that cultural and economic norms are mutable and ALREADY don’t align with many of our deeper values. The vast majority of our day-to-day behaviors are constrained by group dynamics that we don’t really understand and don’t have the ability to change. Corporations and governments have become large and abstracted enough that it’s easy to think of them as agents independent of their executives and employees—the humans are subservient to the group, not the group being a strict sum of the humans. It’s NO stretch to see a path from “corporations use AI to get better at their fairly nihilistic goals of power and long-term cash flows” to “AI discovers some additional goals that make the corporation even better at this” to “corporation delegates more and more control to AI” to “there’s simply no relevant difference between corporate dominance and AI takeover”.
I don’t yet have a good story of AI pluralism—how different AIs will negotiate and cooperate, and how quickly and completely they will merge their utility functions versus killing each other. But even here, corporate behavior may be a clue.
This is the basic path of “humans want to take over other humans” → “humans compromise on goals to increase power (plus are naturally un-aligned with each other)” → “AI-assisted human takeover” → “human-assisted AI control” → “AI control (of humans)”.
It doesn’t require political or governmental takeover in any visible way—the AI doesn’t care about prestige or what humans think their form of government is; it only cares what physically happens in the universe. That requires manipulating humans, who are great at manipulating matter. It does mean that governments become somewhat less relevant—the corporations/AIs have all the actual control over resource allocation (including human resources).
There is another theory which states that this has already happened.
The part I disagree with is the “this has already happened” part. I think it’s pretty clear that corporations aren’t completely in control of the world yet; also, corporations aren’t AIs, and are more inherently safe / aligned with human values than I expect AIs to be by default. (Even though, yes, corporations are dangerous and unaligned with human values.)
They may be more in control than you think. For many people in the US or EU, having Amazon, Google, and Apple ban their accounts (and prevent future accounts), or having a Twitter campaign discredit them, would do more harm than a short stint in jail. And the majority of concentrated compute power is in Microsoft, Google, and Amazon data centers.
Still, I agree that it’s not a done deal. It COULD be the way AI takes over, but I don’t think it’s happened yet—today’s corporations haven’t exhibited the competence and ability to optimize their control to the degree they could with true AGI.
The degree of misalignment is also definitely arguable. One of my main points in posting this is that “inescapable dystopia” doesn’t require an AI that is so obviously misaligned as to cackle evilly while developing grey goo for paperclips. It can be very bad with only a mildly-divergent-but-powerful optimizer.
Douglas Adams was a brilliant observer of human things. He did make predictions about the form of future AI through the Sirius Cybernetics Corporation, whose products included robots with Genuine People Personalities. He also included the hindsight warning that nobody really wanted their products, and that they were the first up against the wall when the revolution came. Maybe the way to stave off ‘AI’ dystopia is to stop using AI products now?