Thanks for the answer! To be honest, when I wrote this I mostly had in mind the kind of winner-takes-all intelligence explosion scenarios that are essentially the flip side of Eliezer’s chosen flavour of catastrophe: fast take-off by an FAI, a pivotal act or what have you, and essentially world conquest. I think if the choice boiled down to those two things (and I’m not sure how we can know unless we manage to have a solid theory of intelligence scaling before we have AGI), or rather, if states believed it did, then it would really be a lose-lose scenario. Everyone would want not just FAI but their FAI, and probably couldn’t tell a UFAI from an FAI anyway (essentially, you’d have to trust that your enemy hasn’t screwed up), so whatever the outcome, nukes fly at the first sign of anomalous activity.
If we tone it down to a slower take-off, I agree with you that the situation may not be so dramatic. Yes, anything that gives one country a huge asymmetric advantage in productivity would also translate into strategic dominance, but in practice even enemies don’t usually launch deadly attacks in response to that alone, whether out of rational self-preservation winning over dedication to the country-as-superorganism, or out of irrational behaviour dominated by the forlorn hope that if you wait it out there will be a better time to act later.
I think, however, that there is a third fundamental qualitative difference between AGI and any other technology, including specialised AI tools such as what we have now. Specialised tools only amplify the power of humans, leaving all value decisions in their hands. A single human still has limited physical and cognitive power, so we need to congregate in groups to gain enough force for large-scale action. Since different people have different values and interests, groups need negotiation and a base of minimum shared goals to coalesce around. This produces some disunity that requires management, puts a damper on the most extreme behaviours (as they may splinter the moderates from the radicals), and overall shapes the dynamics of every group and organisation. Armed with spears or with guns, an army still runs on its morale, still needs trust in its leaders and its goals, and still can mutiny.
By comparison, an army (or a company) of aligned AGIs can’t mutiny. It is almost an extension of the body and mind of its master (to whom we assume it is aligned). The dynamics of this are radically different, akin to the power differential that would be introduced if people with superpowers or magic suddenly appeared: individuals commanding immense power, several orders of magnitude above that of a single human, with perfect fidelity and no lossiness.
I agree that technology has done good by improving standards of living, but I have to wonder: how much of this was because we truly held human flourishing as a terminal value, and how much simply because it was instrumental to other values (e.g. we needed educated workers and consumers for mass industrialised society to work at all)? After all, “setting up incentives so that human flourishing becomes an instrumental value in the pursuit of selfish terminal values” is kind of the whole sleight of hand of capitalism and the free market: “it is not from the benevolence of the butcher, the brewer, or the baker that we expect our dinner, but from their regard to their own self-interest”. With AGI (and possibly robotics, which we might expect to follow if you put a million instances of AGI engineers on the case), human flourishing and the economic self-interest of AGI owners might become entirely and fundamentally decoupled. If that happened, the classic assumption that technology equals flourishing could stop being true. That would make things very painful, and possibly deadly, even if AGI itself happened not to be the threat.
(This argument only applies to modest slowdowns that remain small relative to consumers’ impatience. If you are considering 6 months of slowdown, the social cost can easily be offset by small governance improvements; if you are considering 10 years, I think you can only justify the cost by appealing to potential catastrophic or irreversible harms.)
I’m not really thinking of a slowdown so much as a switch in focus (though of course right now that would also mean a slowdown, since we’d have to rewind the clock a bit): more focus on building systems bottom-up instead of top-down, and on building systems that are powerful but specialised rather than general and agentic. Things like protein-folding prediction AIs would be the perfect examples of this sort of tool: something that enables human workers to do their work better and faster, surpassing the limits of their cognition, without making them entirely redundant. That way we both avoid breaking the known rules of technological innovation and guarantee that values remain human-centred, as they always have been. It might be less performant than fully automated AGI loops, but it vastly gains in safety. Though of course, seeing how even GPT-4 appears mildly agentic just by being rigged into a self-loop with LangChain, I have to wonder whether you can ever keep those two areas separate enough.
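(For concreteness, here is a rough sketch of the kind of self-loop I mean, written against the 2023-era LangChain agent API; the specific model name, tool choice and task are just placeholders I picked for illustration. Wiring a chat model into a ReAct-style loop like this is about all it takes to get the “mildly agentic” behaviour I’m referring to.)

```python
# Rough sketch, assuming the 2023-era LangChain agent API and an OpenAI key in
# the environment. The model repeatedly decides on an action, runs a tool,
# observes the result, and continues until it declares a final answer.
from langchain.chat_models import ChatOpenAI
from langchain.agents import AgentType, initialize_agent, load_tools

llm = ChatOpenAI(model_name="gpt-4", temperature=0)
tools = load_tools(["llm-math"], llm=llm)  # any tool gives the loop something to act with

agent = initialize_agent(
    tools,
    llm,
    agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
    verbose=True,  # prints each think -> act -> observe step
)

# Each call is an open-ended loop, not a single completion (task is a placeholder).
agent.run("Break the following into steps and carry them out: what is 17% of 2430?")
```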