“Human-level AI” is a confusing term for at least a couple of reasons: first, there is a gigantic performance range among humans even if you consider only the top 1% of humanity; second, it’s not clear that human-level general learning systems won’t be intrinsically superhuman, given things like a scalable substrate and extraordinarily high-bandwidth access to lossless information (compared to eyes, ears, and mouths). That these apparent issues are not more frequently raised in discussions of early AGI is itself confusing.
As far as I’m aware, all serious attempts to take over the world have been made by brute force. Historically, latencies in messaging, travel, logistics, and so on make this very difficult within a single lifetime, even when potentially world-owning force is available or could be mustered. So the window for a single (human-level) entity to take over the world within its lifetime has probably only opened recently, and the number of external circumstances and internal abilities that would need to line up for a predictably large shot at success is probably high. Accordingly, even a situation like Hitler in control of a very powerful Reich, which might nominally appear to offer a chance at world ownership, was still too fraught with an unoptimized distribution of enabling factors to have any realistic chance of success. There is also a grey area as to whether an individual or some collective is responsible for the attempt. One might argue that trends ongoing for at least a few decades suggest the USA is in a great position to take over the world if China (or someone else) doesn’t “break out” first. But with the way the USA is structured, it may be difficult for any individual “human-level” entity to take credit for, or enjoy a firm grasp on, the fruits of that conquest.
A conventional approach might lead one to note that inside the LW / AI safety bubble it borders on taboo to discount the existential threat posed by unaligned AI, while the outside world is almost an inversion of this, even if limited to 25/75 of what LW users might consider “really impressive people.”
This is one gateway to one collection of problems associated with spreading awareness of AI alignment, but let’s go in a different direction: somewhere more personal.
Fundamentally, it seems a mistake to frame alignment as an AI issue. While unaligned AGI appears to be rapidly approaching, and we have good reasons to believe this will probably result in the extinction of our species, there is another, more important alignment problem that underlies, and somewhat parallels, the AI alignment problem. Of course, this larger issue is the alignment problem faced by humanity at large.
Humans are famously unaligned on many levels: with respect to the self, interpersonally, and micro- / macro-socially. No good solution to any tier of this problem has been discovered over thousands of years of inquiry. In the 20th century, humans developed technology useful for acquiring a great deal of information about the universe beyond our world, and “coincidentally” our capacity for concentrated destruction increased by orders of magnitude, to the scale where killing at least a large portion of the species in a short time is plausible. Thus, the question of why we don’t see others like us, even though there appears to be ample space, tended to find answers along the lines of intelligent life destroying itself. Of course, that outcome is the result of an alignment “problem.”
Dull humans forecast that nuclear arms would end the world, and slightly smarter humans suggested that we might wait for antimatter, nanotech, genetically engineered pathogens, or some other high-impact dangerous technology. As we’re seeing now, those technologies are difficult to realize. What appears to be less difficult is AGI.
So, even though it’s not in the interest of the continuity of the species, humanity can’t help but race redundantly, at breakneck pace, toward this new technological capability, which embodies a slightly disguised, concentrated, and lethal version of one of the oldest and most fundamental problems our species has ever faced. That AI alignment is not taken more seriously could be seen as a reflection of “really impressive people” not actually having paid much mind to the alignment problems embedded in, and endemic to, who we are.
Should one introduce really impressive people to AI alignment? Maybe, but one must remember that magic appears to be unavailable, and that for various reasons it is predictable that most people, even “really impressive” people, will not consider the problem to be more than an abstract curiosity, even given the best presentation. So evangelizing about AI alignment seems most useful as a fulfillment of one’s personal and social interests rather than as much of a tool for increasing the work done to save the species.
Full disclosure: it’s not clear that alignment is a meaningful concept; it’s not clear that humans have meaningful or consistent values; it’s very much not clear, from an S-risk perspective, that continuing the human species is a good thing (at any point in our history, past, present, or future); and it’s not clear that humans have any business rationally evaluating the utility of survival and reproduction, since these are goals we’re apparently optimized for. Accordingly, this post is written with correspondingly less motivation to evangelize.