I do think there’s a noticeable extent to which I was trying to list difficulties more central than those
Probably people disagree about which things are more central, or as evhub put it:
Every time anybody writes up any overview of AI safety, they have to make tradeoffs [...] depending on what the author personally believes is most important/relevant to say
Now FWIW I thought evhub was overly dismissive of (4), in which you made an important meta-point:
EY: 4. We can’t just “decide not to build AGI” because GPUs are everywhere, and knowledge of algorithms is constantly being improved and published; 2 years after the leading actor has the capability to destroy the world, 5 other actors will have the capability to destroy the world. The given lethal challenge is to solve within a time limit, driven by the dynamic in which, over time, increasingly weak actors with a smaller and smaller fraction of total computing power, become able to build AGI and destroy the world. Powerful actors all refraining in unison from doing the suicidal thing just delays this time limit—it does not lift it [...]
evhub: This is just answering a particular bad plan.
But I would add a criticism of my own: this “List of Lethalities” somehow just takes it for granted that AGI will try to kill us all, without ever specifically arguing that case. Instead you just argue vaguely in that direction, in passing, while making broader/different points:
an AGI strongly optimizing on that signal will kill you, because the sensory reward signal was not a ground truth about alignment (???)
All of these kill you if optimized-over by a sufficiently powerful intelligence, because they imply strategies like ‘kill everyone in the world using nanotech to strike before they know they’re in a battle, and have control of your reward button forever after’. (I guess that makes sense)
If you perfectly learn and perfectly maximize the referent of rewards assigned by human operators, that kills them. (???)
Perhaps you didn’t bother because your audience is meant to be people who already believe this? I would at least expect to see it in the intro: “−5. unaligned superintelligences tend to try to kill everyone, here’s why <link>. −4. all the most obvious proposed solutions to (−5) don’t work, here’s why <link>”.