Anthropic should take a humanist/cosmopolitan stance on risks from AGI, one in which risks related to different people having different values are very clearly deprioritized compared to risks of complete human disempowerment or extinction, since worry about the former seems likely to cause much of the latter.
Can you say more about the section I’ve bolded or link me to a canonical text on this tradeoff?
OpenAI, Anthropic, and xAI were all founded in substantial part because their founders were worried that other people would get to AGI first and then use it to impose their values on the world.
In general, if you view developing AGI as a path to godlike power (as opposed to a doomsday device that will destroy most value independently of who gets there first), it makes a lot of sense to rush towards it. As such, the concern that people will “do bad things with the AI that they will endorse, but I won’t” is the cause of a substantial fraction of worlds where we recklessly race past the precipice.
Thanks for the clarification — this is in fact very different from what I thought you were saying, which was something more like “FATE-esque concerns fundamentally increase x-risk in ways that aren’t just about (1) resource tradeoffs or (2) side-effects of poorly considered implementation details.”
I mean, it’s related. FATE stuff tends to center around misuse. I think it makes sense for organizations like Anthropic to commit to heavily prioritizing accident risk over misuse risk: most forms of misuse-risk mitigation involve getting drawn into more zero-sum-ish conflicts, and it makes sense for there to be safety-focused institutions committed to prioritizing the things that all stakeholders can genuinely agree are definitely bad, like human extinction or permanent disempowerment.