To pick out a couple of specific examples from your list, Wei Dai:
14. Human-controlled AIs causing ethical disasters (e.g., large scale suffering that can’t be “balanced out” later) prior to reaching moral/philosophical maturity
This is a serious long-term concern if we don’t kill ourselves first, but it’s not something I see as a premise for “the priority is for governments around the world to form an international agreement to halt AI progress”. If AI were easy to use for concrete tasks like “build nanotechnology” but hard to use for things like CEV, I’d instead see the priority as “use AI to prevent anyone else from destroying the world with AI”, and I wouldn’t want to trade off probability of that plan working in exchange for (e.g.) more probability of the US and the EU agreeing in advance to centralize and monitor large computing clusters.
After someone has done a pivotal act like that, you might then want to move more slowly insofar as you’re worried about subtle moral errors creeping into precursors to CEV.
30. AI systems end up controlled by a group of humans representing a small range of human values (i.e., an ideological or religious group that imposes values on everyone else)
I currently assign very low probability to humans being able to control the first ASI systems, and redirecting governments’ attention away from “rogue AI” and toward “rogue humans using AI” seems very risky to me, insofar as it causes governments to misunderstand the situation, and to specifically misunderstand it in a way that encourages racing.
If you think rogue actors can use ASI to achieve their ends, then you should probably also think that you could use ASI to achieve your own ends; misuse risk tends to go hand-in-hand with “we’re the Good Guys, let’s try to outrace the Bad Guys so AI ends up in the right hands”. This could maybe be justified if it were true, but when it’s not even true it strikes me as an especially bad argument to make.