So my point is that even accepting the orthogonality thesis and instrumental convergence as defined in the post above isn’t enough to lead to the conclusion that AI existential risk is very probable, without more assumptions.
Strong agree. The OT itself is not an argument for AI danger: it needs to be combined with other claims.
The random potshot version of the OT argument is one way of turning possibilities into probabilities.
Many of the minds in mindspace are indeed weird and unfriendly to humans, but that does not make it likely that the AIs we will construct will be. You can argue for the likelihood of eldritch AI on the assumption that any attempt to build an AI is a random potshot into mindspace, in which case the chance of building an eldritch AI is high, because there are a lot of them, and a random potshot hits any individual mind with the same likelihood as any other. But the random potshot assumption is obviously false. We don’t want to take a random potshot, and couldn’t if we wanted to, because we are constrained by our limitations and biases.
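To make the difference concrete, here is a minimal Monte Carlo sketch of the potshot argument. Every number in it is invented purely for illustration (mindspace size, size of the friendly region, the builder’s error range); the point is only that the “eldritch AI is likely” conclusion follows from the uniform-sampling assumption, not merely from friendly minds being rare.

```python
import random

# Toy sketch of the "random potshot" argument. All numbers are invented
# purely for illustration; nothing here is an empirical estimate.
MINDSPACE_SIZE = 1_000_000            # possible minds
FRIENDLY = set(range(1_000))          # assume a tiny human-friendly region

def random_potshot():
    # Uniform sampling: every mind in mindspace is equally likely.
    return random.randrange(MINDSPACE_SIZE)

def constrained_builder():
    # Biased sampling: builders aim at the friendly region but miss by
    # some error, standing in for our limitations and biases.
    aim = random.randrange(len(FRIENDLY))
    return aim + random.randint(-5_000, 5_000)

def hit_rate(sampler, trials=100_000):
    return sum(sampler() in FRIENDLY for _ in range(trials)) / trials

print(f"uniform potshot:     {hit_rate(random_potshot):.4f}")      # ~0.001
print(f"constrained builder: {hit_rate(constrained_builder):.4f}") # ~0.1
```

Whether the second sampler is actually biased toward the friendly region, rather than merely away from uniform, is of course exactly what the disagreement is about.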
To reply in Stuart Russell’s words: “One of the most common patterns involves omitting something from the objective that you do actually care about. In such cases … the AI system will often find an optimal solution that sets the thing you do care about, but forgot to mention, to an extreme value.”
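A minimal sketch of that failure pattern, with a hypothetical objective and variables chosen only for illustration: the optimizer is handed a stated objective that omits a variable we care about, and the “optimal” plan pushes that variable to its extreme.

```python
# Hypothetical example of Russell's omitted-variable pattern.
def stated_objective(output, air_quality):
    return output                       # air_quality was forgotten in the spec

def true_preferences(output, air_quality):
    return output + 10 * air_quality    # what we actually care about

# Brute-force "optimal" plan over a small action grid; producing more
# output costs air quality (a made-up coupling).
levels = [i / 10 for i in range(11)]            # 0.0 .. 1.0
plans = [(out, 1.0 - out) for out in levels]    # more output -> worse air

best = max(plans, key=lambda p: stated_objective(*p))
print("optimizer picks:", best)                 # (1.0, 0.0): air quality at its extreme
print("true value of that plan:", true_preferences(*best))
print("true optimum would be:", max(plans, key=lambda p: true_preferences(*p)))
```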
There are vastly more possible worlds that we humans can’t survive in than ones we can, let alone live comfortably in. Agreed, we don’t want to take a random potshot, but making an agent that transforms our world into one of those rare worlds we want to live in is hard, because we don’t know how to describe that world precisely.
Eliezer Yudkowsky’s rocket analogy also illustrates this very vividly: If you want to land on Mars, it’s not enough to point a rocket in the direction where you can currently see the planet and launch it. You need to figure out all kinds of complicated things about gravity, propulsion, planetary motions, solar winds, etc. But our knowledge of these things is about as detailed as that of the ancient Romans, to stay in the analogy.
It’s difficult to create an aligned Sovereign, but easy not to create a Sovereign at all.
I agree with that, and I also agree with Yann LeCun’s stated intention of “not being stupid enough to create something that we couldn’t control”. I even think not creating an uncontrollable AI is our only hope. I’m just not sure whether I trust humanity (including Meta) to be “not stupid”.