The assumptions Bostrom uses to justify the orthogonality thesis include:
If desire is required for beliefs to motivate actions, and if intelligence may produce belief but not desire.
“… if the agent happens to have certain standing desires of some sufficient, overriding strength.”
“… if it is possible to build a cognitive system (or more neutrally, an “optimization process”) with arbitrarily high intelligence but with constitution so alien as to contain no clear functional analogues to what in humans we call “beliefs” and “desires”.”
“… if an agent could have impeccable instrumental rationality even whilst lacking some other faculty constitutive of rationality proper, or some faculty required for the full comprehension of the objective moral facts.”
First, let’s point out that the first three justifications use the word “desire,” rather than “goal.” So let’s rewrite the orthogonality thesis with this substitution:
Intelligence and final desires are orthogonal axes along which possible agents can freely vary. In other words, more or less any level of intelligence could in principle be combined with more or less any final desire.
Let’s accept the Humean theory of motivation, and agree that there is a fundamental difference between belief and desire. Nevertheless, if Bostrom is implicitly defining intelligence as “the thing that produces beliefs, but not desires,” then he is begging the question in the orthogonality thesis.
Now, let’s consider the idea of “standing desires of some sufficient, overriding strength.” Though I could very easily be missing a place where Bostrom makes this connection, I haven’t found where he goes from proposing the existence of such standing desires to showing why they would be compatible with any level of intelligence. By analogy, imagine a human with an extremely powerful desire to consume some drug. We cannot take it for granted that a biomedical intervention that greatly increased their intelligence would leave their desire to consume the drug unaltered.
Bostrom’s AI with an alien constitution, possessing intelligence but not beliefs and desires, again begs the question. It implicitly defines “intelligence” in such a way that it is fundamentally different from a belief or a desire. Later, however, he refers to “intelligence” as “skill at prediction, planning, and means-ends reasoning in general,” and it is hard to imagine how we could have means-ends reasoning without some sort of desire. This seems to me an equivocation on what “intelligence” means.
His last point, that an agent could be superintelligent without having impeccable instrumental rationality in every domain, is also incompatible with the orthogonality thesis as he describes it here. He says that more or less any level of intelligence could be combined with more or less any final desire. When he makes this point, he is saying that more or less any final desire is compatible with superintelligence, as long as we exclude the parts of intelligence that are incompatible with the desire. While we can accept that an AI could be superintelligent while failing to exhibit perfect rationality in every domain, the orthogonality thesis as stated encompasses a superintelligence that is perfectly rational in every domain.
Rejecting this formulation of the orthogonality thesis is not simultaneously a rejection of the claim that superintelligent AI is a threat. It is instead a rejection of the claim that Bostrom has successfully argued that there is a fundamental distinction between intelligence and goals, or between intelligence and desires.
My original argument here was meant to go a little further, and illustrate why I think that there is an intrinsic connection between intelligence and desire, at least at a roughly human level of intelligence.