Of course you can make a complicated argument why it doesn’t matter (someone’s end goals might be extremely hostile, but they act in mostly non-hostile ways for instrumental reasons), but there’s not that much difference practically.
I actually think this “complicated argument”, either made or refuted, is the core of this orthogonality business. If you ask the question “Okay, now that we’ve made a really powerful AI somehow, should we check if it’s Friendly before giving it control over the world?” then you can’t answer it just based on what you think the AI would do in a position roughly equal to humans.
Of course, you can just argue that this doesn’t matter because we’re unlikely to face really powerful AIs at all. But that’s also complicated. If the orthogonality thesis is truly wrong, on the other hand, then the answer to the question above is “Of course, let’s give the AI control over the world, it’s not going to hurt humans and in the best case it might help us.”