The strong orthogonality hypothesis is definitely wrong: not being openly hostile to most other agents carries an enormous instrumental advantage. That’s what holds modern human societies together. Agents like humans, corporations, and states have mostly managed to keep their hostility low, and those that are particularly belligerent (and the historical median has been far more belligerent towards strangers than all but the most extreme cases today) don’t do well by instrumental standards at all.
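(A toy sketch of that instrumental point, not anything from the discussion itself: a small round-robin iterated prisoner’s dilemma in Python, with the standard textbook payoffs and illustrative strategy names I’ve made up here. The always-hostile strategy exploits each cooperator once, then gets stuck in mutual defection and ends up with the lowest total score.)

# Toy illustration only: persistently hostile play loses out instrumentally
# in repeated interaction. Payoff values are the usual prisoner's dilemma ones.

PAYOFF = {  # (my_move, their_move) -> my score; C = cooperate, D = defect
    ("C", "C"): 3, ("C", "D"): 0,
    ("D", "C"): 5, ("D", "D"): 1,
}

def always_defect(my_history, their_history):
    return "D"  # openly hostile every round

def tit_for_tat(my_history, their_history):
    # Non-hostile by default; only retaliates after being defected against.
    return their_history[-1] if their_history else "C"

def play_match(strat_a, strat_b, rounds=200):
    hist_a, hist_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        move_a = strat_a(hist_a, hist_b)
        move_b = strat_b(hist_b, hist_a)
        score_a += PAYOFF[(move_a, move_b)]
        score_b += PAYOFF[(move_b, move_a)]
        hist_a.append(move_a)
        hist_b.append(move_b)
    return score_a, score_b

def tournament(strategies, rounds=200):
    # Every strategy plays every other strategy once; totals accumulate.
    totals = {name: 0 for name in strategies}
    names = list(strategies)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            sa, sb = play_match(strategies[a], strategies[b], rounds)
            totals[a] += sa
            totals[b] += sb
    return totals

if __name__ == "__main__":
    population = {
        "always_defect": always_defect,
        "tit_for_tat_1": tit_for_tat,
        "tit_for_tat_2": tit_for_tat,
        "tit_for_tat_3": tit_for_tat,
    }
    for name, total in sorted(tournament(population).items(), key=lambda x: -x[1]):
        print(f"{name}: {total}")

With these numbers the defector finishes with 612 points against roughly 1400 for each of the cooperators, which is the instrumental disadvantage being claimed above.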
Of course you can make a complicated argument for why it doesn’t matter (someone’s end goals might be extremely hostile while they act in mostly non-hostile ways for instrumental reasons), but in practice there’s not much difference.
You’d pretty much need to postulate an infinitely powerful AI (like Eliezer’s AI foom idea, which is totally wrong of course) before you could disregard this argument, which is drawn from every observation we can make of every intelligent agent in the real world.
Of course you can make a complicated argument for why it doesn’t matter (someone’s end goals might be extremely hostile while they act in mostly non-hostile ways for instrumental reasons), but in practice there’s not much difference.
I actually think this “complicated argument”, either made or refuted, is the core of this orthogonality business. If you ask the question “Okay, now that we’ve made a really powerful AI somehow, should we check if it’s Friendly before giving it control over the world?” then you can’t answer it just based on what you think the AI would do in a position roughly equal to humans.
Of course, you can just argue that this doesn’t matter because we’re unlikely to face really powerful AIs at all. But that argument is also complicated. If, on the other hand, the orthogonality thesis is truly wrong, then the answer to the question above is “Of course, give the AI control over the world; it’s not going to hurt humans, and in the best case it might help us.”