As I argue in the video, I actually think the definitions of “intelligence” and “goal” that you need to make the Orthogonality Thesis trivially true are bad, unhelpful definitions. So I think both that it’s false and that, even if it were true, it’d be trivial.
I’ll also note that Nick Bostrom himself seems to be making the motte-and-bailey argument here, which seems pretty damning considering his book was very influential and changed a lot of people’s career paths, including my own.
Edit replying to an edit you made: I mean, the most straightforward reading of Chapters 7 and 8 of Superintelligence is just a possibility-therefore-probability fallacy in my opinion. Without this fallacy, there would be little need to even bring up the orthogonality thesis at all, because it’s such a weak claim.
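To make the gap explicit (this is my own formalization, reading “possible” as “nonzero probability”; it is not notation Bostrom uses):

\[
\underbrace{\forall g:\ \Pr[\text{superintelligence with final goal } g] > 0}_{\text{orthogonality (possibility)}}
\;\not\Longrightarrow\;
\underbrace{\Pr[\text{first superintelligence has some random or reductionistic final goal}]\ \text{is high}}_{\text{conclusion (probability)}}
\]

Getting from the left-hand claim to the right-hand one needs a further premise about which goals are likely to get built in practice, and that premise is where the actual argument has to happen.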
The most relevant quote from Superintelligence (that I could find) is:
Second, the orthogonality thesis suggests that we cannot blithely assume that a superintelligence will necessarily share any of the final values stereotypically associated with wisdom and intellectual development in humans—scientific curiosity, benevolent concern for others, spiritual enlightenment and contemplation, renunciation of material acquisitiveness, a taste for refined culture or for the simple pleasures in life, humility and selflessness, and so forth. We will consider later whether it might be possible through deliberate effort to construct a superintelligence that values such things, or to build one that values human welfare, moral goodness, or any other complex purpose its designers might want it to serve. But it is no less possible—and in fact technically a lot easier—to build a superintelligence that places final value on nothing but calculating the decimal expansion of pi. This suggests that—absent a special effort—the first superintelligence may have some such random or reductionistic final goal.
My interpretation is that Bostrom is trying to be reasonably precise here, and to say something like:
1. You might have “blithely assumed” that things would necessarily be fine, but orthogonality. (Again, extremely obvious.)
2. Also, it (separately) seems to me (Bostrom) to be technically easier to get your AI to have a simple goal, which implies that random goals might be more likely.
I think you disagree with point (2) here (and I disagree with it as well), but this seems different from the claim you made. (I didn’t bother looking for Bostrom’s arguments for (2), but I expect them to be weak and easily defeated, at least ex post.)
To be clear, I can see where you’re coming from, but I think Bostrom tries to avoid this fallacy. That said, it would have been considerably better if he had explicitly called out the fallacy and disclaimed it, so I think he should be partially blamed for likely misinterpretations.