I don’t think this post really has much to do with the “orthogonality thesis”, as I understand the term. The orthogonality thesis says:
(A) It’s possible for there to be an arbitrarily capable agent whose overriding priority is maximizing the number of paperclips in the distant future
(B) It’s possible for there to be an arbitrarily capable agent whose overriding priority is solving the Riemann Hypothesis
(C) It’s possible … etc.
I don’t think the orthogonality thesis requires that all these agents are identical except for different weights within a small data structure labeled “goals” in the source code, or whatever. The orthogonality thesis doesn’t require these agents to have any relation whatsoever. It’s just saying they can all exist.
Other than the (mis)use of the term “orthogonality thesis”, what do I think of the post?
From my perspective, I’d say that, holding compute fixed and assuming an approach that scales to radically superhuman AI, agent (A) will almost definitely wind up with better knowledge of metallurgy than agent (B), and agent (B) will almost definitely wind up with better knowledge of prime numbers than agent (A), even though “knowledge” is part of the world-model, not the value function or policy. This seems pretty obvious to me. I think that means I agree with the main substance of the post.
I don’t think this post really has much to do with the “orthogonality thesis”, as I understand the term.
I didn’t read the post as having much to do with the orthogonality thesis either, and hence I made no mention of the orthogonality thesis in my summary.
I nonetheless do think it's useful to think in terms of a spectrum, from systems that are cleanly factored into objectives/goals/values, world models, and reasoners/planners, to systems where these components are all intertwined.
And the post correctly identifies the relevant considerations, tradeoffs, etc. I found it well worth reading.
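To make the "well-factored" end of that spectrum concrete, here is a minimal sketch, not taken from the post: the goal, the world model, and the planner are separate, swappable components. All names and the toy one-step planner are illustrative assumptions, not anyone's proposed architecture.

```python
# Illustrative sketch (assumed, not from the post): a "well-factored" agent where
# values, world model, and planner are separate components with a clean seam.
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class FactoredAgent:
    # Scores how desirable a predicted state is (the "objectives/goals/values" part).
    utility: Callable[[str], float]
    # Predicts the next state given a state and an action (the "world model" part).
    world_model: Callable[[str, str], str]
    # Actions the planner searches over (the "reasoner/planner" part uses these).
    actions: List[str]

    def plan(self, state: str) -> str:
        """Toy one-step planner: pick the action whose predicted outcome scores highest."""
        return max(self.actions,
                   key=lambda a: self.utility(self.world_model(state, a)))


# Swapping only `utility` turns one agent into another while reusing the same
# world model and planner; an "intertwined" system has no such seam to swap at.
paperclip_agent = FactoredAgent(
    utility=lambda s: s.count("paperclip"),
    world_model=lambda s, a: s + " " + a,  # stand-in for a real predictive model
    actions=["make paperclip", "prove theorem"],
)
print(paperclip_agent.plan("factory floor"))  # -> "make paperclip"
```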