I think there are two parts that together make one important point.
The first part is that an intelligence can have any goal (a.k.a. the orthogonality thesis).
The second part is that most arbitrarily selected goals are bad. (Not “bad from the selfish short-sighted perspective of the puny human, but good from the viewpoint of a superior intellect”, but bad in the same sense that randomly rearranging the atoms in your body could hypothetically cure your cancer, but will almost certainly kill you instead.)
The point is, as a first approximation, “don’t build a super-intelligence with random goals, expecting that as it gets smart enough it will spontaneously converge towards good”.
If you have a way to reliably specify good goals, then you obviously don’t have to worry about the orthogonality thesis. As far as I know, we don’t have such a way now.
*
I wrote this before seeing the actual tweets. Now that I have seen them, it seems to me that Yudkowsky basically agrees with my interpretation of him, which is based on things he wrote a decade ago, so at least the accusation of “rewriting history” is false.
(The objection about “trivial, false, or unintelligible”, uhm, seems like it could just as well be made about e.g. natural selection. “What survives, survives” is trivial, if you don’t also consider mutations. So do we have a motte and bailey of natural selection, where the motte just assumes selection without discussing the mechanism of mutations, and the bailey is the version that includes mutations? Does this constitute a valid argument against evolution?)
I agree these two things together make an important point, but I think the orthogonality thesis by itself also makes an important point as it disproves E. Yudkowsky’s style of argument in the doc I linked to in the OP.