In fact it seems that the linked argument relies on a version of the orthogonality thesis instead of being refuted by it:
> **For almost any ultimate goal—joy, truth, God, intelligence, freedom, law**—it would be possible to do it better (or faster or more thoroughly or to a larger population) given superintelligence (or nanotechnology or galactic colonization or Apotheosis or surviving the next twenty years).
Nothing about the argument contradicts “the true meaning of life”—which seems in that argument to be effectively defined as “whatever the AI ends up with as a goal if it starts out without a goal”—being e.g. paperclips.
> In fact it seems that the linked argument relies on a version of the orthogonality thesis instead of being refuted by it:
The quoted section seems more like instrumental convergence than orthogonality to me?
> Nothing about the argument contradicts “the true meaning of life”—which seems in that argument to be effectively defined as “whatever the AI ends up with as a goal if it starts out without a goal”—being e.g. paperclips.
In a sense, that’s its flaw: it’s supposed to be an argument that building a superintelligence is desirable because it will let you achieve the meaning of life, but since nothing contradicts “the meaning of life” being paperclips, you can substitute “convert the world into paperclips” into the argument without losing validity. Yet the argument that we should build a superintelligence because it lets us convert the world into paperclips is of course wrong, so one can go back and say that the original argument was wrong too.
But in order to accept that, one needs to accept the orthogonality thesis. If one doesn’t consider “maximize the number of charged-up batteries” to be a sufficiently plausible outcome of a superintelligence that it’s even worth consideration, then one is going to be stuck in this sort of reasoning.
> The quoted section seems more like instrumental convergence than orthogonality to me?
The second part of the sentence, yes. The bolded one seems to acknowledge AIs can have different goals, and I assume that version of EY wouldn’t count “God” as a good goal.
Another more relevant part:
> Obviously, if the AI is going to be capable of making choices, you need to create an exception to the rules—create a Goal object whose desirability is not calculated by summing up the goals in the justification slot.
Presumably this goal object can be anything.
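To make the structure in that quote concrete, here is a minimal sketch of the kind of goal system being described (my illustration, not EY’s actual code; the class and attribute names are hypothetical): an ordinary Goal derives its desirability by summing over the goals in its justification slot, while the exceptional top-level Goal simply has a desirability assigned to it, and nothing in the structure constrains what that top-level goal is.

```python
# Hypothetical sketch of the goal structure described in the quote,
# not EY's actual design. Names are illustrative.

class Goal:
    def __init__(self, description, justification=None, intrinsic_desirability=None):
        self.description = description
        # "justification slot": the parent goals this goal is supposed to serve
        self.justification = justification or []
        # set only for the exceptional top-level goal
        self.intrinsic_desirability = intrinsic_desirability

    def desirability(self):
        # The exception: a root goal's desirability is not calculated,
        # it is whatever was put there...
        if self.intrinsic_desirability is not None:
            return self.intrinsic_desirability
        # ...otherwise desirability is summed up from the justification slot.
        return sum(parent.desirability() for parent in self.justification)


# The point: the structure works the same whatever the root goal is.
meaning_of_life = Goal("figure out the true meaning of life", intrinsic_desirability=1.0)
paperclips = Goal("convert the world into paperclips", intrinsic_desirability=1.0)
build_si = Goal("build a superintelligence", justification=[paperclips])
print(build_si.desirability())  # 1.0 either way
```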
> But in order to accept that, one needs to accept the orthogonality thesis.
I agree that EY rejected the argument because he accepted OT. I very much disagree that this is the only way to reject the argument. In fact, all four positions seem quite possible:
1. Accept OT, accept the argument: sure, AIs can have different goals, but this (starting an AI without explicit goals) is how you get an AI which would figure out the meaning of life.
2. Reject OT, reject the argument: you can think “figure out the meaning of life” is not a possible AI goal.
3. and 4. EY’s positions at different times (accept OT and reject the argument; reject OT and accept the argument).
In addition, OT can itself be a reason to charge ahead with creating an AGI: since it says an AGI can have any goal, you “just” need to create an AGI which will improve the world. It says nothing about setting an AGI’s goal being difficult.