Eliezer Yudkowsky comments on Decision Theory FAQ

Eliezer Yudkowsky 14 Mar 2013 17:41 UTC
16 points
To slightly expand, if an intelligence is not prohibited from the following epistemic feats:

1) Be good at predicting which hypothetical actions would lead to how many paperclips, as a question of pure fact.

2) Be good at searching out possible plans which would lead to unusually high numbers of paperclips—answering the purely epistemic search question, “What sort of plan would lead to many paperclips existing, if someone followed it?”

3) Be good at predicting and searching out which possible minds would, if constructed, be good at (1), (2), and (3) as purely epistemic feats.

Then we can hook up this epistemic capability to a motor output and away it goes. You cannot defeat the Orthogonality Thesis without prohibiting superintelligences from accomplishing 1-3 as purely epistemic feats. They must be unable to know the answers to these questions of fact.
- Stuart_Armstrong 15 Mar 2013 14:06 UTC
  0 points
  Parent
  A nice rephrasing of the “no Oracle” argument.
  - Eliezer Yudkowsky 15 Mar 2013 18:20 UTC
    6 points
    Parent
    Only in the sense that any working Oracle can be trivially transformed into a Genie. The argument doesn’t say that it’s difficult to construct a non-Genie Oracle and use it as an Oracle if that’s what you want; the difficulty there is for other reasons.
    
    Nick Bostrom takes Oracles seriously so I dust off the concept every year and take another look at it. It’s been looking slightly more solvable lately, I’m not sure if it would be solvable enough even assuming the trend continued.
    - Stuart_Armstrong 18 Mar 2013 10:19 UTC
      2 points
      Parent
      A clarification: my point was that denying orthogonality requires denying the possibility of Oracles being constructed; your post seemed a rephrasing of that general idea (that once you can have a machine that can solve some things abstractly, then you need just connect that abstract ability to some implementation module).
      - Eliezer Yudkowsky 18 Mar 2013 19:56 UTC
        4 points
        Parent
        Ah. K. It does seem to me like “you can construct it as an Oracle and then turn it into an arbitrary Genie” sounds weaker than “denying the Orthogonality thesis means superintelligences cannot know 1, 2, and 3.” The sort of person who denies OT is liable to deny Oracle construction because the Oracle itself would be converted unto the true morality, but find it much more counterintuitive that an SI could not know something. Also we want to focus on the general shortness of the gap from epistemic knowledge to a working agent.
        Stuart_Armstrong 19 Mar 2013 11:09 UTC
        0 points
        Parent
        Possibly. I think your argument needs to be a bit developed to show that one can extract the knowledge usefully, which is not a trivial statement for general AI. So your argument is better in the end, but needs more argument to establish.
- whowhowho 14 Mar 2013 18:32 UTC
  −4 points
  Parent
  
  You cannot defeat the Orthogonality Thesis without prohibiting superintelligences from accomplishing 1-3 as purely epistemic feats.
  
  I don’t see the significance of “purely epistemic”. I have argued that epistemic rationality could be capable of affecting values, breaking the orthogonality between values and rationality. I could further argue that instrumental rationality bleeds into epistemic rationality. An agent can’t have perfect knowledge of apriori which things are going to be instrumentally useful to it, so it has to star by understanding things, and then posing the question: is that thing useful for my purposes? Epistemic rationality comes first, in a sense. A good instrumental rationalist has to be a good epistemic rationalist.
  
  What the Orthoganilty Thesis needs is an argument to the effect that a SuperIntelligence would be able to to endlessly update without ever changing its value system, even accidentally. That is tricky since it effectively means predicting what smarter version of tiself would do. Making it smarted doesn’t help, because it is still faced with the problem of predicting what an even smarterer version of itself would be .. the carrot remains in front of the donkey.
  
  Assuming that the value stability problem has been solved in general gives you are coherent Clippy, but it doesn’t rescue the Orthogonality Thesis as a claim about rationality in general, sin ce it remains the case that most most agents won’t have firewalled values. If have to engineer something in , it isn’t an intrinsic truth.