One of the many reasons that I will win my bet with Eliezer is that it is impossible for an AI to understand itself.
That is just… trivially false.
If it could, it would be able to predict its own actions, and this is a logical contradiction, just as it is for us.
And that is the worst reasoning I have encountered in at least a week. Not only is it trying to foist a nonsensical definition of ‘understand’, an AI could predict its own actions. AND even if it couldn’t, it still wouldn’t be a logical contradiction. It’d just be a fact.
An AI could not predict its own actions, because any intelligent agent is quite capable of implementing the algorithm: “Take the predictor’s predicted action. Do the opposite.”
In order to predict itself (with 100% accuracy), it would have to be able to emulate its own programming, and this would cause a never-ending loop. Thus this is impossible.
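As a minimal Python sketch of the quoted “do the opposite” algorithm (function names are invented for illustration): any concrete predictor ends up falsified, and a predictor that works by literally emulating the agent runs into the never-ending loop described above.

```python
# Toy sketch of the "do the opposite" agent (names invented for the example).
def contrarian(predict_my_move):
    """Ask the predictor what I will do, then do the opposite."""
    return not predict_my_move(contrarian)

def fixed_predictor(agent):
    return True                      # predicts the agent will output True

print(contrarian(fixed_predictor))   # False -- the announced prediction fails

def emulating_predictor(agent):
    # Predicts by running the agent, which consults the predictor, which runs
    # the agent, ... Python surfaces the never-ending loop as a RecursionError.
    return agent(emulating_predictor)

try:
    contrarian(emulating_predictor)
except RecursionError:
    print("self-emulation never terminates")
```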
An AI could not predict its own actions, because any intelligent agent is quite capable of implementing the algorithm: “Take the predictor’s predicted action. Do the opposite.”
Ok. And why would your AI decide to do so? You seem to be showing that a sufficiently pathological AI won’t be able to predict its own actions. How this shows that other AIs won’t be able to predict their own actions with some degree of certainty seems off.
This isn’t pathological. For example, it is a logical contradiction for someone to predict my actions in advance (and tell me about it), because my “programming” will lead me to do something else, much like the above algorithm. This is a feature, not a bug. Being able to be predicted is a great weakness. Any well-programmed AI will avoid this weakness, just as we do.
Being able to be predicted is absolutely vital for making credible threats and promises. And, along with being able to predict accurately, it allows for cooperation with other rational agents.
There appears to be a lot of logic happening implicitly here, because I’m not following you.
You wrote:
An AI could not predict its own actions, because any intelligent agent is quite capable of implementing the algorithm: “Take the predictor’s predicted action. Do the opposite.”
Now, this seems like a very narrow sort of AI, one that would go and do something else just to contradict what was predicted.
For example, it is a logical contradiction for someone to predict my actions in advance (and tell me about it), because my “programming” will lead me to do something else, much like the above algorithm.
You seem to be using “logical contradiction” in a non-standard fashion. Do you mean it won’t happen given how your mind operates? In that case, permit me to make a few predictions about your actions over the next 48 hours (that you could probably predict also): 1) You will sleep at some point in that time period. 2) You will eat at some point in that time period. I make both of those with probability around .98 each. If we extend to one month, I’m willing to make a similarly confident prediction that you will make a phone call or check your email within that time. I’m pretty sure you are not going to go out of your way as a result of these predictions to try to go do something else.
You also seem to be missing the point about what an AI would actually need to improve. Say, for example, that the AI has a subroutine for factoring integers. If it comes up with a better algorithm for factoring integers, it can replace the subroutine with the new one. It doesn’t need to think deeply about how this will alter behavior.
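A minimal Python sketch of this subroutine-swap point (the routine names are invented for the example): the improvement only has to preserve the subroutine’s interface, not come with a prediction of the whole system’s future behavior.

```python
# Toy sketch: swap a slow factoring subroutine for a faster one, same interface.
def factor_trial_division(n):
    """Prime factorization by plain trial division."""
    factors, d = [], 2
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        factors.append(n)
    return factors

def factor_skip_evens(n):
    """A faster drop-in replacement: strip factors of 2, then try only odd divisors."""
    factors = []
    while n % 2 == 0:
        factors.append(2)
        n //= 2
    d = 3
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 2
    if n > 1:
        factors.append(n)
    return factors

factor = factor_trial_division
assert factor(840) == [2, 2, 2, 3, 5, 7]
factor = factor_skip_evens                 # the "self-improvement" step: swap the subroutine
assert factor(840) == [2, 2, 2, 3, 5, 7]   # same interface, same answers, no self-prediction needed
```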
I agree with those predictions. However, my point would become clear if you attempted to translate your probability of 0.98 into a bet with me, with me betting $100 and you betting $5000. I would surely win the bet (with at least a probability of 0.98).
I am willing to bet, at 10,000 to 1 odds, that you will sleep sometime in the next 2 weeks. The payout on this bet is not transferable to your heirs.
I agree with those predictions. However, my point would become clear if you attempted to translate your probability of 0.98 into a bet with me, with me betting $100 and you betting $5000. I would surely win the bet (with at least a probability of 0.98).
No, it wouldn’t, because that’s a very different situation. My probability estimate for you not eating food in a 48 hour period if you get paid $5000 when you succeed and must pay $100 if you fail is much lower. If I made the bet with some third party I’d be perfectly willing to do so as long as I had some reassurance that the third party isn’t intending to pay you a large portion of the resulting winnings if you win.
I don’t find predictability a weakness. If someone says to me, “Hey, Alicorn, I predict you’re going to eat that sandwich you’re holding,” I’m going to say, “Yes. You are exactly right. And I’m glad you are! If you were wrong, then I wouldn’t get to eat this delicious sandwich, which I want (that being why I made it and picked it up).”
Did you have some other, less general sort of predictability in mind when you made the claim that it’s a weakness?
It is only universal predictability that is a weakness.
Why? Predicting my actions doesn’t make them actions I don’t want to take. Predicting I’ll eat a sandwich if I want one doesn’t hurt me; and if others can predict that I’ll cooperate on the prisoner’s dilemma iff my opponent will cooperate iff I’ll cooperate, so much the better for all concerned.
Can you give an example of a case where being predictable would hurt someone who goes about choosing actions well in the first place? Note that, as with the PD thing above, actions are dependent on context; if the prediction changes the context, then that will already be factored into an accurate prediction.
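A toy Python sketch of the conditional-cooperation point above (the depth limit is an artificial assumption added purely so the mutual simulation terminates; it is not anyone’s proposal from the thread):

```python
# "Cooperate iff my opponent will cooperate iff I'll cooperate", via a
# depth-limited simulation of the opponent.
def conditional_cooperator(opponent, depth=3):
    if depth == 0:
        return "C"                      # optimistic base case for the simulation
    # Cooperate exactly when a (shallower) simulation of the opponent,
    # facing this very strategy, would cooperate.
    return "C" if opponent(conditional_cooperator, depth - 1) == "C" else "D"

def always_defect(opponent, depth=3):
    return "D"

print(conditional_cooperator(conditional_cooperator))  # "C" -- predictability enables mutual cooperation
print(conditional_cooperator(always_defect))           # "D" -- but a defector cannot exploit it
```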
Can you give an example of a case where being predictable would hurt someone who goes about choosing actions well in the first place?
Good question. Your intuition is correct as long as your actions are chosen “optimally” in the game-theoretic sense. This is one of the ideas behind Nash equilibria: your opponent can’t gain anything from knowing your strategy and vice versa. A caveat is that the Nash equilibria of many games require “mixed strategies” with unpredictable randomizing, so if the opponent can predict the output of your random device, you’re in trouble.
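A toy matching-pennies simulation makes the caveat concrete (this particular setup is an illustration, not something from the discussion): an opponent who merely knows the 50/50 strategy gains nothing, while an opponent who can replay the random device wins every round.

```python
import random

# Matching pennies: the opponent wins a round when the two coins match.
ROUNDS = 10_000
my_rng = random.Random(42)
cloned_rng = random.Random(42)   # assume the opponent somehow copied my random device

matches_knowing_strategy = 0
matches_predicting_device = 0
for _ in range(ROUNDS):
    mine = my_rng.choice("HT")
    matches_knowing_strategy += (mine == "H")                       # fixed reply by someone who only knows the 50/50 mix
    matches_predicting_device += (mine == cloned_rng.choice("HT"))  # reply by someone who replays my randomness

print(matches_knowing_strategy / ROUNDS)    # ~0.5 : no edge from knowing the strategy
print(matches_predicting_device / ROUNDS)   #  1.0 : total edge from predicting the device
```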
If you can accurately predict a chess player’s move before they make it, then you have more time to think about your response. There are cases where this can make a difference—even if they happen to play perfectly.
Alicorn, your note about the PD implies that it is universally the case that there is some action that will benefit you even if others predict it. There is no reason to think that this is the case; and if there is even one instance where doing what others predict you will do is harmful, then being universally predictable is a weakness.
For example, it is a logical contradiction for someone to predict my actions in advance (and tell me about it),
Again, this is not a logical contradiction. You do not have a clear understanding of what the concept entails. It doesn’t mean ‘sometimes impractical’ or ‘often people adapt to avoid it’.
No, this really would be a logical contradiction if the agent being predicted does implement the stated algorithm (and won’t override it when something more important is at stake). It just has nothing to do with self-improvement, for which predicting abstract properties of specific algorithms is what matters; much like Rice’s theorem doesn’t mean we can’t prove that specific programs output pi, for example.
No, this really would be a logical contradiction if the agent being predicted does implement the stated algorithm
No, it is not a logical contradiction. The fact that someone can implement a stupid algorithm does not make the claim “it is a logical contradiction for someone to predict my actions in advance and tell me about it” true. Just because someone could implement a stupid algorithm for decision making, or a naive algorithm for prediction that doesn’t know when to shut up, doesn’t mean you can make that general claim. Not even close.
Your argument would probably apply if I were refuting a different but somewhat related assertion.
No, it is not a logical contradiction. The fact that someone can implement a stupid algorithm does not make the claim “it is a logical contradiction for someone to predict my actions in advance and tell me about it” true. Just because someone could implement a stupid algorithm for decision making, or a naive algorithm for prediction that doesn’t know when to shut up, doesn’t mean you can make that general claim. Not even close.
It does mean you can make a general claim analogous to Rice’s theorem / the undecidability of the halting problem — not that such a claim is incredibly interesting for our purposes.
Your argument would probably apply if I were refuting a different but somewhat related assertion.
Point taken; it doesn’t seem like we actually disagree about anything.
It does mean you can make a general claim analogous to Rice’s theorem / the undecidability of the halting problem — not that such a claim is incredibly interesting for our purposes.
The cache of this conversation is buried somewhat in my brain, but I think there is something to what you say here.
But an AI with that programming is predictable, and, much worse, manipulable! In order to get it to do anything, you need only inform it that you predicted that it will not do that thing*. It’s just a question of how long it takes people to realize that it has this behavior. It is far weaker than an AI that sometimes behaves as predicted and sometimes does not. Consider, e.g., Alicorn’s sandwich example; if we imagine an AI that needed to eat (a silly idea, but it demonstrates the point), you don’t want it to refuse to do so simply because someone predicted it will (which anyone easily could).
*This raises the question of whether the AI will realize that in fact you are secretly predicting that it will do the opposite. But once you consider that the AI then has to keep track of probabilities of what people’s true (rather than merely claimed) predictions are, I think it becomes clear that this is just a silly thing to be implementing in the first place. Especially because even if people didn’t go up to it and say “I bet you’re going to try to keep yourself alive”, they would still be implicitly predicting it by expecting it.
But once you have played a couple of games of ‘paper, scissors, rock’, I think it becomes clear that this is just a silly thing to be implementing in the first place.
Yes, that as well. Such an AI would, it seems offhand, be playing a perpetual game of Poisoned Chalice Switcheroo to no real end.