I’d expect a “narrow AI” that’s capable enough to destroy humanity to be versed in enough domains to qualify as goal-directed. (Here “having a goal” refers to a tendency to act consequentialistically in a wide variety of domains. That seems to be essentially the same thing as “being competent”: spelling out “acting consequentialistically” requires a notion of “competence”, and notions of “competence” seem to refer to successful goal-achievement given some goals.)
Just being versed in nanotech could be enough. Or exotic physics. Or any number of other narrow domains.
Could be, but then it’s not particularly plausible that it would still naturally qualify as an “AI-caused catastrophe”, rather than primarily as nanotech/a physics experiment/tools going wrong, with a bit of AI facilitating the catastrophe.
(I’m interested in what you think about the AGI competence=goals thesis. To me this seems to dissolve the question and I’m curious if I’m missing the point.)
That doesn’t sound right. What if I save people on Mondays and kill people on Tuesdays, being very competent at both? You could probably stretch the definition of “goal” to explain such behavior, but it seems easier to say that competence is just competence.
“Characterize”, not “explain”. This defines (idealized) goals given behavior; it doesn’t explain the behavior. The (detailed) behavior (together with the goals) is perhaps explained by evolution or by the designer’s intent (or error), but how the evolution (or design) happened is a question distinct from what the agent’s own goal is.
Saying that something is goal-directed seems to name an ordinary fuzzy category, like “heavy things”. Associated with it are “quantitative” ideas of a particular goal and of the optimality of its achievement (just as “heavy” is associated with a particular weight).
This could be a goal: maximization of (people Monday-saved + people Tuesday-killed). If resting and preparing the previous day helps, you might opt to specialize in Tuesday-killing, but still Monday-save someone if that happens to be convenient, and so on…
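For concreteness, one way to write such a time-dependent goal as a utility function (a minimal sketch; the symbols $U$, $S_t$, $K_t$ and the indicator $\mathbf{1}[\cdot]$ are my own notation, not anything from the discussion):

$$U \;=\; \sum_{t} \Big( \mathbf{1}[\mathrm{day}(t)=\text{Monday}]\, S_t \;+\; \mathbf{1}[\mathrm{day}(t)=\text{Tuesday}]\, K_t \Big)$$

where $S_t$ is the number of people saved and $K_t$ the number killed on day $t$. An agent competently maximizing $U$ only looks inconsistent if one insists that goals must be invariant in time.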
I think this only sounds strange because humans don’t have any temporal terminal values, and so there is an implicit moral axiom of invariance in time. It’s plausible we could’ve evolved something associated with time of day, for example. (It’s possible we actually do have time-dependent values associated with temporal discounting.)
I don’t believe this is the case. I need to use temporal terminal values to model the preferences that I seem to have.
If you are not talking about temporal discounting (which I mentioned), then as your comment stands I can only see that there is disagreement, but I don’t understand why. (Values I can think of whose expression is plausibly time-dependent seem to be better explained in terms of context.)
Yes, this is the most obvious one. I’m not sure if there are others. I would not have mentioned this if I had noticed your caveat.
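As a concrete form of the temporal discounting referred to above (a sketch under the usual exponential-discounting assumption; $\gamma$ and $u$ are assumed notation, not taken from the discussion):

$$U(t_0) \;=\; \sum_{t \ge t_0} \gamma^{\,t - t_0}\, u(x_t), \qquad 0 < \gamma < 1,$$

so the weight given to the same outcome $x_t$ depends on the time $t_0$ of evaluation, which is the sense in which discounting can be read as a time-dependent value.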