One reason we did not go travelling might have been a resource constraint, perhaps money; but a limited ability to plan good trips, because of distraction or lack of knowledge, should also be counted as a limitation of planning resources.
That aside, people still have multiple drives which are not really goals, and we sort of compromise amongst these drives. The approach the mind takes is not always the best.
In people, it’s really those mid-brain drives that run a lot of things, not intellect.
We could try to carefully program lower-level or more complex sets of “drives” into an AI. The “utility function” people speak of in these threads is really more like a single, incredibly overpowering drive for the AI.
If it is wrong, then there is no hedge, check or diversification. The AI will just pursue that drive.
As much as our drives often take our minds in the wrong direction, at least those drives are diversified and checked.
Checks and diversification of drives seem like an appealing element of mind design, even at significant cost to efficiency at achieving goals. We should explore these options in detail. In humans, goal drift may work as a hedging mechanism.
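To make the idea concrete, here is a minimal, purely illustrative sketch (in Python, with invented drive names, weights and thresholds) of an agent that compromises among several diversified drives, any one of which can veto an action it scores as unacceptable, rather than maximizing a single collapsed utility:

```python
# Illustrative sketch only: a toy agent that compromises among several
# diversified "drives" and lets any one drive veto an action it finds
# unacceptable, instead of maximizing one collapsed utility number.
# All drive names, weights, and thresholds here are invented for illustration.

VETO_THRESHOLD = -0.8  # a drive scoring an action below this blocks it

def curiosity(action):        # toy scoring functions, roughly in [-1, 1]
    return action.get("novelty", 0.0)

def caution(action):
    return -action.get("risk", 0.0)

def social_drive(action):
    return action.get("approval", 0.0)

DRIVES = [(curiosity, 0.4), (caution, 0.4), (social_drive, 0.2)]

def choose_action(candidate_actions):
    """Pick the best compromise action that no drive vetoes; else do nothing."""
    best, best_score = None, float("-inf")
    for action in candidate_actions:
        scores = [drive(action) for drive, _ in DRIVES]
        if min(scores) < VETO_THRESHOLD:       # any single drive can check the others
            continue
        compromise = sum(w * s for (_, w), s in zip(DRIVES, scores))
        if compromise > best_score:
            best, best_score = action, compromise
    return best  # None means "no acceptable action", i.e. a safe default

if __name__ == "__main__":
    actions = [
        {"name": "explore", "novelty": 0.9, "risk": 0.9, "approval": 0.1},
        {"name": "ask",     "novelty": 0.5, "risk": 0.1, "approval": 0.6},
    ]
    print(choose_action(actions)["name"])  # -> "ask": the high-risk option is vetoed
```

Nothing about this sketch is optimal, and that is the point: the diversification and the veto are deliberately bought at some cost to efficiency.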
But I don’t think “utility function” in the context of this post has to mean a numerical utility explicitly computed in the code.
It could just mean that the agent behaves as if its utilities were given by a particular numerical function, regardless of whether that function is written down anywhere.
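For what it’s worth, here is the standard formal gloss of that “as-if” reading (my notation, not anything from the original post): the claim is only that the agent’s observed choices over actions and outcomes can be summarized by some real-valued function u, whether or not any such u is explicitly computed anywhere.

```latex
% "As-if" utility: the agent's choices are representable by some u,
% even if u never appears explicitly in the agent's code.
a^{*} \in \arg\max_{a \in \mathcal{A}} \; \mathbb{E}\left[\, u(o) \mid a \,\right],
\qquad u : \mathcal{O} \to \mathbb{R}
```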
People do not behave as if their utilities were given by a particular numerical function that collapses all of their hopes and goals into one number, and machines need not work that way either.
Often when we act, we end up something like 25% short of the optimal solution; but we have been hypothesizing systems with huge amounts of computing power.
If they frequently end up 25% or even 80% short of behaving optimally, so what? In exchange for an AGI that stays under control, we should be willing to make that trade-off.
In fact, if their efficiency falls by 95%, they are still wildly powerful.
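A back-of-the-envelope illustration of that last point (the 10^4 capability multiplier is an assumption invented purely for the arithmetic, not a figure from the book):

```latex
% Illustrative arithmetic only; the 10^4 multiplier is assumed.
0.05 \times 10^{4} = 5 \times 10^{2}
```

So even at 5% of its nominal efficiency, such a system would still be hundreds of times more capable than the human baseline, which is the sense in which a heavily checked system remains wildly powerful.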
Eliezer and Bostrom have identified a variety of difficulties with AGIs that can be thought of as collapsing all of their goals into a single utility function.
Why not also think about making other kinds of systems?
An AGI could have a vast array of hedges, controls, limitations, conflicting tendencies and tropisms which frequently cancel each other out and prevent dangerous action.
The book does scratch the surface on these issues, but it is not all about fail-safe mind design and managed roll-out. We can develop a whole literature on those topics.
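As one hedged sketch of what “another kind of system” might look like (the control names and limits below are invented placeholders, not a proposal from the book), a planner’s preferred action could be executed only if it passes a stack of independent controls, with inaction as the default whenever any control objects:

```python
# Illustrative sketch: layer independent controls between a planner and the
# world, defaulting to a no-op when any control objects. The controls and
# limits here are invented placeholders, not anything from Bostrom's book.

NO_OP = {"name": "do_nothing"}

def within_resource_limit(action):
    return action.get("resources", 0) <= 10      # hard cap (arbitrary value)

def reversible(action):
    return action.get("reversible", False)       # prefer undoable steps

def human_approved(action):
    return action.get("approved", False)         # external sign-off required

CONTROLS = [within_resource_limit, reversible, human_approved]

def guarded_execute(planner_choice):
    """Run the planner's choice only if every independent control passes."""
    if all(check(planner_choice) for check in CONTROLS):
        return planner_choice
    return NO_OP   # conflicting checks cancel out the risky action

if __name__ == "__main__":
    risky = {"name": "acquire_resources", "resources": 99,
             "reversible": False, "approved": False}
    print(guarded_execute(risky)["name"])   # -> "do_nothing"
```

The point of the sketch is structural: no single number decides; several independent limitations have to agree before anything happens.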
People do not behave as if their utilities were given by a particular numerical function that collapses all of their hopes and goals into one number, and machines need not work that way either.
I think this point is well said, and completely correct.
Why not also think about making other kinds of systems?
An AGI could have a vast array of hedges, controls, limitations, conflicting tendencies and tropisms which frequently cancel each other out and prevent dangerous action.
The book does scratch the surface on these issues, but it is not all about fail-safe mind design and managed roll-out. We can develop a whole literature on those topics.
I agree. I find myself continually wanting to bring up issues in that latter class, so often that it frequently feels as though I am trying to redesign the forum topic, and I have deleted numerous posts-in-progress that fall into that category. I guess those of us whose ideas about fail-safe mind design are more subtle (or, to put it more neutrally, do not fit the running paradigm, in which the universe of discourse is transparent, low-dimensional utility functions, low-dimensional in the function’s range rather than its domain) need to start writing our own white papers.
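To unpack that parenthetical in plain notation (my gloss of the point, not something from the book): the complaint is about objectives whose range is one-dimensional, however large the domain, in contrast with vector-valued objectives that keep separate criteria separate.

```latex
% "Collapsed" objective: scalar range, regardless of how rich the domain X is.
U_{\mathrm{scalar}} : \mathcal{X} \to \mathbb{R}

% Non-collapsed alternative: k distinct criteria never forced into one number.
U_{\mathrm{vector}} : \mathcal{X} \to \mathbb{R}^{k}, \qquad k > 1
```

A scalar objective settles every trade-off in advance with a single number; a vector-valued one leaves room for the checks, vetoes and compromises discussed above.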
When I hear that Bostrom claims only seven people in the world are thinking full-time and productively about (in essence) fail-safe mind design, or that someone at MIRI wrote that only FIVE people are doing so (though in the latter case, the author of that remark did say that there might be others doing this kind of work “on the margin”, whatever that means), I am shocked.
It’s hard to believe, for one thing, though the people making those statements must have good reasons for saying so.
But perhaps the derivation of such low numbers becomes more understandable if one stipulates that “work on the problem” counts if and only if the candidate belongs to the equivalence class of thinkers who restrict their approach to this ONE very narrow conceptual and computational vocabulary.
That kind of utility-function-based discussion (remember when these were called “heuristics” in the assigned projects in our first AI courses?) has its value, but it is a tiny slice of the possible conceptual, logical and design pie, about like looking at the night sky through a soda straw. If we restrict ourselves to such approaches, no wonder people think it will take 50 or 100 years to do AI of interest.
Outside of the culture of collapsing utility functions and the like, I see lots of smart (often highly mathematical, so they count as serious) papers on whole-brain chaotic resonant neurodynamics, and new approaches to the foundations of mental health issues and disorders of subjective empathy (even some applications of deviant neurodynamics to deviant cohort value theory and to defective cohort “theory of mind”, in the neuropsychiatric and mirror-neuron sense) that are grounded in, say, pathologies of transient Default Mode Network coupling and disturbances of phase-coupled equilibria across the brain.
If we run out of our own ideas to use from scratch (which I don’t think is at all the case … as your post might suggest, we have barely scratched the surface), then we can go have a look at current neurology and neurobiology, where people are not at all shy about looking for “information processing” mechanisms underlying complex personality traits, even underlying value and aesthetic judgements.
I saw a visual-system neuroscientist’s paper the other day offering a theory of why abstract (i.e., non-representational) art is so intriguing to (not all, but some) human brains. It was a multi-layered paper, discussing some transiently coupled neurodynamical mechanisms of vision (the authors’ specialty), some reward-system neuromodulator concepts, and some traditional concepts expressed at a phenomenological, psychological level of description. An ambitious paper, yes!
But ambition is good. I keep saying, we can’t expect to do real AI on the cheap.
A few hours or days spent reading such papers is good fertilizer, even if we do not seek to translate wetware brain research in any direct way (such as copying “algorithms” from natural brains) into our goal, which presumably is dryware mind design, done in a way where we choose our own functional limits rather than letting nature’s 4.5 billion years of accidents choose the boundary conditions of the substrate platform for us.
Of course, not everyone is interested in doing this. I HAVE learned in this forum that “AI” is a “big tent”: lots of uses exist for narrow AI in thousands of industries and fields, and thousands of narrow AI systems are already in play.
But, really… aren’t most of us interested in this topic because we want the more ambitious result?
Bostrom says “we will not be concerned with the metaphysics of mind...” and “...not concern ourselves whether these entities have genuine self-awareness....”
Well, I guess we won’t be BUILDING real minds anytime soon, then. One can hardly expect to create that which one won’t even openly discuss. Bostrom is writing and speaking using the language of “agency” and “goals” and “motivational sets”, but he is only using those terms metaphorically.
Unless, that is, everyone else here (other than me) is actually prepared to deny that we, who coined those concepts to describe the rich, conscious, intentionally entrained features of the lives of self-aware, genuinely conscious creatures, are any different, i.e., to deny that we are conscious and self-aware.
No one here needs a lesson in intellectual history. We all know that people did deny exactly that, back in the behaviorism era. (I have studied the reasons, philosophical and cultural, and continue to uncover in great detail the mistaken assumptions out of which that intellectual fad grew.)
Only if we do THAT again will we NOT be using “agent” metaphorically when we apply it to machines with no real consciousness, because ex hypothesi WE would possess no minds either, in the sense we all know we do possess them as conscious humans.
We’d THEN be using it (“agent”, “goal”, “motive”, the whole equivalence class of related nouns and predicates) in the same sense for both classes of entities: ourselves, and machines with no “awareness”, where awareness is defined as anything other than public, third-person observable behavior.
Only in this case would it not be a metaphor to use “agent”, “motive”, etc., in describing intelligent (but not conscious) machines, which evidently is the astringent conceptual model within which Bostrom wishes to frame HLAI, proscribing, as he does, consideration of whether such machines are genuinely self-aware.
But, well, I always thought that that excessively positivistic attitude had more than a little to do with the “AI winter” (just as it is widely acknowledged to have been responsible for the neuroscience winter that paralleled it).
Yet neuroscientists are not embarrassed now to say, “That was a MISTAKE, and, fortunately, we are over it. We wasted some good years, but we are no longer wasting time denying the existence of consciousness, the very thing that makes the brain interesting and so full of fundamental scientific interest. And now the race is on to understand how the brain creates real mental states.”
NEUROSCIENCE, clearly, has gotten over its problem with discussing mental states qua mental states.
And this is one of the most striking about-faces in the modern intellectual history of science.
So, back to us. What’s wrong with computer science? Either AI-ers KNOW that real consciousness exists, just as neuroscientists do, and they simply don’t give a hoot about making machines that are actually conscious.
Or AI-ers are afraid of tackling a problem that is a little more interesting, deeper, and harder (a challenge that gets thousands of neuroscientists and neurophilosophers up in the morning).
I hope the latter is not true, because I think the depth and possibilities of the real thing, AI with consciousness, are what give it all its attraction (and hold, in the end, for reasons I won’t attempt to describe in a short post, the only possibility of making these things friendly, if not beneficent).
Isn’t that what gives AI its real interest? Otherwise, why not just write business software?
Could it be that Bostrom is throwing out the baby with the bathwater when he stipulates that the discussion, as he frames it, can be had (and meaningful progress made) without the interlocutors (us) being concerned about whether AIs have genuine self-awareness, etc.?