Ok, let me finally get around to answering this.
FAI has definite subproblems. It is not a matter of scratching away at a chalkboard hoping to make some breakthrough in “philosophy” or some other proto-sensical field that will Elucidate Everything and make the problem solvable at all. FAI, right now, is a matter of setting researchers to work on one subproblem after another until they are all solved.
In fact, when I do literature searches for FAI/AGI material, I often find that the narrow AI or machine-learning literature contains a round dozen papers nobody working explicitly on FAI has ever cited, or even appears to know about. This is my view: there is low-hanging fruit in applying existing academic knowledge to FAI problems. Where such low-hanging fruit does not exist, the major open problems can largely be addressed by recourse to higher-hanging fruit within mathematics, or even to empirical science.
Since you believe it’s all so wide-open, I’d like to know what you think of as “the FAI problem”.
If you have an Oracle AI you can trust, you can use it to solve FAI problems for you. This is a fine approach. We don’t have time to be dicking around doing basic research on whiteboards.
Luckily, we don’t need to dick around.
That’s a large portion of the FAI problem right there.
EDIT: To clarify, by this I don’t mean to imply that FAI is easy, but that (trustworthy) Oracle AI is hard.
In context, what was meant by “Oracle AI” is a very general learning algorithm with some debug output, but no actual decision-theory or utility function whatsoever built in. That would be safe, since it has no capability or desire to do anything.
You have to give it a set of directed goals and a utility function which favors achieving those goals, in order for the oracle AI to be of any use.
Why? How are you structuring your Oracle AI? This sounds like philosophical speculation, not algorithmic knowledge.
Ok, but a system like you’ve described isn’t likely to think about what you want it to think about or produce output that’s actually useful to you either.
Well yes. That’s sort of the problem with building one. Utility functions are certainly useful for specifying where logical uncertainty should be reduced.
Well, ok, but if you agree with this then I don’t see how you can claim that such a system would be particularly useful for solving FAI problems.
Well, I don’t know about the precise construction that would be used. Certainly I could see a human being deliberately focusing the system on some things rather than others.
All existing learning algorithms I know of, and I dare say all that exist, have at least a utility function, and also something that could be interpreted as a decision theory. Consider for example support vector machines, which explicitly try to maximize a margin (that would be the utility function), and any algorithm for computing SVMs can be interpreted as a decision theory. Similar considerations hold for neural networks, genetic algorithms, and even the minimax algorithm.
Thus, I strongly doubt that the notion of a learning algorithm with no utility function makes any sense.
Those are optimization criteria, but they are not decision algorithms in the sense that we usually talk about them in AI. A support vector machine is just finding the extrema of a cost function via its derivative, not planning a sequence of actions.
The most popular algorithm for SVMs does plan a sequence of actions, complete with heuristics as to which action to take. True, the “actions” are internal: they are changes to some data structure within the computer’s memory, rather than changes to the external world. But that is not so different from e.g. a chess AI, which assigns some heuristic score to chess positions and attempts to maximize it using a decision algorithm (to decide which move to make), even though the chessboard is just a data structure within the computer memory.
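To make the point being argued here concrete, here is a minimal sketch (in Python, with invented toy data) of the reading defended above: SVM training as a loop of internal “actions”, each a parameter update chosen to improve an explicit objective, none of which touches anything outside the program’s memory. It uses plain subgradient descent on the hinge loss rather than the SMO algorithm alluded to, so it illustrates the general claim, not that specific algorithm.

```python
# Minimal sketch (not SMO): a linear SVM trained by subgradient descent on the
# hinge-loss objective. The "utility function" reading: each step picks an
# internal "action" (a weight update) expected to improve the objective. The
# only thing acted on is an in-memory weight vector, never the outside world.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = np.where(X[:, 0] + X[:, 1] > 0, 1.0, -1.0)      # toy, roughly separable labels

w, b, lam, lr = np.zeros(2), 0.0, 0.01, 0.1

def objective(w, b):
    margins = y * (X @ w + b)
    return np.mean(np.maximum(0.0, 1.0 - margins)) + lam * w @ w

for step in range(500):
    margins = y * (X @ w + b)
    violators = margins < 1.0                        # points inside or beyond the margin
    if violators.any():
        grad_w = -(y[violators][:, None] * X[violators]).mean(axis=0) + 2 * lam * w
        grad_b = -y[violators].mean()
    else:
        grad_w, grad_b = 2 * lam * w, 0.0
    w, b = w - lr * grad_w, b - lr * grad_b          # the internal "action": update memory

print("objective:", round(float(objective(w, b)), 4))
print("training accuracy:", np.mean(np.sign(X @ w + b) == y))
```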
“Internal” to the “agent” is very different from having an external output to a computational system outside the “agent”. “Actions” drawn from an extremely limited, non-Turing-complete “vocabulary” (really, a programming language or computational calculus; the two are identical) are also categorically different from actions drawn from a Turing-complete calculus of possible actions.
The same distinction applies to the hypothesis class the learner can learn: if it is not Turing-complete (or some approximation thereof, like a total calculus with coinductive types and corecursive programs), then it is categorically not general learning or general decision-making.
This is why we all employ primitive classifiers every day without danger, and you need something like Solomonoff’s algorithmic probability in order to build AGI.
I agree, of course, that none of the examples I gave (“primitive classifiers”) are dangerous. Indeed, the “plans” they are capable of considering are too simple to pose any threat (they are, as you say, not Turing complete).
But that doesn’t seem relevant to the argument at all. You claimed:

a very general learning algorithm with some debug output, but no actual decision-theory or utility function whatsoever built in. That would be safe, since it has no capability or desire to do anything.
You claimed that a general learning algorithm without decision-theory or utility function is possible. I pointed out that all (harmless) practical learning algorithms we know of do in fact have decision theories and utility functions. What would “a learning algorithm without decision-theory or utility function, something that has no desire to do anything” even look like? Does the concept even make sense? Eliezer writes here:

A string of zeroes down an output line to a motorized arm is just as much an output as any other output; there is no privileged null, there is no such thing as ‘no action’ among all possible outputs. To ‘do nothing’ is just another string of English words, that would be interpreted the same as any other English words, with latitude.
/facepalm
There is in fact such a thing as a null output. There is in fact such a thing as a learner with a sub-Turing hypothesis class. Such a learner with such a primitive output as “in the class” or “not in the class” does not engage in world optimization, that is: its actions do not, to its own knowledge, skew any probability distribution over future states of any portion of the world outside itself.
It does not narrow the future.
Now, what we’ve been proposing as an Oracle is even less capable. It would truly have no outputs whatsoever, only input and a debug view. It would, by definition, be incapable of narrowing the future of anything, even its own internal states.
Perhaps I have misused terminology, but that is what I was referring to: inability to narrow the outer world’s future.
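For contrast, here is a minimal sketch of the kind of “primitive classifier” with a sub-Turing hypothesis class described above: the hypotheses are a small grid of axis-aligned threshold rules (nowhere near Turing-complete), and the learner’s entire output vocabulary is one bit, “in the class” or “not in the class”. The data and rule grid are invented purely for illustration.

```python
# Sketch of a "primitive classifier": the hypothesis class is a finite set of
# axis-aligned threshold rules, and the only output ever produced is one bit.
import numpy as np

rng = np.random.default_rng(1)
X = rng.uniform(0, 1, size=(300, 3))
y = (X[:, 2] > 0.6).astype(int)                      # toy ground truth

def fit_stump(X, y):
    """Pick the (feature, threshold, polarity) rule with the fewest training errors."""
    best = None
    for f in range(X.shape[1]):
        for t in np.linspace(0, 1, 21):
            for pol in (1, -1):
                pred = (pol * (X[:, f] - t) > 0).astype(int)
                err = np.mean(pred != y)
                if best is None or err < best[0]:
                    best = (err, f, t, pol)
    return best[1:]

f, t, pol = fit_stump(X, y)

def classify(x):
    return int(pol * (x[f] - t) > 0)                 # the entire output vocabulary: {0, 1}

print("learned rule: feature", f, "threshold", round(float(t), 2), "polarity", pol)
print("classify([0.1, 0.9, 0.7]) ->", classify(np.array([0.1, 0.9, 0.7])))
```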
This thing you are proposing, an “oracle” that is incapable of modeling itself and incapable of modeling its environment (either would require Turing-complete hypotheses), what could it possibly be useful for? What could it do that today’s narrow AI can’t?
A) It wasn’t my proposal.
B) The proposed software could model the outer environment, but not act on it.
Physics is Turing-complete, so no, a learner that did not consider Turing-complete hypotheses could not model the outer environment.
You seem to have lost the thread of the conversation. The proposal was to build a learner that can model the environment using Turing-complete models, but which has no power to make decisions or take actions. This would be a Solomonoff Inducer approximation, not an AIXI approximation.
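A toy way to see the distinction being drawn here, with a deliberately tiny hypothesis class of biased-coin models standing in for Turing-complete programs (which this sketch does not attempt): the inducer only maintains a posterior and emits predictions, while the AIXI-style agent adds exactly one extra step, an argmax over actions evaluated against that same posterior.

```python
# Toy contrast between a Solomonoff-Inducer-style learner and an AIXI-style agent.
# The "programs" here are just Bernoulli(theta) models -- a stand-in for a real
# Turing-complete hypothesis class.
import numpy as np

thetas = np.linspace(0.05, 0.95, 19)             # toy hypothesis class
prior = np.full(len(thetas), 1.0 / len(thetas))  # stand-in for a 2^-length prior

def posterior(bits):
    """Pure induction: weight each hypothesis by how well it predicts the data."""
    ones = sum(bits)
    like = np.array([th ** ones * (1 - th) ** (len(bits) - ones) for th in thetas])
    w = prior * like
    return w / w.sum()

def predict_next(bits):
    """The inducer's entire job: a probability for the next bit. No actions."""
    return float(posterior(bits) @ thetas)

def aixi_style_action(bits, actions=(0, 1), reward=lambda a, b: float(a == b)):
    """The extra step that makes it an agent: argmax of expected reward under the posterior."""
    p1 = predict_next(bits)
    expected = {a: p1 * reward(a, 1) + (1 - p1) * reward(a, 0) for a in actions}
    return max(expected, key=expected.get)

history = [1, 1, 0, 1, 1, 1]
print("inducer: P(next bit = 1) =", round(predict_next(history), 3))
print("agent:   chosen action   =", aixi_style_action(history))
```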
You said:

Now, what we’ve been proposing as an Oracle is even less capable.

which led me to think you were talking about an oracle even less capable than a learner with a sub-Turing hypothesis class.
If the hypotheses it considers are Turing-complete, then, given enough information (and someone would give it enough information, otherwise they couldn’t do anything useful with it), it could model itself, its environment, the relation between its internal states and what shows up on the debug view, and the reactions of its operators to the information they learn from that debug view. Its (internal) actions very much would, to its own knowledge, skew the probability distribution over future states of the outer world.
I often find that the narrow AI or machine-learning literature contains a round dozen papers nobody working explicitly on FAI has ever cited, or even appears to know about.

Name three. FAI contains a number of counterintuitive difficulties, and it’s unlikely for someone to do FAI work successfully by accident. On the other hand, someone with a fuzzier model believing that a paper they found sure sounds relevant (why isn’t MIRI citing it?) is far more probable from my perspective and prior.
I wouldn’t say that there’s someone out there directly solving FAI problems without having explicitly intended to do so. I would say there’s a lot we can build on.
Keep in mind, I’ve seen enough of a sample of Eld Science being stupid to understand how you can have a very low prior on Eld Science figuring out anything relevant. But lacking more problem guides from you on the delta between plain AI problems and FAI problems, we go on what we can.
One paper on utility learning that relies on a supervised-learning methodology (pairwise comparison data) rather than a de facto reinforcement-learning methodology (which can and will go wrong in well-known ways when put into an AGI). One paper on progress towards induction algorithms that operate at multiple levels of abstraction, which could be useful for naturalized induction if someone put more thought and expertise into it.
That’s only two, but I’m a comparative beginner at this stuff and Eld Science isn’t very good at focusing on our problems, so I expect that there’s actually more to discover and I’m just limited by lack of time and knowledge to do the literature searches.
By the way, I’m already trying to follow the semi-official MIRI curriculum, but if you could actually write out some material on the specific deltas where FAI work departs from the preexisting knowledge-base of academic science, that would be really helpful.
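For readers unfamiliar with the first of the two papers described above, here is a rough sketch of what supervised utility learning from pairwise comparison data can look like: a Bradley-Terry-style model fit by gradient ascent, with a linear utility over invented features. This is only a generic illustration of the methodology, not a reconstruction of that particular paper.

```python
# Sketch of utility learning from pairwise comparisons (Bradley-Terry style).
# Hypothetical setup: each option is a feature vector; a label says which of two
# options a human preferred; we fit a linear utility u(x) = w.x so that
# P(a preferred to b) = sigmoid(u(a) - u(b)).
import numpy as np

rng = np.random.default_rng(2)
true_w = np.array([2.0, -1.0, 0.5])                  # hidden "true" utility weights

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Simulated comparison data: pairs (a, b) with label 1 if a was preferred.
A = rng.normal(size=(500, 3))
B = rng.normal(size=(500, 3))
labels = (sigmoid((A - B) @ true_w) > rng.uniform(size=500)).astype(float)

w, lr = np.zeros(3), 0.5
for _ in range(2000):                                # gradient ascent on the log-likelihood
    p = sigmoid((A - B) @ w)
    grad = (A - B).T @ (labels - p) / len(labels)
    w += lr * grad

print("recovered utility weights:", np.round(w, 2))
print("true utility weights:     ", true_w)
```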
Define doing FAI work successfully....
1) Designing a program capable of arbitrary self-modification, yet maintaining guarantees of “correct” behavior according to a goal set that is by necessity included in the modifications as well.
2) Designing such a high-level set of goals that ensures “friendliness”.
Designing, not evolving?
That seems a circular argument. How do you use a self-modifying evolutionary search to find a program whose properties remain stable under self-modifying evolutionary search? Unless you started with the right answer, the search AI would quickly rewrite or reinterpret its own driving goals in a non-friendly way, and who knows what you’d end up with.
I don’t see why the search algorithm would need to be self modifying.
I don’t see why you would be searching for stability as opposed to friendliness. Human testers can judge friendliness directly.
It’s how you draw your system box. Evolutionary search is equivalent to a self-modifying program, if you think of the whole search process as the program. The same issues apply.
I think the sequences do a good job at demolishing the idea that human testers can possibly judge friendliness directly, so long as the AI operates as a black box. If you have a debug view into the operation of the AI that is a different story, but then you don’t need friendliness anyway.
If I draw a box around the selection algorithm and find there is nothing self-modifying inside… where’s the circularity?
(1) is naturalized induction, logical uncertainty, and getting around the Loebian Obstacle.
(2) is the cognitive science of evaluative judgements.
Great, you’ve got names for answers you are looking for. That doesn’t mean the answers are any easier to find. You’ve attached a label to the declarative statement which specifies the requirements a solution must meet, but that doesn’t make the search for a solution suddenly have a fixed timeline. It’s uncertain research: it might take 5 years, 10 years, or 50 years, and throwing more people at the problem won’t necessarily make the project go any faster.
And how is trying to build a safe Oracle AI that can solve FAI problems for us not basic research? Or, to make a better statement: how is trying to build an Unfriendly superintelligent paperclip maximizer not basic research, at today’s research frontier?
Logical uncertainty, for example, is a plain, old-fashioned AI problem. We need it for FAI, we’re pretty sure, but it’s turning out UFAI might need it, too.
“Basic research is performed without thought of practical ends.”
“Applied research is systematic study to gain knowledge or understanding necessary to determine the means by which a recognized and specific need may be met.”
-National Science Foundation.
We need to be doing applied research, not basic research. What MIRI should do is construct a complete roadmap to FAI, or better, a study that exhaustively lists strategies for achieving a positive singularity and tactics for achieving friendly or unfriendly AGI, and concludes with a small set of most-likely scenarios. MIRI should then identify the risk factors that affect either the friendliness of the AGI in each scenario or the capability of a UFAI to do damage (in boxing setups). These risk factors should be prioritized by how much learning more about each is expected to shift the outcome in a positive direction, and these problems should be the topics of MIRI workshops.
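As a toy illustration of that kind of prioritization (ranking questions by the expected value of resolving them before committing to a plan), with all probabilities invented as placeholders:

```python
# Toy expected-value-of-information ranking for research questions, in the spirit
# of the prioritization described above. Each question gates a choice between a
# plan that depends on the answer and a fallback plan that does not.
questions = {
    # name: (P(answer is "yes"), P(win | gated plan, yes), P(win | gated plan, no), P(win | fallback))
    "does boxing hold?":           (0.4, 0.60, 0.10, 0.30),
    "is value loading tractable?": (0.5, 0.75, 0.20, 0.35),
    "is takeoff slow?":            (0.6, 0.50, 0.30, 0.40),
}

def value_of_answering(p, win_yes, win_no, win_fallback):
    act_now = max(p * win_yes + (1 - p) * win_no, win_fallback)   # best plan chosen blind
    act_informed = (p * max(win_yes, win_fallback)                # best plan in each branch
                    + (1 - p) * max(win_no, win_fallback))
    return act_informed - act_now

for name, params in sorted(questions.items(),
                           key=lambda kv: -value_of_answering(*kv[1])):
    print(f"{name:30s} value of answering first = {value_of_answering(*params):.3f}")
```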
Instead MIRI is performing basic research. It’s basic research not because it is useless, but because we are not certain at this point what relative utility it will have. And if we don’t have a grasp on expected utility, how can we prioritize? There are a hundred avenues of research that are important, to varying degrees, to the FAI project.

I worked for a number of years at NASA-Ames Research Center, and in the same building as me was the Space Biosciences Division. Great people, don’t get me wrong, and for decades they have funded really cool research on the effects of microgravity and radiation on living organisms, with the justification that such effects and countermeasures need to be known for long-duration space voyages, e.g. a two-year mission to Mars. Never mind that the microgravity issue is trivially solved with a few-thousand-dollar steel tether connecting the upper stage to the spacecraft as they spin to create artificial gravity, and the radiation exposure is mitigated by having a storm shelter in the craft and throwing a couple of Martian sandbags on the roof once you get there. It’s spending millions of dollars to develop the pressurized-ink “Space Pen” when the humble pencil would have done just fine.
Sadly I think MIRI is doing the same thing, and it is represented in one part of your post I take huge issue with:

Logical uncertainty, for example, is a plain, old-fashioned AI problem. We need it for FAI, we’re pretty sure...
If we’re only “pretty sure” it’s needed for FAI, and we can’t quantify exactly what its contribution will be or how important that contribution is relative to other possible things to be working on… then we have some meta-level planning to do first. Unfortunately I don’t see MIRI doing any planning like this (or if they are, it’s not public).
Are you on the “Open Problems in Friendly AI” Facebook group? Because much of the planning is on there.
Logical uncertainty lets us put probabilities on sentences of a logic. This, supposedly, can help get us around the Loebian Obstacle to proving self-referencing statements, and thus enable stable self-improvement in an agent. Logical uncertainty also allows techniques like Updateless Decision Theory to be turned into real algorithms, and this too is an AI problem: turning planning into inference.
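As a very rough illustration of the “probabilities on sentences” part, and only that part: a propositional toy where probabilities come from a uniform prior over the truth assignments consistent with what has already been accepted. Real logical uncertainty concerns sentences, such as arithmetic claims, whose truth values are determined but infeasible to compute; nothing here touches the Loebian issues.

```python
# Toy propositional analogue of logical uncertainty: assign probabilities to
# sentences by enumerating the truth assignments consistent with accepted facts.
from itertools import product

atoms = ["A", "B", "C"]

def worlds():
    for values in product([False, True], repeat=len(atoms)):
        yield dict(zip(atoms, values))

# Sentences are just predicates over a world; these two are "already accepted".
known = [lambda w: w["A"] or w["B"],          # A v B
         lambda w: not (w["A"] and w["C"])]   # not (A & C)

def prob(sentence):
    admissible = [w for w in worlds() if all(k(w) for k in known)]
    return sum(sentence(w) for w in admissible) / len(admissible)

print("P(A)      =", round(prob(lambda w: w["A"]), 3))
print("P(A -> C) =", round(prob(lambda w: (not w["A"]) or w["C"]), 3))
print("P(B & C)  =", round(prob(lambda w: w["B"] and w["C"]), 3))
```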
The cognitive stuff about human preferences is the Big Scary Hard Problem of FAI, but utility learning (as Stuart Armstrong has been posting about lately) is a way around that.
If you can create a stably self-improving agent that will learn its utility function from human data, equipped with a decision theory capable of handling both causative games and Timeless situations correctly… then congratulations, you’ve got a working plan for a Friendly AI and you can start considering the expected utility of actually building it (at least, to my limited knowledge).
Around here you should usually clarify whether your uncertainty is logical or indexical ;-).
Or… you could use a boxed Oracle AI to develop singularity technologies for human augmentation, or other mechanisms to keep moral humans in the loop through the whole process, and sidestep the whole issue of FAI and value loading in the first place.
Which approach do you think can be completed earlier with similar probabilities of success? What data did you use to evaluate that, and how certain are you of its accuracy and completeness?
I actually really do think that de novo AI is easier than human intelligence augmentation. We have good cognitive theories for how an agent is supposed to work (including “ideal learner” models of human cognitive algorithms). We do not have very good theories of in-vitro neuroengineering.
Yes, but those details would be handled by the post-”FOOM” boxed AI. You get to greatly discount their difficulty.
This assumes that you have usable, safe Oracle AI which then takes up your chosen line of FAI or neuroengineering problems for you. You are conditioning the hard part on solving the hard part.
You don’t need to solve philosophy to solve FAI, but philosophy is relevant to figuring out, in broad terms, the relative likelihoods of various problems and solutions.