cousin_it comments on The AI design space near the FAI [draft]

cousin_it Mar 18, 2012, 12:36 PM
3 points
Very nice! To tweak your argument a bit:

1) As AI designs approach FAI, they become potentially much worse for mankind than random AIs that just kill us. (This is similar to Eliezer’s fragility of value thesis.)

2) Human errors of various kinds make it all but certain that we will build a bad random AI or a monstrous almost-FAI even if we think we have a good FAI design. Such errors may include coding mistakes (lots of those in every non-trivial program), tiny conceptual mistakes (same but harder to catch), bad last-minute decisions caused by stress, etc.

3) Therefore it’s not obvious that pushing toward FAI helps mankind.

It would be interesting to hear what SingInst folks think of this argument.
What links here?
- Wei Dai's comment on Holden Karnofsky’s Singularity Institute Objection 1 by Paul Crowley (May 13, 2012, 12:58 PM; 6 points)
- Wei Dai's comment on S-risks: Why they are the worst existential risks, and how to prevent them by Kaj_Sotala (Jun 21, 2017, 8:17 AM; 4 points)
- Will_Newsome Mar 19, 2012, 9:11 AM
  11 points
  Parent
  I’ve been avoiding helping SingInst and feel guilty when I do help them because of a form of this argument. The apparent premature emphasis on CEV, Eliezer’s spotty epistemology and ideology (or incredibly deep ploys to make people think he has spotty epistemology and ideology), their firing Steve Rayhawk (who had an extremely low salary) while paying Eliezer about a hundred grand a year, &c., are disturbing enough that I fear that supporting them might be the sort of thing that is obviously stupid in retrospect. They have good intentions, but sometimes good intentions aren’t enough, sometimes you have to be sane. Thus I’m refraining from supporting or condemning them until I have a much better assessment of the situation. I have a similarly tentative attitude toward Leverage Research.
  - Wei Dai Mar 22, 2012, 6:24 PM
    4 points
    Parent
    Carl Shulman decided to join SingInst, so they can’t be too crazy. :) Seriously, what’s your explanation for why he seems to think SingInst is worth supporting but you don’t (at least not yet)?
    
    BTW, somebody needs to update SingInst’s list of research fellows, unless Carl has also been fired (but he just got hired in 2011 so that seems unlikely).
    - wedrifid Mar 22, 2012, 7:29 PM
      8 points
      Parent
      
      Carl Shulman decided to join SingInst, so they can’t be too crazy. :) Seriously, what’s your explanation for why he seems to think SingInst is worth supporting but you don’t (at least not yet)?
      
      Far be it from me to suggest that Carl Shulman suffers from the frailties, biases and tendency to respond incentives prevalent amongst his fellow humans but didn’t Carl Shulman also marry one of SingInst’s most prominent researchers during the same time period that you referenced? That’s the sort of thing that tends to influence human behavior.
      - Will_Newsome Mar 22, 2012, 7:43 PM
        5 points
        Parent
        He’s been affiliated with SingInst much longer than that, though.
    - Will_Newsome Mar 22, 2012, 9:48 PM
      2 points
      Parent
      (I’ve typed out about five different responses thus far, but:) I guess Carl trusts in Eliezer’s prudence more than I do, or is willing to risk Eliezer getting enough momentum to do brain-in-a-box-in-a-basement if it also means that SingInst gains more credibility with which to influence government Manhattan project whole brain emulation endeavors, or gains more credibility with which to attract/hire brilliant strategic thinkers. Carl and I disagree about psi; this might cause him to be more confident than I am that the gods aren’t going to mess with us (aren’t already messing with us). Psi really confuses me and I’m having a lot of trouble seeing its implications. “Supporting” would mean different things for me and Carl; for me it means helping revise papers occasionally, for him it means a full-time job. Might be something to do with marginals. I think that the biggest difference is that for Carl “supporting” involves shaping SingInst policy and making it more strategic, whereas I don’t have that much leverage. I have a very strong bias towards being as meta as possible and staying as meta as possible for as long as possible, probably to a greater extent than Carl; I think that doing things is almost always a bad idea, whereas talking about things is in itself generally okay. Unfortunately when SingInst talks about things that tends to cause people to do things, like how the CEV document has led to a whole bunch of people thinking about FAI in terms of CEV for no particularly good reason Anyway, it’s a good question, and I don’t have a good answer. Why don’t you think SingInst is worth supporting when Carl does?
      - Wei Dai Mar 26, 2012, 3:47 AM
        6 points
        Parent
        
        Why don’t you think SingInst is worth supporting when Carl does?
        
        I have provided SingInst with various forms of support in the past, but I’ve done so privately and like to think of it as “helping people I know and like” instead of “supporting SingInst”. I guess for these reasons:
        
        I’m afraid that adopting the role/identity of a SingInst supporter will affect my objectivity when thinking about Singularity-related issues. Carl might be more confident in his own rationality.
        SingInst is still strongly associated with wanting to directly build FAI. It’s a bad idea according to my best guess, and I want to avoid giving the impression that I support the idea. Carl may have different opinions on this subject or do not care as much about giving other people wrong impressions of his beliefs.
        Carl may have the above worries as well but the kinds of support he can give requires that he does so publicly.
        
        “Supporting” would mean different things for me and Carl; for me it means helping revise papers occasionally, for him it means a full-time job. [...] I think that the biggest difference is that for Carl “supporting” involves shaping SingInst policy and making it more strategic, whereas I don’t have that much leverage.
        
        Carl has been writing and publishing a lot of papers lately. Surely it couldn’t hurt to help with those papers?
        Will_Newsome Mar 28, 2012, 11:36 PM
        4 points
        Parent
        
        SingInst is still strongly associated with wanting to directly build FAI. It’s a bad idea according to my best guess, and I want to avoid giving the impression that I support the idea.
        
        I think this is a serious concern, especially as I’m starting to suspect that AGI might require decision theoretic insights about reflection in order to be truly dangerous. If my suspicion is wrong then SingInst working directly on FAI isn’t that harmful marginally speaking, but if it’s right then SingInst’s support of decision theory research might make it one of the most dangerous institutions around.
        
        Given that you’re worried and that you’re highly respected in the community, this would seem to be one of those “stop, melt, and catch fire” situations that Eliezer talks about, so I’m confused about SingInst’s apparently somewhat cavalier attitude. They seem to be intent on laying the groundwork for the ennead.
        Mitchell_Porter Mar 29, 2012, 12:20 AM
        7 points
        Parent
        
        I’m starting to suspect that AGI might require decision theoretic insights about reflection in order to be truly dangerous
        
        A chess computer doesn’t need reflection to win at chess. An AGI doesn’t need reflection to make its own causal models. So if the game is ‘eat the earth’, an unreflective AGI seems like a contender. One might argue that it needs to ‘understand’ reflection in order to understand the human beings that might oppose it, or to model its own nature, but I think the necessary capacities could emerge in an indirect way. In making a causal model of an external reflective intelligence it might need to worry about the halting problem, but computational resource bounds are a real-world issue that will anyway require it to have heuristics for noticing when a particular subtask is taking up too much time. As for self-modelling, it may be capable of forming partial self-models relevant for reasoning correctly about the implications of self-modification (or just the implications of damage to itself), just by applying standard causal modelling to its own physical vicinity, i.e. without any special data representations or computational architecture designed to tell it ‘this item represents me, myself, and not just another object in the world’.
        
        It would be desirable to have a truly rigorous understanding of both these issues, but just thinking about them informally already tells me that there’s no safety here, we can’t say “whew, at least that isn’t possible”. Finally, a world-eating AGI equipped with a knowledge of physics and a head start in brute power might never have to worry about reflection, because human beings and their machines are just too easy to swat aside. You don’t need to become an entomologist before you can stomp an insect.
        Will_Newsome Mar 29, 2012, 12:46 AM
        6 points
        Parent
        I agree with everything you’ve written as far as my modal hypothesis goes, but I also think we’re going to lose in that case, so I’ve sort of renormalized to focus my attention at least somewhat more on worlds where for some reason academic/industry AI approaches don’t work, even if that requires some sort of deus ex machina. My intuition says that highly recursive narrow AI style techniques should give you AGI, but to some extent this does go against e.g. the position of many philosophers of mind, and in this case I hope they’re right. Trying to imagine intermediate scenarios led me to think about this kinda stuff.
        
        It would of course be incredibly foolish to entirely write off worlds where AGI is relatively easy, but I also think we should think about cases where for whatever reason that isn’t the case, and if it’s not the case then SingInst is in a uniquely good position to build uFAI.
        J_Taylor Apr 1, 2012, 11:32 PM
        3 points
        Parent
        
        I’ve sort of renormalized to focus my attention at least somewhat more on worlds where for some reason academic/industry AI approaches don’t work, even if that requires some sort of deus ex machina
        
        I apologize for asking, but I just want to clarify something. When you write ‘deus ex machina’, you’re not solely using the term in a metaphorical sort of way, are you? Because, if you mean what it sort of sounds like you mean, at least some of your public positions suddenly make a lot more sense.
        Will_Newsome Apr 2, 2012, 1:29 AM
        3 points
        Parent
        Yes, literal deus ex machina is one scenario which I find plausible.
        Wei Dai Mar 29, 2012, 3:15 PM
        2 points
        Parent
        
        I’m starting to suspect that AGI might require decision theoretic insights about reflection in order to be truly dangerous
        
        Another way in which decision theoretic insights may be harmful is if they increase the sophistication of UFAI and allow them to control less sophisticated AGIs in other universes.
        
        They seem to be intent on laying the groundwork for the ennead.
        
        I’m trying to avoid being too confrontational, which might backfire, or I might be wrong myself. It seems safer to just push them to be more strategic and either see the danger themselves or explain why it’s a good idea despite the dangers.
    - Will_Newsome Mar 22, 2012, 6:53 PM
      0 points
      Parent
      Would it defeat your purpose if I replied via private message?
      
      BTW, somebody needs to update SingInst’s list of research fellows, unless Carl has also been fired (but he just got hired in 2011 so that seems unlikely).
      
      There’s probably a reason. Weird employee status, new to the country, personal preference, or something like that.
      - Wei Dai Mar 22, 2012, 6:57 PM
        1 point
        Parent
        Well, it would defeat a part of my purpose (i.e., encouraging discussions of strategies for achieving positive Singularity) but of course I also want to know your answer for myself.
  - cousin_it Mar 19, 2012, 7:38 PM
    3 points
    Parent
    
    firing Steve Rayhawk
    
    He seems to be still on the list of research associates...
    - Will_Newsome Mar 20, 2012, 6:10 AM
      4 points
      Parent
      Who isn’t? ;P Anyway, he used to be a Research Fellow, i.e., on the payroll.
      - John_Maxwell Apr 14, 2012, 5:38 AM
        0 points
        Parent
        My impression is that he didn’t seem to be producing much research, and that they’re still open to paying him on a per-output basis.
  - Till_Noonsome Jun 22, 2012, 5:48 PM
    −3 points
    Parent
    Obviously this emphasis on CEV is absurd, but I don’t know what the alternatives are. Do you? And what are they? And can thinking about CEV be used to generate better alternatives?
    - Will_Newsome Jun 22, 2012, 7:00 PM
      2 points
      Parent
      
      Obviously this emphasis on CEV is absurd, but I don’t know what the alternatives are. Do you? And what are they?
      
      I’m a fan of the “just solve decision theory and the rest will follow” approach. Some hybrid of “just solve decision theory” and the philosophical intuitions behind CFAI might also do it and might be less likely to spark AGI by accident. And there’s technically the oracle AI option, but I don’t like that one.
      
      And can thinking about CEV be used to generate better alternatives?
      
      Maybe, but it seems to me that the opportunity cost is high. CEV wastes people’s time on “extrapolation algorithms” and thinking about whether preferences sufficiently converge and other problems that generally aren’t on the correct meta level. It also makes people think that AGI requires an ethical solution rather than a make-sure-you-solve-everything-ever-because-this-is-your-only-chance-bucko solution to all philosophy ever.
    - Winsome_Troll Jun 23, 2012, 7:00 AM
      −7 points
      Parent
      When there is no more namespace in hell, the dead will troll the earth.
- wedrifid Mar 18, 2012, 12:55 PM
  −1 points
  Parent
  
  2) Human errors of various kinds make it all but certain that we will build a monstrous almost-FAI even if we’re certain of having a good FAI design.
  
  Lies! We’re not intelligent or reliable enough for us to be all but certain of getting that close to FAI. We are far, far more likely to build one of the AIs that just kill us!
  
  Would you mind tweaking the argument again such that it includes something like “given that we build an AI that is close to an FAI it is all but certain...”? That would make it appear far stronger to those of us who consider FAI a longshot (that is nevertheless the best option we have.)
  - cousin_it Mar 18, 2012, 12:58 PM
    0 points
    Parent
    Good point, thanks! Tweaked the quoted part. What do you think of the argument?
    - wedrifid Mar 18, 2012, 1:11 PM
      1 point
      Parent
      
      What do you think of the argument?
      
      There is a good point contained therein. I wouldn’t quite accept it since the conclusion is rather strong. I think it is a lot less (relatively) likely that we arrive at a worse-than-extinction uFAI than an outright certainty and even well below even.
      
      The above said I definitely support increased consideration of that space around FAI that really does suck. This is something that is actually suppressed. For example, when someone points out something bad that could come from a future with a near-Friendly AI you should say “You’re right! An AI doing that would be unfriendly indeed but it is the sort of thing we could make if not careful. Good point. Let’s not do that!”
      
      You could then proceed to discuss the possibility that the FAI> of certain groups could fit into the category of “uFAI near FAI” and should be similarly avoided.
      - Dmytry Mar 18, 2012, 1:31 PM
        −2 points
        Parent
        Well, the crux of the issue is that the random AIs may be more likely to leave us alone than near-misses at FAI.
        wedrifid Mar 18, 2012, 1:35 PM
        6 points
        Parent
        
        Well, the crux of the issue is that the random AIs may be more likely to leave us alone than near-misses at FAI.
        
        Random AIs will very nearly all kill us. That is, the overwhelming majority of random AIs do stuff. Doing stuff takes resources. We are resources. We are the resources that are like… right near where it was created and are most necessary to bootstrap it’s way to the rest of the universe.
        
        For the majority of AIs we are terminally irrelevant but our termination is slightly instrumentally useful.
        Dmytry Mar 18, 2012, 2:01 PM
        4 points
        Parent
        You’re making a giant number of implicit ill-founded assumptions here that must all be true. Read my section on the AI space in general.
        
        Firstly, you assume that the ‘stuff’ is unbounded. Needs not be true. I for one thing want to figure out how universe works, out of pure curiosity. That may likely be a very bounded goal right here. I also like to watch nature, or things like mandelbox fractal, which is unbounded but also preserves the nature. Those are valid examples of goals. The AI crowd, when warned not to anthropomorphize, switches to animalomorphization, or worse yet, bacteriomorphization where the AI is just a smarter gray goo, doing the goo thing intelligently. No. The human goal system can be the lower bound on the complexity of the goal system of super human AI. edit: and on top of that, all the lower biological imperatives like desire to reproduce sexually, we tend to satisfy in very unintended ways, from porn to birth control. If i were an upload i would get rid of much of those distracting nonsense goals.
        
        Secondly, you assume that achieving of ‘stuff’ is raw resource-bound, rather than e.g. structuring the resources—bound. So that we’ll be worth less than the atoms we are made of. Which needs never happen.
        
        In this you have a sub-assumption that the AI can only do stuff the gray goo way, and won’t ever discover anything cleverer (like quantum computing, which grows much more rapidly with size) which it would e.g. want to keep crammed together because of light speed lag. The “ai is going to eat us all” is just another of those priveledged baseless guesses what an entity way smarter than you would do. The near-FAI is the only thing with which we are pretty sure it won’t leave us alone.
        What links here?
        wedrifid's comment on The AI design space near the FAI [draft] by Dmytry (Mar 18, 2012, 2:22 PM; -1 points)
        pangel Mar 18, 2012, 3:01 PM
        5 points
        Parent
        Unless its utility function has a maximum, we are at risk. Observing Mandelbrot fractals is probably enhanced by having all the atoms of a galaxy playing the role of pixels.
        
        Would you agree that unless the utility function of a random AI has a (rather low) maximum, and barring the discovery of infinite matter/energy sources, its immediate neighbourhood is likely to get repurposed?
        
        I must say that at least I finally understand why you think botched FAIs are more risky than others.
        
        But consider, as Ben Goertzel mentioned, that nobody is trying to build a random AI. Whatever achieves AGI-level is likely to have a built-in representation for humans and to have a tendency to interact with them. Check to see if I actually understood you correctly: does the previous sentence make it more probable that any future AGI is likely to be destructive?
        Dmytry Mar 18, 2012, 3:11 PM
        0 points
        Parent
        
        Unless its utility function has a maximum, we are at risk. Observing Mandelbrot fractals is probably enhanced by having all the atoms of a galaxy playing the role of pixels.
        
        Cruel physics, cruel physics. There is speed of light delay, that’s thing, and I’m not maniacal about mandelbox (its a 3d fractal) anyway, I won’t want to wipe out interesting stuff in the galaxy for minor gain in the resolution. And if i can circumvent speed of light, all bets are off WRT what kind of resources i would need (or if i would need any, maybe i get infinite computing power in finite space and time)
        
        But consider, as Ben Goertzel mentioned, that nobody is trying to build a random AI.
        
        How’s about generating human brain (in crude emulation of developmental biology)? It’s pretty darn random.
        
        My argument is that, the AI whose only goal is helping humans, if bugged, has the only goal that is messing with humans. The AI that just represents humans in a special way is not this scary, albeit still is, to some extent.
        
        Consider this seed AI: evolution. Comes up with mankind, that tries to talk with outside (god) without even knowing that outside exists, has endangered species list. Of course, if we are sufficiently resource bound, we are going to eat up all other forms of life, but we’d be resource bound because we are too stupid to find a way to go to space, and we clearly would rather not exerminate all other lifeforms.
        
        This example ought to entirely invalidate this notion that ‘almost all’ AIs in AI design space are going to eat you. We have 1 example: evolution going FOOM via evolving human brain, and it cares about wildlife somewhat, yes we do immense damage to environment, but we would not if we could avoid it , even at some expense. If you have 1 example probe into random AI space, and it’s not all this bad, you seriously should not go around telling how you’re extremely sure it is just blind luck et cetera.
        latanius Mar 18, 2012, 3:35 PM
        5 points
        Parent
        Add some anthropics… humans are indeed a FOOMing intelligence relative to the evolutionary timescale, but it’s no use declaring that “we’ve got one example of a random intelligence, and look, its humans::goal_system is remarkably similar to our own goal_system, therefore the next random try will also be similar”...
        
        I’m also pretty sure that evolution would hate us if it had such a concept: instead of our intended design goal of “go and multiply”, we came up with stupid goals that make no sense, like love, happyness, etc.
        Dmytry Mar 18, 2012, 3:47 PM
        0 points
        Parent
        So what? The AI can come up with Foo, Bar, Baz that we never thought it would.
        
        The point is that we got entirely unexpected goal system (starting from evolution as a seed optimizer), with which we got greenpeace seriously risking their lives trying to sink japanese whaling ship, complete with international treaties against whaling. It is okay the AI won’t have love, happyness, etc. but why exactly should i be so extremely sure the foo, bar, and baz won’t make it assign some nonzero utility to mankind? Why we assume the AI will have the goal system of a bacteria?
        
        Why should i be so sure as to approve of stepping into a clearly marked, obvious minefield of “AIs that want to mess with mankind”?
        
        edit: To clarify, here we have AI’s weird random goal systems being reduced to, approximately, a real number: how much it values other complex dynamical systems vs less complex stuff. We value complex systems, and don’t like to disrupt them, even if we don’t understand anything. And most amazingly, the original process (evolution) looks like a good example of, if anything, an unfriendly AI attempt that wouldn’t give a slightest damn. We still do disrupt complex systems, when the resources are a serious enough bottleneck, but we’re making progress at not doing it and trading off some of the efficiency to avoid breaking things.
        Expand this thread
        latanius 18 Mar 2012 16:37 UTC
        2 points
        Parent
        Not disrupting complex systems doesn’t seem to be an universal human value to me (just as Greenpeace is not our universal value system, either). But you’re right, it’s probably not a good approach to treat an AI as just another grey goo.
        
        The problem is that it will be still us who will create that AI, so it will end up having values related to us. It would be a deliberate effort at our part to try to build something that isn’t a member of the FAI-like sphere you wrote about (in which I agree with pangel’s comment). For example, by ordering it to leave us alone and try to build stuff out of Jupiter instead. But then… what’s the point? If this AI was to prevent any further AI development on Earth… that would be a nice case of “ugly just-not-friendly-enough AI messing with humanity”, but if it wasn’t, then we could still end up converting the planet to paperclips by another AI developed later.
        Dmytry 18 Mar 2012 17:01 UTC
        0 points
        Parent
        We have international treaties to this sense. The greenpeace just assigns it particularly high value, comparing to the rest who assign much smaller value. Still, if we had fewer resource and R&D limitations we would be able to preserve animals much better, as the value of animals as animals would stay the same while the cost of alternative ways of acquiring the resources would be lower.
        
        With regards to the effort to build something that’s not a member of the FAI-like sphere, that’s where the majority of real effort to build the AI lies today. Look at the real projects that use techniques which have known practical spinoffs (neural networks), and have the computing power. Blue brain. The FAI effort is a microscopic, neglected fraction of AI effort.
        
        Also, the prevention of paperclippers doesn’t strike me as particularly bad scenario. The smarter AI doesn’t need to use clumsy bureaucracy style mechanisms of forbidding all AI development.
        wedrifid 18 Mar 2012 14:13 UTC
        0 points
        Parent
        
        You’re making a giant number of implicit ill-founded assumptions here that must all be true
        
        I don’t accept that I make or are required to make any of the assumptions that you declare that I make. Allow me to emphasize just how slight a convenience it has to be for an indifferent entity to exterminate humanity. Very, very slight.
        
        I’ll bow out of this conversation. It isn’t worth having it in a hidden draft.
        Dmytry 18 Mar 2012 14:17 UTC
        −1 points
        Parent
        What ever. That is the problem with human language, simplest statements have a zillion possible unfounded assumptions that are not even well defined nor is the maker of statement even aware of them (or would admit making them, because he didn’t, because he just manipulated symbols).
        
        Take “i think therefore i am”. innocent phrase, something that entirely boxed in blind symbolic ai should be able to think, right? No. Wrong. The “I” is only a meaningful symbol when there’s non-i to separate from i, the “think” when you can do something other than thinking, that you need to separate from thought, via symbol ‘think’; therefore implies the statements where it does not follow, and I am refers to the notion that non-i might exist without I existing. Yet if you say something like this, are you ‘making’ those assumptions? You can say no—they come in pre-made, and aren’t being processed.