Conversely, if we had a complete technical solution, I don’t see why we necessarily need that much governance competence.
As I said in the article, technically controllable ASIs are the equivalent of an invasive species which will displace humans from Earth politically, economically and militarily.
And I’m saying that, assuming all the technical problems are solved, AI researchers would be the ones in control, and I (mostly) trust them to just not do things like build an AI that acts like an invasive species, or argues for its own rights, or build something that actually deserves such rights.
Maybe some random sociologists on Twitter will call for giving AIs rights, but in the counterfactual world where AI researchers have fine control of their own creations, I expect no one in a position to make decisions on the matter to give such calls any weight.
Even in the world we actually live in, I expect such calls to have little consequence. I do think some of the things you describe are reasonably likely to happen, but the people responsible for making them happen will do so unintentionally, with opinion columnists, government regulations, etc. playing little or no role in the causal process.
What is the basis of this trust? Anecdotal impressions of a few that you know personally in the space, opinion polling data, something else?
A bit of anecdotal impressions, yes, but mainly I just think that, in humans, being smart, conscientious, reflective, etc. enough to be the brightest researcher at a big AI lab is actually pretty correlated with being Good (and also, that once you actually solve the technical problems, it doesn’t take that much Goodness to do the right thing for the collective and not just yourself).
Or, another way of looking at it, I find Scott Aaronson’s perspective convincing, when it is applied to humans. I just don’t think it will apply at all to the first kinds of AIs that people are actually likely to build, for technical reasons.
I think there are way more transhumanists and post-humanists at AGI labs than you imagine. Richard Sutton is a famous example (btw, I’ve just discovered that he moved from DeepMind to Keen Technologies, John Carmack’s venture), and I believe there are many more who disguise themselves for political reasons.
No. You have simplistic and incorrect beliefs about control.
If there are a bunch of companies (DeepMind, Anthropic, Meta, OpenAI, …) and a bunch of regulation efforts and politicians who all have input, then the AI researchers will have very little control authority, perhaps as little as the physicists had over the use of the H-bomb.
Where does the control really reside in this system?
Who made the decision to almost launch a nuclear torpedo in the Cuban Missile Crisis?
In the Manhattan project, there was no disagreement between the physicists, the politicians / generals, and the actual laborers who built the bomb, on what they wanted the bomb to do. They were all aligned around trying to build an object that would create the most powerful explosion possible.
As for who had control over the launch button, of course the physicists didn’t have that, and never expected to. But they also weren’t forced to work on the bomb; they did so voluntarily and knowing they wouldn’t be the ones who got any say in whether and how it would be used.
Another difference between an atomic bomb and AI is that the bomb itself had no say in how it was used. Once a superintelligence is turned on, control of the system rests entirely with the superintelligence and not with any humans. I strongly expect that researchers at big labs will not be forced to program an ASI to do bad things against the researchers’ own will, and I trust them not to do so voluntarily. (Again, all in the probably-counterfactual world where they know and understand all the consequences of their own actions.)
In the Manhattan project, there was no disagreement between the physicists, the politicians / generals, and the actual laborers who built the bomb, on what they wanted the bomb to do.
In that they wanted the bomb to explode? I think the analogous level of control for AI would be unsatisfactory.
they did so voluntarily and knowing they wouldn’t be the ones who got any say in whether and how it would be used.
I’m not sure they thought this; I think many expected that by playing along they would have influence later. Tech workers today often seem to care a lot about how products made by their companies are deployed.
In that they wanted the bomb to explode? I think the analogous level of control for AI would be unsatisfactory.
The premise of this hypothetical is that all the technical problems are solved—if an AI lab wants to build an AI to pursue the collective CEV of humanity or whatever, they can just get it to do that. Maybe they’ll settle on something other than CEV that is a bit better or worse or just different, but my point was that I don’t expect them to choose something ridiculous like “our CEO becomes god-emperor forever” or whatever.
Yeah, I was probably glossing over the actual history a bit too much; most of my knowledge on this comes from seeing Oppenheimer recently. The actual dis-analogy is that no AI researcher would really be arguing for not building and deploying ASI in this scenario, vs. with the atomic bomb where lots of people wanted to build it to have around, but not actually use it or only use it as some kind of absolute last resort. I don’t think many AI researchers in our actual reality have that kind of view on ASI, and probably few to none would have that view in the counterfactual where the technical problems are solved.
researchers at big labs will not be forced to program an ASI to do bad things against the researchers’ own will
Well, these systems aren’t programmed. Researchers work on architecture and engineering; goal content is down to the RLHF that is applied and the wishes of the user(s), and the wishes of the user(s) are determined by market forces, user preferences, etc. And user preferences may themselves be influenced by other AI systems.
Closed source models can have RLHF and be delivered via an API, but open source models will not be far behind at any given point in time. And of course prompt injection attacks can bypass the RLHF on even closed source models.
The decisions about what RLHF to apply on contentious topics will come from politicians and from the leadership of the companies, not from the researchers. And politicians are influenced by the media and elections, and company leadership is influenced by the market and by cultural trends.
Where does the chain of control ultimately ground itself?
Answer: it doesn’t. Control of AI in the current paradigm is floating. Various players can influence it, but there’s no single source of truth for “what’s the AI’s goal”.
I don’t dispute any of that, but I also don’t think RLHF is a workable method for building or aligning a powerful AGI.
Zooming out, my original point was that there are two problems humanity is facing, quite different in character but both very difficult:
a coordination / governance problem, around deciding when to build AGI and who gets to build it
a technical problem, around figuring out how to build an AGI that does what the builder wants at all.
My view is that we are currently on track to solve neither of those problems. But if you actually consider what the world in which we sufficiently-completely solve even one of them looks like, it seems like either is sufficient for a relatively high probability of a relatively good outcome, compared to where we are now.
Both possible worlds are probably weird hypotheticals which shouldn’t affect our actual strategy in the world we actually live in, which is of course to pursue solutions to both problems simultaneously with as much vigor as possible. But it still seems worth keeping in mind that if even one thing works out sufficiently well, we probably won’t be totally doomed.
How does a solution to the above solve the coordination/governance problem?
I think the theory is something like the following: We build the guaranteed trustworthy AI, and ask it to prevent the creation of unaligned AI, and it comes up with the necessary governance structures, and the persuasion and force needed to implement them.
I’m not sure this argument goes through. Some political actions are simply impossible to accomplish ethically, and therefore unavailable to a “good” actor even given superhuman abilities.
In the Manhattan project, there was no disagreement between the physicists, the politicians / generals, and the actual laborers who built the bomb, on what they wanted the bomb to do. They were all aligned around trying to build an object that would create the most powerful explosion possible.
Where did you learn of this?
From what I know it was the opposite: there were so many disagreements, even just among the physicists, that they decided to duplicate nearly all effort and pursue two different nuclear device designs, the gun type and the implosion type, simultaneously.
E.g. both plutonium and uranium processing supply chains were set up at massive expense (and later environmental damage), just in case one design didn’t work.
Without commenting on whether there was in fact much agreement or disagreement among the physicists, this doesn’t sound like much evidence of disagreement. I think it’s often entirely reasonable to try two technical approaches simultaneously, even if everyone agrees that one of them is more promising.
You do realize setting up each supply chain alone took up well over 1% of total US GDP, right?
I didn’t know that, but not a crux. This information does not make me think it was obviously unreasonable to try both approaches simultaneously.
(Downvoted for tone.)
How does this relate to the discussion Max H and Roko were having? Or the question I asked of Max H?
I don’t know, I didn’t intend it to relate to those things. It was a narrow reply to something in your comment, and I attempted to signal it as such.
(I’m not very invested in this conversation and currently intend to reply at most twice more.)
Okay then.
So you don’t think a pivotal act exists? Or, more ambitiously, you don’t think a sovereign implementing CEV would result in a good enough world?
Who is going to implement CEV or some other pivotal act?
Ah, I see. Yeah, that’s a reasonable worry. Any ideas on how someone in those orgs could incentivize such behavior whilst discouraging poorly thought out pivotal acts? I would be OK with a future where e.g. OAI gets 90-99% of the cosmic endowment as long as the rest of us get a chunk, or get the chance to safely grow to the point where we have a shot at the vast scraps OAI leaves behind.
The fact that we are having this conversation simply underscores how dangerous this is and how unprepared we are.
This is the future of the universe we’re talking about. It shouldn’t be a footnote!