mike_hawke comments on Counting arguments provide no evidence for AI doom

mike_hawke 6 Mar 2024 1:11 UTC
4 points
2
EDIT: This is wrong. See descendant comments.
I spent a bunch of time wondering how you could put 99.9% on no AI ever doing anything that might be well-described as scheming for any reason. I was going to challenge you to list a handful of other claims that you had similar credence in, until I searched the comments for “0.1%” and found this one.
~~I’m annoyed at this, and I request that you prominently edit the OP.~~
- Quintin Pope 7 Mar 2024 3:24 UTC
  4 points
  2
  The post says “we should assign very low credence to the spontaneous emergence of scheming in future AI systems— perhaps 0.1% or less.”
  I.e., not “no AI will ever do anything that might be well-described as scheming, for any reason.”
  It should be obvious that, if you train an AI to scheme, you can get an AI that schemes.
  - mike_hawke 12 Mar 2024 23:49 UTC
    14 points
    2
    Damn, woops.
    My comment was false (and strident; worst combo). I accept the strong downvote and I will now try to make a correction.
    I said:
    “I spent a bunch of time wondering how you could put 99.9% on no AI ever doing anything that might be well-described as scheming for any reason.”
    
    What I meant to say was:
    “I spent a bunch of time wondering how you could put 99.9% on no AI ever doing anything that might be well-described as scheming for any reason, even if you stipulate that it must happen spontaneously.”
    And now you have also commented:
    “Well, I have <0.1% on spontaneous scheming, period. I suspect Nora is similar and just misspoke in that comment.”
    So... I challenge you to list a handful of other claims that you have similar credence in. Special Relativity? P!=NP? Major changes in our understanding of morality or intelligence or mammal psychology? China pulls ahead in AI development? Scaling runs out of steam and gives way to other approaches like mind uploading? Major betrayal against you by a beloved family member?
    The OP simply says “future AI systems” without specifying anything about these systems, their paradigm, or what offworld colony they may or may not be developed on. Just... all AI systems henceforth forever. Meaning that no AI creators will ever accidentally recapitulate the scheming that is already observed in nature...? That’s such a grand, sweeping claim. If you really think it’s true, I just don’t understand your worldview. If you’ve already explained why somewhere, I hope someone will link me to it.
- Noosphere89 7 Mar 2024 1:07 UTC
  2 points
  0
  Agree with this hugely. I could make a partial defense of the confidence given, but yes, I’d like this post to be hugely edited.
  - Nora Belrose 7 Mar 2024 2:59 UTC
    1 point
    0
    What do you mean by “hugely edited”? What other things would you like us to change? If I were starting from scratch, I would of course write the post differently, but I don’t think it would be worth my time to make major post hoc edits; I would like to focus on follow-up posts.
    - Noosphere89 7 Mar 2024 3:30 UTC
      2 points
      0
      Specifically, I wanted the edit to be a clarification that you only have a <0.1% probability on spontaneous scheming ending the world.
      - Quintin Pope 7 Mar 2024 4:08 UTC
        2 points
        −2
        Well, I have <0.1% on spontaneous scheming, period. I suspect Nora is similar and just misspoke in that comment.
        - Nora Belrose 7 Mar 2024 7:16 UTC
          1 point
          −2
          If it’s spontaneous then yeah, I don’t expect it to happen ~ever really. I was mainly thinking about cases where people intentionally train models to scheme.