This is a thread for anyone who wants to give a high-level take or reaction that isn’t contributing much to the discussion (and thus isn’t worth a top-level comment).
I broadly agree with this post much more than Eliezer’s, and think it did a good job of articulating a bunch of my fuzzy “this seems off” reactions. Most notably: Eliezer underrating the importance and tractability of interpretability, and overrating the discontinuity of AI progress.
I found it really helpful to have a list of places where Eliezer and Paul agree. It’s interesting to see that there is a lot of similarity on big picture stuff like AI being extremely dangerous.
I think my take is roughly “What Paul would think if he had significantly shorter timelines.”
Do you think that some of my disagreements should change if I had shorter timelines?
(As mentioned last time we talked, but readers might not have seen: I’m guessing ~15% on singularity by 2030 and ~40% on singularity by 2040.)
I think most of your disagreements on this list would not change.
However, I think if you conditioned on 50% chance of singularity by 2030 instead of 15%, you’d update towards faster takeoff, less government/societal competence (and thus things more likely to fail at an earlier, less dignified point), more unipolar/local takeoff, lower effectiveness of coordination/policy/politics-style strategies, less interpretability and other useful alignment progress, less chance of really useful warning shots… and of course, significantly higher p(doom).
To put it another way, when I imagine what (I think) your median future looks like, it’s got humans still in control in 2035, sitting on top of giant bureaucracies of really cheap, really smart proto-AGIs that fortunately aren’t good enough at certain key skills (like learning-to-learn, or concept formation, or long-horizon goal-directedness) to be an existential threat yet, but are definitely really impressive in a bunch of ways and are reshaping the world economy and political landscape and causing various minor disasters here and there that serve as warning shots. So the whole human world is super interested in AI stuff, policymakers are all caught up on the arguments for AI risk, and risks are generally taken seriously instead of dismissed as sci-fi; there are probably international treaties and stuff. Meanwhile the field of technical alignment has had 13 more years to blossom, lots of progress has probably been made on interpretability and ELK and whatnot, and there are 10x more genius researchers in the field with 5+ years of experience already. And even in this world, singularity is still 5+ years away, and there are probably lots of expert forecasters looking at awesome datasets of trends on well-designed benchmarks, predicting with some confidence when it will happen and what it’ll look like (see the toy sketch at the end of this comment).
This world seems pretty good to me: there is definitely still lots of danger, but I feel like there’s a >50% chance things will be OK. Alas, it’s not the world I expect, because I think things will probably happen sooner and go more quickly than that, with less time for the world to adapt and prepare.
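As a toy illustration of the “expert forecasters looking at datasets of benchmark trends” picture above, here is a minimal sketch; the benchmark scores, years, and saturation threshold are all made up for illustration, not a real forecast:

```python
# Toy sketch of the "forecasters extrapolating benchmark trends" picture above.
# The scores, years, and threshold below are hypothetical, for illustration only.
import numpy as np

years = np.array([2029, 2030, 2031, 2032, 2033, 2034], dtype=float)
scores = np.array([41.0, 48.0, 56.0, 63.0, 71.0, 78.0])  # hypothetical benchmark accuracy (%)

# Fit a naive linear trend (relative to the first year) and extrapolate forward
# to a hypothetical "saturation" score.
t = years - years[0]
slope, intercept = np.polyfit(t, scores, 1)
threshold = 95.0
crossing_year = years[0] + (threshold - intercept) / slope
print(f"Naive linear extrapolation crosses {threshold:.0f}% around {crossing_year:.1f}")

# A real forecasting effort would use many benchmarks, uncertainty intervals,
# and models of trend breaks; this just shows the flavor of the exercise.
```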
I personally found this to be a very helpful comment for visualizing how things could go.
Re my own updates: my own probability is a 50-55% chance of singularity by 2030, and using the knowledge about AI, alignment, and governance we have now, I’d say:
Faster takeoff is correct, but nowhere near as fast as Eliezer’s usual stories.
Somewhat less competence, but only somewhat, because of the MNM effect and the ridiculously strong control system that was essentially a collective intelligence operating for at least several months; more generally, I believe governments will respond harder as the problem gets more severe.
IMO, we are probably going to get fairly concentrated takeoffs, but not totally unipolar takeoffs.
Politics and coordination will be reasonably effective by default, because I expect the government and the public to wake up hard once AIs start automating a lot of stuff.
IMO, most of the value of alignment and interpretability research will be gotten very close to the singularity, or even right at the transition from human to AI control, for much the same reasons that a large share of capabilities research will be done then. That said, it’s quite surprising how much of the low-hanging fruit of alignment we’ve already gotten, such that we could well aim for bigger targets.
There will definitely be fewer useful warning shots, but I also expect governments to wake up a lot more than they have right now once they realize that AI is automating everything.
These figures surprise me. I thought you believed in shorter timelines because of Agreement #8 in your post, where you said “[Transformative AI] is more likely to be years than decades, and there’s a real chance that it’s months”.
~40% by 2040 sounds like an expectation of transformative AI probably taking decades. (Unless I’m drawing a false equivalence between transformative AI and what you mean by “singularity”.)
In agreement #8 I’m talking about the time from “large impact on the world” (say increasing GDP by 10%, automating a significant fraction of knowledge work, “feeling like TAI is near,” something like that) to “transformative impact on the world” (say singularity, or 1-2 year doubling times, something like that). I think right now the impact of AI on the world is very small compared to this standard.
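To make the size of the gap between those two thresholds concrete, here is a minimal sketch of the arithmetic (my own illustrative numbers, not anything from Paul’s post), comparing roughly 3%/year historical growth with the 1-2 year doubling times mentioned above:

```python
# Illustrative arithmetic only: rough growth rates implied by the two thresholds
# distinguished above ("large impact" vs. "transformative impact").
import math

def annual_growth_from_doubling(doubling_time_years: float) -> float:
    """Annual growth rate implied by a given economic doubling time."""
    return 2 ** (1 / doubling_time_years) - 1

# Recent world GDP growth is roughly 3%/year, i.e. a doubling time of ~23 years.
print(f"3%/year growth -> doubling time of ~{math.log(2) / math.log(1.03):.0f} years")

# "Transformative impact" threshold: 1-2 year doubling times.
for doubling_time in (1.0, 2.0):
    growth = annual_growth_from_doubling(doubling_time)
    print(f"{doubling_time:.0f}-year doubling -> ~{growth:.0%}/year growth")

# Prints ~100%/year and ~41%/year, far beyond anything in the historical record,
# which is why the step from "GDP up 10%" to this regime is what agreement #8
# is pointing at.
```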
Thanks, that makes the two different periods of time you’re talking about clearer to me.
Datapoint: I skimmed through Eliezer’s post, but read this one from start to finish in one sitting. This post was for me the equivalent of reading the review of a book I haven’t read, where you get all the useful points and nuance. I can’t stress enough how useful that was for me. Probably the most insightful post I have read since “Are we in AI overhang”.
I never even thought about super-AI dangers before coming to this site, only sub-AI dangers. However, IF these claims are true, then there should be delays imposed on AI research. There would be no alternative.
It should be done in a way that would not slow down the type of progress we really want: inventing a way to defeat the problem of death using technology. The money that would be invested in inventing super powerful hyper-computer minds should instead be invested in inventing a single standard design of powerful “brain chip”. Each brain chip would contain all the information extracted from a single human brain, and could replace that brain’s existence in a durable VR environment.
It goes without saying this alternative research program would be much, much slower and more expensive than just inventing a single superhuman hyper-AI. It might take a century to invent a way to extract and back up the contents of a single human brain. And that is just too long. In fact it’s intolerable because everyone alive today would still have to die, and be lost forever.
So it would still be necessary to invent a single, narrowly focused hyper-AI, that would have only ONE task. It would be to invent a way to “transfer” human minds from perishable brains into a more durable medium. After completing that task, the hyper-AI would be shut down.
Very broadly, in 2030 it will still be fairly weird and under-substantiated to say that a dev’s project might accidentally turn everyone’s atoms into ML hardware, or might accidentally cause a Dyson sphere to be built.
I’m not totally sure what you’re referring to, but if you’re talking about Paul’s guess of “~15% on singularity by 2030 and ~40% on singularity by 2040”, then I want to point out that, looking at these two questions, his prediction seems in line with the Metaculus community prediction.
I don’t think it will ever seem plausible for an accident to turn everyone’s atoms into ML hardware, though, because we will probably remain close to an equilibrium with no free energy for powerful AI to harvest.
I disagree with the community on that. Knocking out the Silver Turing test, Montezuma’s Revenge (in the way described), 90% equivalent on Winogrande, and 75th percentile on the maths SAT will either take longer to actually be demonstrated in a unified ML system, OR it will happen way more than 39 months before “an AI which can perform any task humans can perform in 2021, as well or superior to the best humans in their domain”, which is incredibly broad. If the questions mean what they are written to mean, as I read them, the gap is a hell of a lot more than 39 months (the median community estimate).
The thing I said is about some important scenarios described by people giving significant probability to a hostile hard takeoff scenario. I included the comment here in this subthread because I don’t think it contributed much to the discussion.