Daniel Kokotajlo answers How did LW update p(doom) after LLMs blew up?

Daniel Kokotajlo 22 Apr 2023 21:57 UTC
14 points
−4
I disagree with your premise; what’s currently happening is very much in-distribution for what was prophesied. It’s definitely got a few surprises in it, but “much more difficult to FOOM” and the other things you list aren’t among them IMO.

I agree that predict-the-world-first, then-develop-agency (and do it via initially-human-designed-bureaucracies) is a safer AGI paradigm than e.g. “train a big NN to play video games and gradually expand the set of games it can play until it can play Real Life.” (credit to Jan Leike for driving this point home to me). I don’t think this means things will probably be fine; I think things will probably not be fine.

We could have had CAIS (Comprehensive AI Services) though, and that would have been way safer still. (At least, five years ago more people seemed to think this; I was not among them.) Alas, things don’t seem to be heading in that direction.
- jacob_cannell 23 Apr 2023 1:34 UTC
  13 points
  −2
  Parent
  By “what was prophesied”, I’m assuming you mean EY’s model of the future as written in the Sequences and, moreover, in the Hanson FOOM debates.
  
  EY’s foom model goes something like this:
  - humans are nowhere near the limits of intelligence—not only in terms of circuit size, but also crucially in terms of energy efficiency and circuit/algorithm structure
  - biology is also not near physical limits: there is great room for improvement (i.e. strong nanotech)
  - mindspace is wide and humans occupy only a narrow slice of it
  So someday someone creates an AGI, and then it can “rewrite its source code” to create a stronger or at least faster thinker, quickly bottoming out in a completely alien mind far more powerful than humans which then quickly creates strong nanotech and takes over the world.
  
  But he was mostly completely wrong here—because human brains are actually efficient, and biology is actually pretty much pareto optimal so we can mostly rule out strong nanotech.
  
  So instead we are more slowly advancing towards brain-like AGI, where we train ANNs through distillation on human thoughts to get AGI designed in the image of the human mind, which thinks human-like thoughts, including our various cognitive biases and heuristics. These AGIs cannot ‘rewrite their source code’ any more than you or I can (which is to say you or I or an AGI could write the source code for a new AGI architecture … and then spend $1B training it …).
  
  So even though hanson was somewhat wrong in the one specific detail that our AGI is not literal scanned brain emulations, it is far far closer to brain emulations than the de novo alien AGI EY predicted (because you don’t actually need to scan a brain to recreate a human-like mind, distillation works pretty well), and this isn’t a very important distinction regardless. So hanson’s model is far closer to reality.
  - Daniel Kokotajlo 23 Apr 2023 2:22 UTC
    14 points
    12
    Parent
    Upvoted for quality argument/comment, but agreement-downvoted.
    I wasn’t referring specifically to Yudkowsky’s views, no.
    I disagree that energy efficiency is relevant, either as a part of Yudkowsky’s model or as a constraint on FOOM.
    I also disagree that nanotech possibility is relevant. I agree that Yud is a big fan of nanotech, but FOOM followed by rapid world takeover does not require nanotech.
    I think mindspace is wide. It may not be wide in the ways your interpretation of Yud thinks it is, but it’s wide in the relevant sense—there’s lots of room for improvement in general intelligence, and human values are complex/fragile.
    Thanks for the link to Hanson’s old post; it’s a good read! I stand by my view that Yudkowsky’s model is closer to reality than Hanson’s.
    - jacob_cannell 23 Apr 2023 4:01 UTC
      6 points
      1
      Parent
      
      I disagree that energy efficiency is relevant, either as a part of Yudkowsky’s model or as a constraint on FOOM.
      
      I said efficiency in general, not energy efficiency specifically. Assume moore’s law is over now, and the brain is fully flop efficient, such that training AGI requires at least 1e24 flops (and perhaps even 1e23B memops) on a 1e13B+ model. There is no significant further room for any software or hardware improvement—at all. In that world, is EY’s FOOM model correct in the slightest?
      
      Everything about foom depends on efficiency of AGI vs the brain.
      
      You are also probably mistaken that efficiency is not a part of EY’s model, in part because he seems to agree that foom depends on thermodynamic efficiency improvement over the brain, and explicitly said so a bit over a year ago:
      
      Which brings me to the second line of very obvious-seeming reasoning that converges upon the same conclusion—that it is in principle possible to build an AGI much more computationally efficient than a human brain—namely that biology is simply not that efficient, and especially when it comes to huge complicated things that it has started doing relatively recently.
      
      ATP synthase may be close to 100% thermodynamically efficient, but ATP synthase is literally over 1.5 billion years old and a core bottleneck on all biological metabolism. Brains have to pump thousands of ions in and out of each stretch of axon and dendrite, in order to restore their ability to fire another fast neural spike. The result is that the brain’s computation is something like half a million times less efficient than the thermodynamic limit for its temperature—so around two millionths as efficient as ATP synthase. And neurons are a hell of a lot older than the biological software for general intelligence!
      
      The software for a human brain is not going to be 100% efficient compared to the theoretical maximum, nor 10% efficient, nor 1% efficient, even before taking into account the whole thing with parallelism vs. serialism, precision vs. imprecision, or similarly clear low-level differences.
      
      This is a critical flaw in his model, which spurred me to write an entire post to refute.
      
      I also disagree that nanotech possibility is relevant. I agree that Yud is a big fan of nanotech, but FOOM followed by rapid world takeover does not require nanotech.
      
      I also agree that nanotech is not that relevant (unless you are talking about practical top-down nanotech, aka chip lithography), but I was discussing EY’s model, in which strong nanotech is important.
      - Daniel Kokotajlo 23 Apr 2023 16:47 UTC
        8 points
        0
        Parent
        I said efficiency in general, not energy efficiency specifically.
        Your link went to a post which we had previously argued about… Your wonderful post goes into all sorts of details about the efficiency of the brain, most centrally energy efficiency, but doesn’t talk about the kinds of efficiency that matter most. The kind of efficiency that matters most is something like “Performance on various world-takeover and R&D tasks, as a function of total $, compute, etc. initially controlled.” Here are the kinds of efficiency you talk about in that post (direct quote):
        energy efficiency in ops/J
        spatial efficiency in ops/mm^2 or ops/mm^3
        speed efficiency in time/delay for key learned tasks
        circuit/compute efficiency in size and steps for key low level algorithmic tasks
        learning/data efficiency in samples/observations/bits required to achieve a level of circuit efficiency, or per unit thereof
        Yeah, those are all interesting and worth thinking about but not what matters at the end of the day. (To be clear, I am not yet convinced by your arguments in the post, but that’s a separate discussion) Consider the birds vs. planes analogy. My guess is that planes still aren’t as efficient as birds in a bunch of metrics (energy expended per kg per mile travelled, dollar cost to manufacture per kg, energy cost to manufacture per kg...) but that hasn’t stopped planes from being enormously useful militarily and economically, much more so than birds. (We used to use birds to carry messages; occasionally people have experimented with them for military purposes also e.g. anti-drone warfare).
        Assume moore’s law is over now, and the brain is fully flop efficient, such that training AGI requires at least 1e24 flops (and perhaps even 1e23B memops) on a 1e13B+ model. There is no significant further room for any software or hardware improvement—at all. In that world, is EY’s FOOM model correct in the slightest?
        Funnily enough, I think these assumptions are approximately correct* & yet I think once we get human-level AGI, we’ll be weeks rather than years from superintelligence. If you agree with me on this, then it seems a bit unfair to dunk on EY so much, even if he was wrong about various kinds of brain efficiency. Basically, if he’s wrong about these kinds of brain efficiency, then the maximum limits of intelligence reachable by FOOM are lower than Yud thought, and also the slope of the intelligence explosion will probably be a bit less steep. And I’m grateful that your post exists carefully working through the issues there. But quantitatively if it still takes only a few weeks to reach superintelligence (by which I mean AGI which is significantly more competent than the best-ever humans at task X, for all relevant intellectual tasks), then the bottom line conclusions Yudkowsky drew appear to be correct, no?
        
        (I’d like to give more concrete details about what I expect singularity to look like, but I’m hesitant because I’m a bit constrained in what I can and should say. I’d be curious though to hear your thoughts on Gwern’s fictional takeover story, which I think is unrealistic in a bunch of ways but am curious to hear whether it’s violating any of the efficiency limits you’ve argued for in that brain efficiency post, and then how the story would need to be changed in order to respect those limits.)
        
        *To elaborate: I’m anticipating a mostly-software singularity, not hardware-based, so while I do think there’s probably significant room for improvement in hardware I don’t think it matters to my bottom line. I also expect that the first AGIs will be trained on more than 1e24 FLOP, and while I no longer think that 1e13 parameters will be required, I think it’s quite plausible that 1e13 parameters will be required. I guess the main way in which I substantively disagree with your assumptions is the “no significant further room for any software improvement at all” bit. If we interpret it narrowly as no significant further room for improvement in the list of efficiency dimensions you gave earlier, then sure, I’m happy to take that assumption on board. If we interpret it broadly as no significant further room for capabilities-per-dollar, or capabilities-per-flop, then I don’t accept that assumption, and claim that you haven’t done nearly enough to establish it, and instead seem to be making a similar mistake to a hypothetical bird enthusiast in 1900 who declared that planes would never outcompete pigeons because it wasn’t possible to be substantially more efficient than birds.
        jacob_cannell 23 Apr 2023 18:31 UTC
        2 points
        0
        Parent
        
        The kind of efficiency that matters most is something like “Performance on various world-takeover and R&D tasks, as a function of total $, compute, etc. initially controlled.” Here are the kinds of efficiency you talk about in that post
        
        Efficiency in terms of intelligence/$ is obviously downstream dependent on the various lower level metrics I cited.
        
        Funnily enough, I think these assumptions are approximately correct* & yet I think once we get human-level AGI, we’ll be weeks rather than years from superintelligence.
        
        I may somewhat agree, depending on how we define SI. However the current transformer GPU paradigm seems destined for a slowish takeoff. GPT4 used perhaps 1e25 flops and produced only a proto-AGI (which ironically is far more general than any one human, but still missing critical action/planning skills/experience), and it isn’t really feasible to continue that scaling to 1e27 flops and beyond any time soon.
        
        If you agree with me on this, then it seems a bit unfair to dunk on EY so much, even if he was wrong about various kinds of brain efficiency.
        
        I don’t think it’s unfair at all. EY’s unjustified claims are accepted at face value by too many people here, but in reality his sloppy analysis results in a poor predictive track record. The AI/ML folks who dismiss the LW doom worldview as crankish are justified in doing so if this is the best argument for doom.
        
        But quantitatively if it still takes only a few weeks to reach superintelligence—by which I mean AGI which is significantly more competent than the best-ever humans at task X, for all relevant intellectual tasks—then the bottom line conclusions Yudkowsky drew appear to be correct, no?
        
        I’m not sure what the “only a few weeks” measures, but I’ll assume you are referring to the duration of a training run. For various reasons I believe this will tend to be a few months or more for the most competitive models at least for the foreseeable future, not a few weeks.
        
        We already have proto-AGI in the form of GPT4 which is already more competent than the average human at most white-collar, non-robotic tasks. Further increase in generality is probably non-useful, most of the further value will come from improving agentic performance to increasingly out-compete the most productive/skilled humans in valuable skill niches—ie going for more skill depth rather than width. This may require increasing specialization and larger parameter counts—for example if creating the world’s best lisp programmer requires 1T params just by itself, that will result in pretty slow takeoff from here.
        
        I also suspect it may soon be possible to have a (very expensive) speed-intelligence that is roughly human-level in ability but thinks 100x or 1000x faster, but that isn’t the kind of FOOM EY predicted. That’s a scenario that I, Hanson, and others predicted to varying degrees: human-shaped minds running at high speeds. Those will necessarily be brain-like AGI, as the brain is simply what intelligence optimized for efficiency, and especially low latency, looks like. Digital minds can use their much higher clock rate to run minimal-depth brain-like circuits at high speeds or much deeper circuits at low speeds, but the evidence now pretty overwhelmingly favors the former over the latter, as I predicted well in advance.
        
        I’d be curious though to hear your thoughts on Gwern’s fictional takeover story, which I think is unrealistic in a bunch of ways but am curious to hear whether it’s violating any of the efficiency limits
        
        The central premise of the story is that an evolutionary-search auto-ML process running on future TPUs and using less than 5e24 flops (around the training cost of GPT4) suddenly results in an SI, in a future world that seems to completely lack even human-level AGI. No, I don’t think that’s realistic at all, because the brain is efficient and just human-level AGI requires around that much compute. SI requires far more. The main caveat, of course, as I already mentioned, is that once you train a human-level AGI you could throw a bunch of compute at it to run it faster, but doing that doesn’t actually increase the net human-power of the resulting mind (vs. spending the same compute on N agent instances in parallel).
        
        If we interpret it broadly as no significant further room for capabilities-per-dollar, or capabilities-per-flop, then I don’t accept that assumption, and claim that you haven’t done nearly enough to establish it,
        
        Essentially all recent progress comes from hardware, not software. Some people here like to cite a few works trying to measure software progress, but those conclusions and analyses are mostly wrong (a tangent for another thread). The difference between GPT1 and GPT4 is almost entirely due to throwing more money at hardware, combined with hardware advances from nvidia/TSMC. OpenAI’s open secret of success is simply that they were the first to test the scaling hypothesis, which, more than anything else, is a validation of my predictive model^[1].
        
        The hardware advances are about to peter out, and scaling up the spend on supercomputer training by another 100x from ~$1B ~~is not really an option~~ seems unlikely anytime soon due to poor scaling of supercomputers of that size and various attendant risks^[2].
        
        and instead seem to be making a similar mistake to a hypothetical bird enthusiast in 1900 who declared that planes would never outcompete pigeons because it wasn’t possible to be substantially more efficient than birds.
        
        I never said AGI wouldn’t outcompete humans; on the contrary, my model very much has been AGI or early SI by the end of this decade and strong singularity before 2050. But the brain is actually efficient, and it just takes a lot of compute to reverse engineer the brain, and Moore’s law is ending. Moravec’s model was mostly correct, but so was Hanson’s (because Hanson’s model is very much an AGI-requires-virtual-brains model, and he’s carefully thought out much of the resulting economics).
        
        ^[1] I should point out that for anthropic reasons we should obviously never expect to witness the endpoint of EY’s doom model, but that model still makes some tentatively different intermediate predictions which have mostly all been falsified.
        
        ^[2] A $100B one month training run would require about 50 million high-end GPUs (which cost about $2,000 a month each), and tens of gigawatts of power. Nvidia ships well under a million flagship GPUs per year.
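As a rough check, the Fermi estimate above (a $100B one-month run at about $2,000 per GPU-month) can be reproduced in a few lines of Python. This is only a sketch under stated assumptions: the budget, the monthly GPU cost, and the one-million-per-year shipment bound come from the estimate itself, while the 500 W per-GPU power draw (`WATTS_PER_GPU`) is a round number assumed here for illustration and is not a figure from the thread. The function names are hypothetical.

```python
# Fermi estimate for a hypothetical $100B, one-month training run.
BUDGET_USD = 100e9                # total spend on the run
GPU_COST_PER_MONTH_USD = 2_000    # rental cost of one high-end GPU per month
RUN_MONTHS = 1                    # duration of the run
WATTS_PER_GPU = 500               # ASSUMPTION: rough per-GPU draw, not from the thread
ANNUAL_GPU_SHIPMENTS = 1_000_000  # upper bound ("well less than a million" per year)


def gpus_needed(budget_usd, cost_per_gpu_month_usd, months):
    """Number of GPUs the budget rents for the full duration of the run."""
    return budget_usd / (cost_per_gpu_month_usd * months)


def power_gigawatts(n_gpus, watts_per_gpu):
    """Total GPU power draw in gigawatts (ignores cooling and networking)."""
    return n_gpus * watts_per_gpu / 1e9


def years_of_shipments(n_gpus, annual_shipments):
    """How many years of shipments at the given rate the run would consume."""
    return n_gpus / annual_shipments


n_gpus = gpus_needed(BUDGET_USD, GPU_COST_PER_MONTH_USD, RUN_MONTHS)
print(f"GPUs needed: {n_gpus:,.0f}")
print(f"Power draw:  {power_gigawatts(n_gpus, WATTS_PER_GPU):.0f} GW")
print(f"Shipments:   {years_of_shipments(n_gpus, ANNUAL_GPU_SHIPMENTS):.0f} years at the assumed rate")
```

Under these inputs the GPU count comes out at 50 million, the power draw lands in the tens of gigawatts, and the run would need roughly 50 years of shipments at the bounding rate, which is where the "at least an OOM" increase in GPU output comes from. Changing `WATTS_PER_GPU` moves the power figure proportionally, so that number should be read as an order-of-magnitude guess.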
        
        habryka 23 Apr 2023 18:57 UTC
        4 points
        0
        Parent
        Not commenting on this whole thread, which I do have a lot of takes about that I am still processing, but a quick comment on this line:
        The hardware advances are about to peter out, and scaling up the spend on supercomputer training by another 100x from ~$1B is not really an option.
        I don’t see any reason why we wouldn’t see a $100B training run within the next few years. $100B is not that much (it’s roughly a third of Google’s annual revenue, so if they really see competition in this domain as an existential threat, they alone might be able to fund a training run like this).
        It might have to involve some collaboration of multiple tech companies, or some government involvement, but I currently expect that if scaling continues to work, we are going to see a $100B training run (though like, this stuff is super hard to forecast, so I am more like 60% on this, and also wouldn’t be surprised if it didn’t happen).
        jacob_cannell 23 Apr 2023 20:05 UTC
        6 points
        0
        Parent
        In retrospect I actually somewhat agree with you, so I edited that line and marked the change with a strike-through. Yes, a $100B training run is an option in theory, but it is unlikely to translate to a 100x increase in training compute due to datacenter scaling difficulties, and this is also greater than OpenAI’s estimated market cap. (I also added a note with a quick Fermi estimate showing that a training run of that size would require massively increasing nvidia’s GPU output by at least an OOM.) For various reasons I expect even those with pockets that deep to instead invest more in a number of GPT4-size runs exploring alternate training paths.