paulfchristiano comments on paulfchristiano’s Shortform

paulfchristiano 18 Feb 2023 18:29 UTC
61 points
One way of viewing takeoff speed disagreements within the safety community is: most people agree that growth will eventually be explosively fast, and so the question is “how big an impact does AI have prior to explosive growth?” We could quantify this by looking at the economic impact of AI systems prior to the point when AI is powerful enough to double output each year.
(We could also quantify it via growth dynamics, but I want to try to get some kind of evidence further in advance which requires looking at AI in particular—on both views, AI only has a large impact on total output fairly close to the singularity.)
The “slow takeoff” view is that general AI systems will grow to tens of trillions of dollars a year of revenue in the years prior to explosive growth, and so by the time they have automated the entire economy it will look like a natural extension of the prior trend.
(Cost might be a more robust measure than revenue, especially if AI output is primarily reinvested by large tech companies, and especially on a slow takeoff view with a relatively competitive market for compute driven primarily by investors. Revenue itself is very sensitive to unimportant accounting questions, like what transactions occur within a firm vs between firms.)
The fast takeoff view is that the pre-takeoff impact of general AI will be… smaller. I don’t know exactly how small, but let’s say somewhere between $10 million and $10 trillion, spanning 6 orders of magnitude. (This reflects a low end that’s like “ten people in a basement” and a high end that’s just a bit shy of the slow takeoff view.)
It seems like growth in AI has already been large enough to provide big updates in this discussion. I’d guess total revenue from general and giant deep learning systems^[1] will probably be around $1B in 2023 (and perhaps much higher if there is a lot of stuff I don’t know about). It also looks on track to grow to $10 billion over the next 2-3 years if not faster. It seems easy to see how the current technology could do $10-100 billion per year of revenue with zero regulatory changes or further technology progress.
If the “fast takeoff” prediction about pre-takeoff AGI impact was uniformly distributed between $10 million and $10 trillion, that would mean we’ve already run through ¹⁄₃ of the probability mass and so should have made a factor of ³⁄₂ update for slow takeoff. By the time we get to $100 billion/year of impact we will have run through ²⁄₃ of the probability mass.
My sense is that fast takeoff proponents are not owning the predictions of their view and now acknowledging this fact as evidence. Eliezer in particular has frustrated me on this point. These observations are consistent with fast takeoff (which as far as I can tell practically just says “anything goes” until right at the end), but that they are basically implied by the slow takeoff model.
(One might object to this perspective by saying that there is no analogous form of evidence that could ever support fast takeoff. But I think that’s just how it is—if Alice thinks that the world will end at t=1 and Bob thinks the world will end at a uniformly random time between t=0 and t=1, then Bob’s view will just inevitably become less likely as time passes without the world ending.)
1. ^
  It’s unclear if you should treat “general and giant deep learning systems” as the relevant category, though I think it’s probably right if you want to evaluate the views of actual slow takeoff proponents in the safety community. And more broadly, within that community I expect a lot of agreement that these systems are general intelligences that are the most relevant comparison point for transformative AI.
What links here?
- TurnTrout's comment on TurnTrout’s shortform feed by TurnTrout (18 Feb 2023 19:00 UTC; 9 points)
- Vladimir_Nesov's comment on moridinamael’s Shortform by moridinamael (2 Mar 2023 7:35 UTC; 6 points)
- Steven Byrnes 21 Feb 2023 16:34 UTC
  17 points
  Parent
  [For reference, my view is “probably fast takeoff, defined by <2 years between widespread common knowledge of basically how AGI algorithms can work, and the singularity.” (Is that “fast”? I dunno the definitions.) ]
  My view is that LLMs are not part of the relevant category of proto-AGI; I think proto-AGI doesn’t exist yet. So from my perspective,
  - you’re asking for a prediction about how much economic impact comes from a certain category of non-proto-AGI software,
  - you’re noting that the fast takeoff hypothesis doesn’t constrain this prediction at all,
  - and therefore you’re assigning the fast takeoff hypothesis a log-uniform prior as its prediction.
  I don’t think this is a valid procedure.
  ANALOGY:
  The Law of Conservation of Energy doesn’t offer any prediction about the half-life of tritium. It would be wrong to say that it offers an implicit prediction about the half-life of tritium, namely, log-uniform from 1 femtosecond to the lifetime of the universe, or whatever. It just plain doesn’t have any prediction at all!
  Really, you have to distinguish two things:
  - “The Law of Conservation of Energy”
  - “My whole worldview, within which The Law of Conservation of Energy is just one piece.”
  If you ask me to predict the total energy of the tritium decay products, I would make a prediction using Conservation of Energy. If you ask me to predict the half-life of tritium, I would make a prediction using other unrelated things that I believe about nuclear physics.
  And that’s fine!
  And in particular, the failure of Conservation of Energy here is not a failure at all; indeed it’s not any evidence that I’m wrong to believe Conservation of Energy. Right?
  BACK TO AI:
  Again, you’re asking for a prediction about how much economic impact comes from a certain category of (IMO)-non-AGI AI software. And I’m saying “my belief in fast takeoff doesn’t constrain my answer to that question—just like my belief in plate tectonics doesn’t constrain my answer to that question”. I don’t think any less of the fast takeoff hypothesis on account of that fact, any more than I think less of plate tectonics.
  So instead, I would come up with a prediction in a different way. What do these systems do now, what do I expect them to do in the future, how big are relevant markets, etc.?
  I don’t know what my final answers would be, but I feel like you’re not allowed to put words in my mouth and say that my prediction is “log-uniform from $10M to $10T”. That’s not my prediction!
  Sorry if I’m misunderstanding.
  - paulfchristiano 21 Feb 2023 18:28 UTC
    10 points
    Parent
    I don’t think any less of the fast takeoff hypothesis on account of that fact, any more than I think less of plate tectonics.
    But if non-AGI systems in fact transform the world before AGI is built, then I don’t think I should care nearly as much about your concept of “AGI.” That would just be a slow takeoff, and would probably mean that AGI isn’t the relevant thing for our work on mitigating AI risk.
    So you can have whatever views you want about AGI if you say they don’t make a prediction about takeoff speed. But once you are making a claim about takeoff speed, you are also making a prediction that non-AGI systems aren’t going to transform the world prior to AGI.
    I don’t know what my final answers would be, but I feel like you’re not allowed to put words in my mouth and say that my prediction is “log-uniform from $10M to $10T”. That’s not my prediction!
    It seems to me like your view about takeoff is only valid if non-AGI systems won’t be that impactful prior to AGI, and in particular will have <$10 trillion of impact or something like that (otherwise AGI is not really related to takeoff speed). So somehow you believe that. And therefore some kind of impact is going to be evidence against that part of your view. It might be that the entire update occurs as non-AGIs move from $1 trillion to $10 trillion of impact, because that’s the step where you concentrated the improbability.
    In some sense if you don’t say any prediction then I’m “not allowed” to update about your view based on evidence, because I don’t know which update is correct. But in fact I’m mostly just trying to understand what is true, and I’m trying to use other people’s statements as a source of hypotheses and intuitions and models of the world, and regardless of whether those people state any predictions formally I’m going to be constantly updating about what fits the evidence.
    The most realistic view I see that implies fast takeoff without making predictions about existing systems is:
    You have very short AGI timelines based on your reasoning about AGI.
    Non-AGI impact simply can’t grow quickly enough to be large prior to AGI.
    For example, if you think the median time to brain-like AGI is <10 years, then I think that most of our disagreement will be about that.
    - Steven Byrnes 21 Feb 2023 19:21 UTC
      8 points
      Parent
      Thanks!
      But if non-AGI systems in fact transform the world before AGI is built, then I don’t think I should care nearly as much about your concept of “AGI.”
      Fair enough! I do in fact expect that AI will not be transformative-in-the-OpenPhil-sense (i.e. as much or more than the agricultural or industrial revolution) unless that AI is importantly different from today’s LLMs (e.g. advanced model-based RL). But I don’t think we’ve gotten much evidence on this hypothesis either way so far, right?
      For example: I think if you walk up to some normal person and say “We already today have direct (albeit weak) evidence that LLMs will evolve into something that transforms the world much much more than electrification + airplanes + integrated circuits + the internet combined”, I think they would say “WTF?? That is a totally wild claim, and we do NOT already have direct evidence for it”. Right?
      If you are mean “transformative” in a weaker-than-OpenPhil sense, well the internet “transformed the world” according to everyday usage, and the impact of the internet on the economy is (AFAICT) >$10T. I suppose that the fact that the internet exists is somewhat relevant to AGI x-risk, but I don’t think it’s very relevant. I think that people trying to make AGI go well in a hypothetical world where the internet doesn’t exist would be mostly doing pretty similar things as we are.
      The most realistic view I see that implies fast takeoff without making predictions about existing systems is:
      You have very short AGI timelines based on your reasoning about AGI.
      Non-AGI impact simply can’t grow quickly enough to be large prior to AGI.
      Why not “non-AGI AI systems will eventually be (at most) comparably impactful to the internet or automobiles or the printing press, before plateauing, and this is ridiculously impactful by everyday standards, but it doesn’t strongly change the story of how we should be thinking about AGI”?
      BTW, why do we care about slow takeoff anyway?
      Slow takeoff suggests that we see earlier smaller failures that have important structural similarity to later x-risk-level failures
      Slow takeoff means the world that AGI will appear in will be so different from the current world that it totally changes what makes sense to do right now about AGI x-risk.
      (Anything else?)
      I can believe that “LLMs will transform the world” comparably to how the internet or the integrated circuit has transformed the world, without expecting either of those bullets to be true, right?
      - paulfchristiano 21 Feb 2023 20:28 UTC
        11 points
        Parent
        I think we are getting significant evidence about the plausibility that deep learning is able to automate real human cognitive work, and we are seeing extremely rapid increases in revenue and investment. I think those observations have extremely high probability if big deep learning systems will be transformative (this is practically necessary to see!), and fairly low base rate (not clear exactly how low but I think 25% seems reasonable and generous).
        So yeah, I think that we have gotten considerable evidence about this, more than a factor of 4. I’ve personally updated my views by about a factor of 2, from a 25% chance to a 50% chance that scaling up deep learning is the real deal and leads to transformation soon. I don’t think “That’s a wild claim!” means you don’t have evidence, that’s not how evidence works.
        I think that normal people who follow tech have also moved their views massively. They take the possibility of crazy transformations from deep leaning much more seriously than they did 5 years ago. They are much more likely to view deep learning as producing systems analogous to humans in economically relevant ways. And so on.
        Why not “non-AGI AI systems will eventually be (at most) comparably impactful to the internet or automobiles or the printing press, before plateauing, and this is ridiculously impactful by everyday standards, but it doesn’t strongly change the story of how we should be thinking about AGI”?
        That view is fine, but now I’m just asking what your probability distribution is over the location of that plateau. Is it no evidence to see LMs at $10 billion? $100 billion? Is your probability distribution just concentrated with 100% of its mass between $1 trillion and $10 trillion? (And if so: why?)
        It’s maybe plausible to say that your hypothesis is just like mine but with strong cutoffs at some particular large level like $1 trillion. But why? What principle makes an impact of $1 trillion possible but $10 trillion impossible?
        Incidentally, I don’t think the internet adds $10 trillion of value. I agree that as I usually operationalize it the soft takeoff view is not technically falsified until AI gets to ~$100 trillion per year (though by $10 trillion I think fast takeoff has gotten considerably less feasible in addition to this update, and the world will probably be significantly changed and prepared for AI in a way that is a large part of what matters about slow takeoff), so we could replace the upper end of that interval with $100 trillion if we wanted to be more generous.
        Steven Byrnes 21 Feb 2023 22:11 UTC
        8 points
        Parent
        Hmm, for example, given that the language translation industry is supposedly $60B/yr, and given that we have known for decades that AI can take at least some significant chunk out of this industry at the low-quality end [e.g. tourists were using babelfish.altavista.com in the 1990s despite it sucking], I think someone would have to have been very unreasonable indeed to predict in advance that there will be an eternal plateau in the non-AGI AI market that’s lower than $1B/yr. (And that’s just one industry!) (Of course, that’s not a real prediction in advance ¯\_(ツ)_/¯ )
        What I was getting at with “That’s a wild claim!” is that your theory makes an a-priori-obvious prediction (AI systems will grow to a >$1B industry pre-FOOM) and a controversial prediction (>$100T industry), and I think common sense in that situation is to basically ignore the obvious prediction and focus on the controversial one. And Bayesian updating says the same thing. The crux here is whether or not it has always been basically obvious to everyone, long in advance, that there’s at least $1B of work on our planet that can be done by non-FOOM-related AI, which is what I’m claiming in the previous paragraph where I brought up language translation. (Yeah I know, I am speculating about what was obvious to past people without checking what they said at the time—a fraught activity!)
        Yeah deep learning can “automate real human cognitive work”, but so can pocket calculators, right? Anyway, I’d have to think more about what my actual plateau prediction is and why. I might reply again later. :)
        I feel like your thinking here is actually mostly coming from “hey look at all the cool useful things that deep learning can do and is doing right now”, and is coming much less from the specific figure “$1B/year in 2023 and going up”. Is that fair?
        paulfchristiano 22 Feb 2023 0:16 UTC
        13 points
        Parent
        I don’t think it’s obvious a priori that training deep learning to imitate human behavior can predict general behavior well enough to carry on customer support conversations, write marketing copy, or write code well enough to be helpful to software engineers. Similarly it’s not obvious whether it will be able to automate non-trivial debugging, prepare diagrams for a research paper, or generate plausible ML architectures. Perhaps to some people it’s obvious there is a divide here, but to me it’s just not obvious so I need to talk about broad probability distributions over where the divide sits.
        I think ~$1B/year is a reasonable indicator of the generality and extent of current automation. I really do care about that number (though I wish I knew it better) and watching it go up is a big deal. If it can just keep being more useful with each passing year, I will become more skeptical of claims about fundamental divides, even if after the fact you can look at each thing and say “well it’s not real strong cognition.” I think you’ll plausibly be able to do that up through the end of days, if you are shameless enough.
        I think the big ambiguity is about how you mark out a class of systems that benefit strongly from scale (i.e. such that doubling compute more than doubles economic value) and whether that’s being done correctly here. I think it’s fairly clear that the current crop of systems are much more general and are benefiting much more strongly from scale than previous systems. But it’s up for debate.
        Steven Byrnes 9 Mar 2023 20:38 UTC
        6 points
        Parent
        Hmm. I think we’re talking past each other a bit.
        I think that everyone (including me) who wasn’t expecting LLMs to do all the cool impressive things that they can in fact do, or who wasn’t expecting LLMs to improve as rapidly as they are in fact improving, is obligated to update on that.
        Once I do so update, it’s not immediately obvious to me that I learn anything more from the $1B/yr number. Yes, $1B/yr is plenty of money, but still a drop in the bucket of the >$1T/yr IT industry, and in particular, is dwarfed by a ton of random things like “corporate Linux support contracts”. Mostly I’m surprised that the number is so low!! (…For now!!)
        But whatever, I’m not sure that matters for anything.
        Anyway…
        I did spend considerable time last week pondering where & whether I expect LLMs to plateau. It was a useful exercise; I appreciate your prodding. :)
        I don’t really have great confidence in my answers, and I’m mostly redacting the details anyway. But if you care, here are some high-level takeaways of my current thinking:
        (1) I expect there to be future systems that centrally incorporate LLMs, but also have other components, and I expect these future systems to be importantly more capable, less safe, and more superficially / straightforwardly agent-y than is an LLM by itself as we think of them today.
        IF “LLMs scale to AGI”, I expect that this is how, and I expect that my own research will turn out to be pretty relevant in such a world. More generally, I expect that, in such systems, we’ll find the “traditional LLM alignment discourse” (RLHF fine-tuning, shoggoths, etc.) to be pretty irrelevant, and we’ll find the “traditional agent alignment discourse” (instrumental convergence, goal mis-generalization, etc.) to be obviously & straightforwardly relevant.
        (2) One argument that pushes me towards fast takeoff is pretty closely tied to what I wrote in my recent post:
        Two different perspectives are:
        AGI is about knowing how to do lots of things
        AGI is about not knowing how to do something, and then being able to figure it out.
        I’m strongly in the second camp.…
        The following is a bit crude and not entirely accurate, but to a first approximation I want to say that LLMs have a suite of abstract “concepts” that it has seen in its training data (and that were in the brains of the humans who created that training data), and LLMs are really good at doing mix-and-match compositionality and pattern-match search to build a combinatorial explosion of interesting fresh outputs out of that massive preexisting web of interconnected concepts.
        But I think there are some types of possible processes along the lines of:
        “invent new useful concepts from scratch—even concepts that have never occurred to any human—and learn them permanently, such that they can be built on the future”
        “notice inconsistencies in existing concepts / beliefs, find ways to resolve them, and learn them permanently, such that those mistakes will not be repeated in the future”
        etc.
        I think LLMs can do things like this a little bit, but not so well that you can repeat them in an infinite loop. For example, I suspect that if you took this technique and put it in an infinite loop, it would go off the rails pretty quickly. But I expect that future systems (of some sort) will eventually be able to do these kinds of things well enough to form a stable loop, i.e. the system will be able to keep running this process (whatever it is) over and over, and not go off the rails, but rather keep “figuring out” more and more things, thus rocketing off to outer space, in a way that’s loosely analogous to self-play in AlphaZero, or to a smart human gradually honing in on a better and better understanding of a complicated machine.
        I think this points to an upcoming “discontinuity”, in the sense that I think right now we don’t have systems that can do the above bullet points (at least, not well enough to repeat them in an infinite loop), and I think we will have such systems in the future, and I think we won’t get TAI until we do. And it feels pretty plausible to me (admittedly not based on much!) that it would only take a couple years or less between “widespread knowledge of how to build such systems” and “someone gets an implementation working well enough that they can run it in an infinite loop and it just keeps “figuring out” more and more things, correctly, and thus it rockets off to radically superhuman intelligence and capabilities.”
        (3) I’m still mostly expecting LLMs (and more broadly, LLM-based systems) to not be able to do the above bullet point things, and (relatedly) to plateau at a level where they mainly assist rather than replace smart humans. This is tied to fundamental architectural limitations that I believe transformers have (and indeed, that I believe DNNs more generally have), which I don’t want to talk about…
        (4) …but I could totally be wrong. ¯\_(ツ)_/¯ And I think that, for various reasons, my current day-to-day research program is not too sensitive to the possibility that I’m wrong about that.
        What links here?
        Steven Byrnes's comment on Discussion with Nate Soares on a key alignment difficulty by HoldenKarnofsky (15 Mar 2023 18:39 UTC; 2 points)
        Nathan Helm-Burger 11 Apr 2023 18:26 UTC
        6 points
        Parent
        Steven: as someone who has read all your posts agrees with you on almost everything, this is a point where I have a clear disagreement with you. When I switched from neuroscience to doing ML full-time, some of the stuff I read to get up to speed was people theorizing about impossibly large (infinite or practically so) neural networks. I think that the literature on this does a pretty good job of establishing that, in the limit, neural networks can compute any sort of function. Which means that they can compute all the functions in a human brain, or a set of human brains. Meaning, it’s not a question of whether scaling CAN get us to AGI. It certainly can. It’s a question of when. There is inefficiency in trying to scale an algorithm which tries to brute force learn the relevant functions rather than have them hardcoded in via genetics. I think that you are right that there are certain functions the human brain does quite well that current SoTA LLMs do very poorly. I don’t think this means that scaling LLMs can’t lead to a point where the relevant capabilities suddenly emerge. I think we are already in a regime of substantial compute and data overhang for AGI, and that the thing holding us back is the proper design and integration of modules which emulate the functions of parts of the brain not currently well imitated by LLMs. Like the reward and valence systems of the basal ganglia, for instance. It’s still an open question to me whether we will get to AGI via scaling or algorithmic improvement. Imagine for a moment that I am correct that scaling LLMs could get us there, but also that a vastly more efficient system which borrows more functions from the human brain is possible. What might this scenario look like? Perhaps an LLM gets strong enough to, upon human prompting and with human assistance, analyze the computational neuroscience literature and open source code, and extract useful functions, and then do some combination of intuitively improve their efficiency and brute force test them in new experimental ML architectures. This is not so big a leap from what GPT-4 is capable of. I think that that’s plausibly even a GPT-5 level of skill. Suppose also that these new architectures can be added onto existing LLM base models, rather than needed the base model to be retrained from scratch. As some critical amount of discoveries accumulate, GPT-5 suddenly takes a jump forward in efficacy, enabling it to process the rest of the potential improvements much faster, and then it takes another big jump forward, and then is able to rapidly self-improve with no further need for studying existing published research. In such a scenario, we’d have a foom over the course of a few days which could take us by surprise and lead to a rapid loss of control. This is exactly why I think Astera’s work is risky, even though their current code seems quite harmless on its own. I think it is focused on (some of) the places where LLMs do poorly, but also that there’s nothing stopping the work from being effectively integrated with existing models for substantial capability gains. This is why I got so upset with Astera when I realized during my interview process with them that they were open-sourcing their code, and also when I carefully read through their code and saw great potential there for integrating it with more mainstream ML to the empowerment of both.
        literature examples of the sort of thing I’m talking about with ‘enough scaling will eventually get us there’, even though I haven’t read this particular paper: https://arxiv.org/abs/2112.15577
        https://openreview.net/forum?id=HyGBdo0qFm
        Paul: I think you are making a valid point here. I think your point is (sadly) getting obscured by the fact our assumptions have shifted under our feet since the time when you began to make your point about slow vs fast takeoff.
        I’d like to explain what I think the point you are right on is, and then try to describe how I think we need a new framing for the next set of predictions.
        Several years ago, Eliezer and MIRI generally were frequently emphasizing the idea of a fast take-off that snuck up on us before the world had been much changed by narrow AI. You correctly predicted that the world would indeed be transformed in a very noticeable way by narrow AI before AGI. Eliezer in discussions with you has failed to acknowledge ways in which his views shifted from what they were ~10 years ago towards your views. I think this reflects poorly on him, but I still think he has a lot of good ideas, and made a lot of important predictions well in advance of other people realizing how important this was all going to be. As I’ve stated before, I often find my own views seeming to be located somewhere in-between your views and Eliezer’s wherever you two disagree.
        I think we should acknowledge your point that the world being changed in a very noticeable way by AI before true AGI, just as you have acknowledged Eliezer’s point that once a runaway out-of-human-control human-out-of-the-loop recursive-self-improvement process gets started it could potentially proceed shockingly fast and lead to a loss of humanity’s ability to regain control of the resulting AGI even once we realized what is happening. [I say Eliezer’s point here, not to suggest that you disagreed with him on this point, but simply that he was making this a central part of his predictions from fairly early on.]
        I think the framing we need now is: how can we predict, detect, and halt such a runaway RSI process before it is too late? This is important to consider from multiple angles. I mostly think that the big AI labs are being reasonably wary about this (although they certainly could do better). What concerns me more is the sort of people out in the wild who will take open source code and do dumb or evil stuff with it, Chaos-GPT-style, for personal gain or amusement. I think the biggest danger we face is that affordable open-source models seem to be lagging only a few years behind SotA models, and that the world is full of chaotic people who could (knowingly or not) trigger a runaway RSI process if such a thing is cheap and easy to do.
        In such a strategic landscape, it could be crucially important to figure out how to:
        a) slow down the progress of open source models, to keep dangerous runaway RSI from becoming cheap and easy to trigger
        b) use SotA models to develop better methods of monitoring and preventing anyone outside a reasonably-safe-behaving org from doing this dangerous thing.
        c) improving the ability of the reasonable orgs to self-monitor and notice the danger before they blow themselves up
        I think that it does not make strategic sense to actively hinder the big AI labs. I think our best move is to help them move more safely, while also trying to build tools and regulation for monitoring the world’s compute. I do not think there is any feasible solution for this which doesn’t utilize powerful AI tools to help with the monitoring process. These AI tools could be along the lines of SotA LLMs, or something different like an internet police force made up of something like Conjecture’s CogEms. Or perhaps some sort of BCI or gene-mod upgraded humans (though I doubt we have time for this).
        My view is that algorithmic progress, pointed to by neuroscience, is on the cusp of being discovered, and if those insights are published, will make powerful AGI cheap and available to all competent programmers everywhere in the world. With so many people searching, and the necessary knowledge so widely distributed, I don’t think we can count on keeping this under wraps forever. Rather than have these insights get discovered and immediately shared widely (e.g. by some academic eager to publish an exciting paper who didn’t realize the full power and implications of their discovery), I think it would be far better to have a safety-conscious lab discover these, have a way to safely monitor themselves to notice the danger and potential power of what they’ve discovered. They can then keep the discoveries secret and collaborate with other safety-conscious groups and with governments to set up the worldwide monitoring we need to prevent a rogue AGI scenario. Once we have that, we can move safely to the long reflection and take our time figuring out better solutions to alignment. [An important crux for me here is that I believe that if we have control of an AGI which we know is potentially capable of recursively self-improving beyond our bounds to control it, we can successfully utilize this AGI at its current level of ability without letting it self-improve. If someone convinced me that this was untenable, it would change my strategic recommendations.]
        As @Jed McCaleb said in his recent post, ‘The only way forward is through!’. https://www.lesswrong.com/posts/vEtdjWuFrRwffWBiP/we-have-to-upgrade
        What links here?
        Nathan Helm-Burger's comment on Request to AGI organizations: Share your views on pausing AI progress by Akash (11 Apr 2023 19:06 UTC; 6 points)
        Nathan Helm-Burger's comment on Core of AI projections from first principles: Attempt 1 by tailcalled (11 Apr 2023 18:59 UTC; 2 points)
        Nathan Helm-Burger's comment on Some Intuitions Around Short AI Timelines Based on Recent Progress by Aaron_Scher (11 Apr 2023 21:02 UTC; 2 points)
        Nathan Helm-Burger 17 Apr 2023 15:49 UTC
        4 points
        Parent
        As you can see from this prediction market I made, a lot of people currently disagree with me. I expect this will be a different looking distribution a year from now.
        
        https://manifold.markets/NathanHelmBurger/will-gpt5-be-capable-of-recursive-s?r=TmF0aGFuSGVsbUJ1cmdlcg
        
        Here’s an intuition pump analogy for how I’ve been thinking about this. Imagine that I, as someone with a background in neuroscience and ML was granted the following set of abilities. Would you bet that I, with this set of abilities, would be able to do RSI? I would.
        
        Abilities that I would have if I were an ML model trying to self-improve:
        
        Make many copies of myself, and checkpoints throughout the process.
        
        Work at high speed and in parallel with copies of myself.
        
        Read all the existing scientific literature that seemed potentially relevant.
        
        Observe all the connections between my neurons, all the activations of my clones as I expose them to various stimuli or run them through simulations.
        
        Ability to edit these weights and connections.
        
        Ability to add neurons (up to a point) where they seemed most needed, connected in any way I see fit, initialized with whatever weights I choose.
        
        Ability to assemble new datasets and build new simulations to do additional training with.
        
        Ability to freeze some subsection of a clone’s model and thus more rapidly train the remaining unfrozen section.
        
        Ability to take notes and write collaborative documents with my clones working in parallel with me.
        
        Ok. Thinking about that set of abilities, doesn’t it seem like a sufficiently creative, intelligent, determined general agent could successfully self-improve? I think so. I agree it’s unclear where the threshold is exactly, and when a transformer-based ML model will cross that threshold. I’ve made a bet at ‘GPT-5’, but honestly I’m not certain. Could be longer. Could be sooner...
        Nathan Helm-Burger 11 Apr 2023 21:04 UTC
        2 points
        Parent
        Sorry @the gears to ascension . I know your view is that it would be better for me to be quiet about this, but I think the benefits of speaking up in this case outweigh the potential costs.
        the gears to ascension 11 Apr 2023 21:13 UTC
        4 points
        Parent
        oh, no worries, this part is obvious
        dxu 21 Feb 2023 21:34 UTC
        7 points
        Parent
        Is your probability distribution just concentrated with 100% of its mass between $1 trillion and $10 trillion? (And if so: why?)
        To specifically answer the question in the parenthetical (without commenting on the dollar numbers; I don’t actually currently have an intuition strongly mapping [the thing I’m about to discuss] to dollar amounts—meaning that although I do currently think the numbers you give are in the right ballpark, I reserve the right to reconsider that as further discussion and/or development occurs):
        The reason someone might concentrate their probability mass at or within a certain impact range, is if they believe that it makes conceptual sense to divide cognitive work into two (or more) distinct categories, one of which is much weaker in the impact it can have. Then the question of how this division affects one’s probability distribution is determined almost entirely by the question of what level at which they think the impact of the weaker category will saturate. And that question, in turn, has a lot more to do with the concrete properties they expect (or don’t expect) to see from the weaker cognition type, than it has to do with dollar quantities directly. You can translate the former into the latter, but only via an additional series of calculations and assumptions; the actual object-level model—which is where any update would occur—contains no gear directly corresponding to “dollar value of impact”.
        So when this kind of model encounters LLMs doing unusual and exciting things that score very highly on metrics like revenue, investment, and overall “buzz”… well, these metrics don’t directly lead the model to update. What instead the model considers relevant is whether, when you look at the LLM’s output, that output seems to exhibit properties of cognition that are strongly prohibited by the model’s existing expectations about weak versus strong cognitive work—and if it doesn’t, then the model simply doesn’t update; it wasn’t, in fact, surprised by the level of cognition it observed—even if (perhaps) the larger model embedding it, which does track things like how the automation of certain tasks might translate into revenue/profit, was surprised.
        And in fact, I do think this is what we observe from Eliezer (and from like-minded folk): he’s updated in the sense of becoming less certain about how much economic value can be generated by “weak” cognition (although one should also note that he’s never claimed to be particularly certain about this metric to begin with); meanwhile, he has not updated about the existence of a conceptual divide between “weak” and “strong” cognition, because the evidence he’s been presented with legitimately does not have much to say on that topic. In other words, I think he would say that the statement
        I think we are getting significant evidence about the plausibility that deep learning is able to automate real human cognitive work
        is true, but that its relevance to his model is limited, because “real human cognitive work” is a category spanning (loosely speaking) both “cognitive work that scales into generality”, and “cognitive work that doesn’t”, and that by agglomerating them together into a single category, you’re throwing away a key component of his model.
        Incidentally, I want to make one thing clear: this does not mean I’m saying the rise of the Transformers provides no evidence at all in favor of [a model that assigns a more direct correspondence between cognitive work and impact, and postulates a smooth conversion from the former to the latter]. That model concentrates more probability mass in advance on the observations we’ve seen, and hence does receive Bayes credit for its predictions. However, I would argue that the updates in favor of this model are not particularly extreme, because the model against which it’s competing didn’t actually strongly prohibit the observations in question, only assign less probability to them (and not hugely less, since “slow takeoff” models don’t generally attempt to concentrate probability mass to extreme amounts, either)!
        All of which is to say, I suppose, that I don’t really disagree with numerical likelihoods you give here:
        I think that we have gotten considerable evidence about this, more than a factor of 4. I’ve personally updated my views by about a factor of 2, from a 25% chance to a 50% chance that scaling up deep learning is the real deal and leads to transformation soon.
        but that I’m confused that you consider this “considerable”, and would write up a comment chastising Eliezer and the other “fast takeoff” folk because they… weren’t hugely moved by, like, ~2 bits’ worth of evidence? Like, I don’t see why he couldn’t just reply, “Sure, I updated by around 2 bits, which means that now I’ve gone from holding fast takeoff as my dominant hypothesis to holding fast takeoff as my dominant hypothesis.” And that seems like that degree of update would basically produce the kind of external behavior that might look like “not owning up” to evidence, because, well… it’s not a huge update to begin with?
        (And to be clear, this does require that his prior look quite different from yours. But that’s already been amply established, I think, and while you can criticize his prior for being overconfident—and I actually find myself quite sympathetic to that line of argument—criticizing him for failing to properly update given that prior is, I think, a false charge.)
        paulfchristiano 21 Feb 2023 23:25 UTC
        7 points
        Parent
        Yes, I’m saying that each $ increment the “qualitative division” model fares worse and worse. I think that people who hold onto this qualitative division have generally been qualitatively surprised by the accomplishments of LMs, that when they make concrete forecasts those forecasts have mismatched reality, and that they should be updating strongly about whether such a division is real.
        What instead the model considers relevant is whether, when you look at the LLM’s output, that output seems to exhibit properties of cognition that are strongly prohibited by the model’s existing expectations about weak versus strong cognitive work—and if it doesn’t, then the model simply doesn’t update; it wasn’t, in fact, surprised by the level of cognition it observed—even if (perhaps) the larger model embedding it, which does track things like how the automation of certain tasks might translate into revenue/profit, was surprised.
        I’m most of all wondering how you get high level of confidence in the distinction and its relevance. I’ve seen only really vague discussion. The view that LM cognition doesn’t scale into generality seems wacky to me. I want to see the description of tasks it can’t do.
        In general if someone won’t state any predictions of their view I’m just going to update about your view based on my understanding of what it predicts (which is after all what I’d ultimately be doing if I took a given view seriously). I’ll also try to update about your view as operated by you, and so e.g. if you were generally showing a good predictive track record or achieving things in the world then I would be happy to acknowledge there is probably some good view there that I can’t understand.
        I’m confused that you consider this “considerable”, and would write up a comment chastising Eliezer and the other “fast takeoff” folk because they… weren’t hugely moved by, like, ~2 bits’ worth of evidence? Like, I don’t see why he couldn’t just reply, “Sure, I updated by around 2 bits, which means that now I’ve gone from holding fast takeoff as my dominant hypothesis to holding fast takeoff as my dominant hypothesis.”
        I do think that a factor of two is significant evidence. In practice in my experience that’s about as much evidence as you normally get between realistic alternative perspectives in messy domains. The kind of forecasting approach that puts 99.9% probability on things and so doesn’t move until it gets 10 bits is just not something that works in practice.
        On the slip side, it’s enough evidence that Eliezer is endlessly condescending about it (e.g. about those who only assigned a 50% probability to the covid response being as inept as it was). Which I think is fine (but annoying), a factor of 2 is real evidence. And if I went around saying “Maybe our response to AI will be great” and then just replied to this observation with “whatever covid isn’t the kind of thing I’m talking about” without giving some kind of more precise model that distinguishes, then you would be right to chastise me.
        Perhaps more importantly, I just don’t know where someone with this view would give ground. Even if you think any given factor of two isn’t a big deal, ten factors of two is what gets you from 99.9% to 50%. So you can’t just go around ignoring a couple of them every few years!
        And rhetorically, I’m not complaining about people ultimately thinking fast takeoff is more plausible. I’m complaining about not expressing the view in such a way that we can learn about it based on what appears to me to be multiple bits of evidence, or acknowledging that evidence. This isn’t the only evidence we’ve gotten, I’m generally happy to acknowledge many bits of ways in which my views have moved towards other people’s.
        Vladimir_Nesov 21 Feb 2023 22:58 UTC
        5 points
        Parent
        So one claim is that a theory of post-AGI effects often won’t say things about pre-AGI AI, so mostly doesn’t get updated from pre-AGI observations. My take on LLM alignment asks to distinguish human-like LLM AGIs from stronger AGIs (or weirder LLMs), with theories of stronger AGIs not naturally characterizing issues with human-like LLMs. Like, they aren’t concerned with optimizing for LLM superstimuli while their behavior remains in human imitation regime, where caring for LLM-specific things didn’t have a chance to gain influence. When the mostly faithful imitation nature of LLMs breaks with enough AI tinkering, the way human nature is breaking now towards influence of AGIs, we get another phase change to stronger AGIs.
        
        This seems like a pattern, theories of extremal later phases being bounded within their scopes, saying little of preceding phases that transition into them. If the phase transition boundaries get muddled in thinking about this, we get misleading impressions about how the earlier phases work, while their navigation is instrumental for managing transitions into the much more concerning later phases.
        What links here?
        Vladimir_Nesov's comment on AGI in sight: our look at the game board by Andrea_Miotti (21 Feb 2023 23:23 UTC; 2 points)
- Daniel Kokotajlo 19 Feb 2023 0:27 UTC
  17 points
  Parent
  1. What you are basically saying is “Yudkowsky thought AGI might have happened by now, whereas I didn’t; AGI hasn’t happened by now, therefore we should update from Yud to me by a factor of ~1.5 (and also from Yud to the agi-is-impossible crowd, for that matter)” I agree.
  2. Here’s what I think is going to happen (this is something like my modal projection, obviously I have a lot of uncertainty; also I don’t expect the world economy to be transformed as fast as this projects due to schlep and regulation, and so probably things will take a bit longer than depicted here but only a bit.):
    
    No pressure, but I’d love it if you found time someday to fiddle with the settings of the model at takeoffspeeds.com and then post a screenshot of your own modal or median future. I think that going forward, we should all strive to leave this old “fast vs. slow takeoff” debate in the dust & talk more concretely about variables in this model, or in improved models.
  What links here?
  - Vladimir_Nesov's comment on AGI in sight: our look at the game board by Andrea_Miotti (20 Feb 2023 22:55 UTC; 4 points)
  - paulfchristiano 19 Feb 2023 6:04 UTC
    9 points
    Parent
    I don’t quite know what “AGI might have happened by now means.”
    I thought that we might have built transformative AI by 2023 (I gave it about 5% in 2010 and about 2% in 2018), and I’m not sure that Eliezer and I have meaningfully different timelines. So obviously “now” doesn’t mean 2023.
    If “now” means “When AI is having ~$1b/year of impact,” and “AGI” means “AI that can do anything a human can do better” then yes, I think that’s roughly what I’m saying.
    But an equivalent way of putting it is that Eliezer thinks weak AI systems will have very little impact, and I think weak AI systems will have a major impact, and so the more impact weak AI systems have the more evidence it gives for my view.
    One way of putting it makes it seem like Eliezer would have shorter timelines since AGI might happen at any moment. Another way of putting it makes it seem like Eliezer may have longer timelines because nothing happens in the AGI-runup, and the early AI applications will drive increases in investment and will eventually accelerate R&D.
    I don’t know whether Eliezer in fact has shorter or longer timelines, because I don’t think he’s commented publicly recently. So it seems like either way of putting it could be misleading.
    - Daniel Kokotajlo 20 Feb 2023 6:10 UTC
      5 points
      Parent
      Ah, I’m pretty sure Eliezer has shorter timelines than you. He’s been cagy about it but he sure acts like it, and various of his public statements seem to suggest it. I can try to dig them up if you like.
      If “now” means “When AI is having ~$1b/year of impact,” and “AGI” means “AI that can do anything a human can do better” then yes, I think that’s roughly what I’m saying.
      Yep that’s one way of putting what I said yeah. My model of EY’s view is: Pre-AGI systems will ramp up in revenue & impact at some rate, perhaps the rate that they have ramped up so far. Then at some point we’ll actually hit AGI (or seed AGI) and then FOOM. And that point MIGHT happen later, when AGI is already a ten-trillion-dollar industry, but it’ll probably happen before then. So… I definitely wasn’t interpreting Yudkowsky in the longer-timelines way. His view did imply that maybe nothing super transformative would happen in the run-up to AGI, but not because pre-AGI systems are weak, rather because there just won’t be enough time for them to transform things before AGI comes.
      
      Anyhow, I’ll stop trying to speak for him.
    - JNS 27 Feb 2023 13:50 UTC
      1 point
      Parent
      My model is very discontinues, I try to think of AI as AI (and avoid the term AGI).
      And sure intelligence has some G measure, and everything we have built so far is low G^[1] (humans have high G).
      Anyway, at the core I think the jump will happen when an AI system learns the meta task / goal “Search and evaluate”^[2], once that happens^[3] G would start increasing very fast (versus earlier), and adding resources to such a thing would just accelerate this^[4].
      And I don’t see how that diverges from this reality or a reality where its not possible to get there, until obviously we get there.
      ^
      I can’t speak to what people have built / are building in private.
      ^
      Whenever people say AGI, I think AI that can do “search and evaluate” recursively.
      ^
      And my intuition says that requires a system that has much higher G than current once, although looking at how that likely played out for us, it might be much lower than my intuition leads me to believe.
      ^
      That is contingent on architecture, if we built a system that cannot scale easily or at all, then this wont happen.
  - leogao 19 Feb 2023 2:19 UTC
    7 points
    Parent
    +1 for the push for more quantitative models.
    (though I would register that trying to form a model with so many knobs to turn is really daunting, so I expect I personally will probably procrastinate a bit before actually putting together one, and I anticipate others to maybe feel similar)
    - Daniel Kokotajlo 19 Feb 2023 5:16 UTC
      4 points
      Parent
      I mean it’s not so daunting if you mostly just defer to Tom & accept the default settings, but then tweak a few settings here and there.
      
      Also it’s very cheap to fiddle with each setting one by one to see how much of an effect it has. Most of them don’t have much of an effect, so you only need to really focus on a few of them (such as the training requirements and the FLOP gap)
  - Vladimir_Nesov 19 Feb 2023 0:59 UTC
    4 points
    Parent
    Why slow hardware takeoff (just $10^{6}$ in 4 years, though measuring in dollars is confusing)? I expect it a bit later, but faster, because nanotech breaks manufacturing tech level continuity, channeling theoretical research directly, and with reasonable amount of compute performing relevant research is not a bottleneck. This would go from modern circuit fabs to disassembling the moon and fueling it with fusion (or something of this sort of level of impact), immediately and without any intermediate industrial development process.
    What links here?
    Vladimir_Nesov's comment on Bankless Podcast: 159 - We’re All Gonna Die with Eliezer Yudkowsky by bayesed (20 Feb 2023 21:07 UTC; 6 points)
    - Daniel Kokotajlo 19 Feb 2023 2:01 UTC
      4 points
      Parent
      I don’t think the model applies once you get to strongly superhuman systems—so, by mid-2027 in the scenario depicted. At that point, yeah, I’d expect the whole economy to be furiously bootstrapping towards nanotech or maybe even there already. Then the dissassemblies begin.
      
      Also, as I mentioned, I think the model might overestimate the speed at which new AI advances can be rolled out into the economy, and converted into higher GDP and more/better hardware. Thus I think we completely agree.
- Matthew Barnett 11 Apr 2023 2:23 UTC
  9 points
  Parent
  Very minor note: is there some way we can move away from the terms “slow takeoff” and “fast takeoff” here? My impression is that almost everyone is confused by these terms except for a handful of people who are actively engaged in the discussions. I’ve seen people variously say that slow takeoff means:
  - That AI systems will take decades to go from below human-level to above human level
  - That AI systems will take decades to go from slightly above human-level to radically superintelligent
  - That we’ll never get explosive growth (GDP will just continue to grow at like 3%)
  Whereas the way you’re actually using the term is “We will have a smooth rather than abrupt transition to a regime of explosive growth.” I don’t have any suggestions for what terms we should use instead, only that we should probably come up with better ones.
- lc 19 Feb 2023 1:53 UTC
  6 points
  Parent
  Questions of how many trillions of dollars OpenAI will be allowed to generate by entities like the U.S. government are unimportant to the bigger concern, which is: Will these minds be useful in securing AIs before they become so powerful they’re dangerous? Given the pace of ML research and how close LLMS are apparently able to get to AGI without actually-being-AGI, as a person who was only introduced to this debate in like the last year or so, my answer is: seems like that’s what’s happening now? Yeah?
- Vladimir_Nesov 18 Feb 2023 19:18 UTC
  6 points
  Parent
  In my model, the AGI threshold is a phase change. Before that point, AI improves at human research speed; after that point it improves much faster (even as early AGI is just training skills and not doing AI research). Before AGI threshold, the impact on the world is limited to what sub-AGI AI can do, which depends on how much time is available before AGI for humans to specialize AI to particular applications. So with short AGI threshold timelines, there isn’t enough time for humans to make AI very impactful, but after AGI threshold is crossed, AI advancement accelerates much more than before that point. And possibly there is not enough time post-AGI to leave much of an economic impact either, before it gets into post-singularity territory (nanotech and massive compute; depending on how AGIs navigate transitive alignment, possibly still no superintelligence).
  
  I think this doesn’t fit into the takeoff speed dichotomy, because in this view the speed of the post-AGI phase isn’t observable during the pre-AGI phase.
  What links here?
  - Vladimir_Nesov's comment on AGI in sight: our look at the game board by Andrea_Miotti (20 Feb 2023 22:55 UTC; 4 points)