Is your probability distribution just concentrated with 100% of its mass between $1 trillion and $10 trillion? (And if so: why?)
To specifically answer the question in the parenthetical (without commenting on the dollar numbers; I don’t currently have an intuition that strongly maps [the thing I’m about to discuss] to dollar amounts, so although I do think the numbers you give are in the right ballpark, I reserve the right to reconsider as further discussion and/or development occurs):
The reason someone might concentrate their probability mass at or within a certain impact range is that they believe it makes conceptual sense to divide cognitive work into two (or more) distinct categories, one of which is much weaker in the impact it can have. Then the question of how this division affects one’s probability distribution is determined almost entirely by the level at which they think the impact of the weaker category will saturate. And that question, in turn, has a lot more to do with the concrete properties they expect (or don’t expect) to see from the weaker cognition type than it has to do with dollar quantities directly. You can translate the former into the latter, but only via an additional series of calculations and assumptions; the actual object-level model—which is where any update would occur—contains no gear directly corresponding to “dollar value of impact”.
So when this kind of model encounters LLMs doing unusual and exciting things that score very highly on metrics like revenue, investment, and overall “buzz”… well, these metrics don’t directly lead the model to update. What the model instead considers relevant is whether, when you look at the LLM’s output, that output seems to exhibit properties of cognition that are strongly prohibited by the model’s existing expectations about weak versus strong cognitive work—and if it doesn’t, then the model simply doesn’t update; it wasn’t, in fact, surprised by the level of cognition it observed—even if (perhaps) the larger model embedding it, which does track things like how the automation of certain tasks might translate into revenue/profit, was surprised.
And in fact, I do think this is what we observe from Eliezer (and from like-minded folk): he’s updated in the sense of becoming less certain about how much economic value can be generated by “weak” cognition (although one should also note that he’s never claimed to be particularly certain about this metric to begin with); meanwhile, he has not updated about the existence of a conceptual divide between “weak” and “strong” cognition, because the evidence he’s been presented with legitimately does not have much to say on that topic. In other words, I think he would say that the statement
I think we are getting significant evidence about the plausibility that deep learning is able to automate real human cognitive work
is true, but that its relevance to his model is limited, because “real human cognitive work” is a category spanning (loosely speaking) both “cognitive work that scales into generality” and “cognitive work that doesn’t”, and that by agglomerating them into a single category, you’re throwing away a key component of his model.
Incidentally, I want to make one thing clear: this does not mean I’m saying the rise of the Transformers provides no evidence at all in favor of [a model that assigns a more direct correspondence between cognitive work and impact, and postulates a smooth conversion from the former to the latter]. That model concentrates more probability mass in advance on the observations we’ve seen, and hence does receive Bayes credit for its predictions. However, I would argue that the updates in favor of this model are not particularly extreme, because the model against which it’s competing didn’t actually strongly prohibit the observations in question; it merely assigned less probability to them (and not hugely less, since “slow takeoff” models don’t generally attempt to concentrate probability mass to an extreme degree, either)!
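To make the bookkeeping explicit (reading this as a standard odds-form Bayes update, which is my gloss on the informal argument above rather than anything you’ve committed to):

$$\text{posterior odds} = \text{prior odds} \times \frac{P(\text{observations} \mid \text{smooth-conversion model})}{P(\text{observations} \mid \text{divided-cognition model})},$$

and the claim is just that this likelihood ratio, while greater than 1, isn’t all that large.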
All of which is to say, I suppose, that I don’t really disagree with the numerical likelihoods you give here:
I think that we have gotten considerable evidence about this, more than a factor of 4. I’ve personally updated my views by about a factor of 2, from a 25% chance to a 50% chance that scaling up deep learning is the real deal and leads to transformation soon.
but that I’m confused that you consider this “considerable”, and would write up a comment chastising Eliezer and the other “fast takeoff” folk because they… weren’t hugely moved by, like, ~2 bits’ worth of evidence? Like, I don’t see why he couldn’t just reply, “Sure, I updated by around 2 bits, which means that now I’ve gone from holding fast takeoff as my dominant hypothesis to holding fast takeoff as my dominant hypothesis.” And it seems like that degree of update would basically produce the kind of external behavior that might look like “not owning up” to evidence, because, well… it’s not a huge update to begin with?
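(For concreteness, and assuming “bits” here means the base-2 log of the odds ratio, which is an interpretation I’m supplying:

$$\log_2 4 = 2 \text{ bits}, \qquad \frac{0.5/0.5}{0.25/0.75} = 3 \;\Rightarrow\; \log_2 3 \approx 1.6 \text{ bits},$$

so the “more than a factor of 4” is the ~2 bits I’m pointing at, and the stated 25% → 50% move is itself worth about a bit and a half.)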
(And to be clear, this does require that his prior look quite different from yours. But that’s already been amply established, I think, and while you can criticize his prior for being overconfident—and I actually find myself quite sympathetic to that line of argument—criticizing him for failing to properly update given that prior is, I think, a false charge.)
Yes, I’m saying that with each $ increment the “qualitative division” model fares worse and worse. I think that people who hold onto this qualitative division have generally been qualitatively surprised by the accomplishments of LMs, that when they make concrete forecasts those forecasts have mismatched reality, and that they should be updating strongly about whether such a division is real.
What the model instead considers relevant is whether, when you look at the LLM’s output, that output seems to exhibit properties of cognition that are strongly prohibited by the model’s existing expectations about weak versus strong cognitive work—and if it doesn’t, then the model simply doesn’t update; it wasn’t, in fact, surprised by the level of cognition it observed—even if (perhaps) the larger model embedding it, which does track things like how the automation of certain tasks might translate into revenue/profit, was surprised.
I’m most of all wondering how you get a high level of confidence in the distinction and its relevance. I’ve seen only really vague discussion. The view that LM cognition doesn’t scale into generality seems wacky to me. I want to see a description of the tasks it can’t do.
In general, if someone won’t state any predictions of their view, I’m just going to update about their view based on my understanding of what it predicts (which is, after all, what I’d ultimately be doing if I took the view seriously). I’ll also try to update about the view as operated by them, so e.g. if they were generally showing a good predictive track record or achieving things in the world, then I would be happy to acknowledge there is probably some good view there that I can’t understand.
I’m confused that you consider this “considerable”, and would write up a comment chastising Eliezer and the other “fast takeoff” folk because they… weren’t hugely moved by, like, ~2 bits’ worth of evidence? Like, I don’t see why he couldn’t just reply, “Sure, I updated by around 2 bits, which means that now I’ve gone from holding fast takeoff as my dominant hypothesis to holding fast takeoff as my dominant hypothesis.”
I do think that a factor of two is significant evidence. In practice in my experience that’s about as much evidence as you normally get between realistic alternative perspectives in messy domains. The kind of forecasting approach that puts 99.9% probability on things and so doesn’t move until it gets 10 bits is just not something that works in practice.
On the flip side, it’s enough evidence that Eliezer is endlessly condescending about it (e.g. about those who only assigned a 50% probability to the covid response being as inept as it was). Which I think is fine (but annoying); a factor of 2 is real evidence. And if I went around saying “Maybe our response to AI will be great” and then just replied to this observation with “whatever, covid isn’t the kind of thing I’m talking about” without giving some kind of more precise model that distinguishes the two, then you would be right to chastise me.
Perhaps more importantly, I just don’t know where someone with this view would give ground. Even if you think any given factor of two isn’t a big deal, ten factors of two is what gets you from 99.9% to 50%. So you can’t just go around ignoring a couple of them every few years!
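Spelling that arithmetic out, treating each “factor of two” as a 2:1 likelihood ratio applied to the odds:

$$99.9\% \;\leftrightarrow\; 999\!:\!1 \text{ odds}, \qquad \frac{999}{2^{10}} = \frac{999}{1024} \approx 0.98\!:\!1 \text{ odds} \approx 49\%,$$

so ten such updates really do take you from near-certainty to roughly a coin flip.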
And rhetorically, I’m not complaining about people ultimately thinking fast takeoff is more plausible. I’m complaining about not expressing the view in such a way that we can learn about it based on what appears to me to be multiple bits of evidence, or not acknowledging that evidence. This isn’t the only evidence we’ve gotten; I’m generally happy to acknowledge the many bits’ worth of ways in which my views have moved towards other people’s.
So one claim is that a theory of post-AGI effects often won’t say things about pre-AGI AI, and so mostly doesn’t get updated by pre-AGI observations. My take on LLM alignment asks us to distinguish human-like LLM AGIs from stronger AGIs (or weirder LLMs), with theories of stronger AGIs not naturally characterizing issues with human-like LLMs. Like, they aren’t concerned with optimizing for LLM superstimuli while LLM behavior remains in the human imitation regime, where caring about LLM-specific things didn’t have a chance to gain influence. When the mostly faithful imitation nature of LLMs breaks with enough AI tinkering, the way human nature is breaking now towards the influence of AGIs, we get another phase change to stronger AGIs.
This seems like a pattern: theories of extremal later phases are bounded within their scopes, saying little about the preceding phases that transition into them. If the phase-transition boundaries get muddled in thinking about this, we get misleading impressions about how the earlier phases work, even though navigating them is instrumental for managing transitions into the much more concerning later phases.