Isn’t it a bit suspicious that the thing-that’s-discontinuous is hard to measure, but the thing-that’s-continuous isn’t? I mean, this isn’t totally suspicious, because subjective experiences are often hard to pin down and explain using numbers and statistics. I can understand that, but the suspicion is still there.
I sympathize with this view, and I agree there is some element of truth to it that may point to a fundamental gap in our understanding (or at least in mine). But I’m not sure I entirely agree that discontinuous capabilities are necessarily hard to measure: for example, there are benchmarks available for things like arithmetic, which one can train on and make quantitative statements about.
I think the key to the discontinuity question is rather that 1) model scale increases in discrete jumps from one generation to the next; and 2) everything is S-curves, so even an apparent discontinuity has a roughly linear regime if you zoom in enough. Those two things together mean that, while a capability like arithmetic might improve continuously over some range of scales, in reality you can find yourself halfway up the performance curve in a single scaling jump (and this is in fact what happened with arithmetic and GPT-3). So the risk, as I understand it, is that you end up surprisingly far up the scale of “world-ending” capability from one generation to the next, with no detectable warning shot beforehand.
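To make the S-curve point concrete, here is a toy sketch in Python. Everything in it is an assumption for illustration only: the logistic shape, the `midpoint` and `steepness` values, and the list of model sizes are made up and are not fitted to GPT-3 or any real benchmark. It only shows how a curve that is smooth in log-scale can still produce a large jump when it is sampled one generation at a time.

```python
import math

def accuracy(log10_params, midpoint=10.5, steepness=4.0):
    """Hypothetical S-curve: accuracy as a logistic function of log10(parameter count).

    midpoint and steepness are illustrative values, not measurements.
    """
    return 1.0 / (1.0 + math.exp(-steepness * (log10_params - midpoint)))

# Hypothetical model generations, each 10x larger than the last (log10 of parameter count).
generations = [8, 9, 10, 11, 12]
scores = [accuracy(g) for g in generations]

# Change in accuracy from one generation to the next.
jumps = [later - earlier for earlier, later in zip(scores, scores[1:])]
largest_jump = max(jumps)

# The same curve sampled finely (steps of 0.01 in log10 scale) over the same range.
fine_scores = [accuracy(8 + i * 0.01) for i in range(401)]
fine_steps = [later - earlier for earlier, later in zip(fine_scores, fine_scores[1:])]
largest_fine_step = max(fine_steps)

if __name__ == "__main__":
    for g, s in zip(generations, scores):
        print(f"10^{g} params: accuracy {s:.3f}")
    print(f"largest generation-to-generation jump: {largest_jump:.3f}")
    print(f"largest step on the fine grid: {largest_fine_step:.4f}")
```

With these made-up parameters, most of the climb happens inside a single tenfold jump, even though no individual step on the fine grid is large. If you only ever observe the generation-level samples, the curve looks discontinuous.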
“No one predicted X in advance” is only damning to a theory if people who believed that theory were making predictions about it at all. If people who generally align with Paul Christiano were indeed making predictions to the effect of GPT-3 capabilities being impossible or very unlikely within a narrow future time window, then I agree that would be damning to Paul’s worldview. But—and maybe I missed something—I didn’t see that. Did you?
No, you’re right as far as I know; at least I’m not aware of any such attempted predictions. And in fact, the very absence of such prediction attempts is interesting in itself. One would imagine that correctly predicting the capabilities of an AI from its scale ought to be a phenomenally valuable skill — not just from a safety standpoint, but from an economic one too. So why, indeed, didn’t we see people make such predictions, or at least try to?
There could be several reasons. For example, perhaps Paul (and other folks who subscribe to the “continuum” world-model) could have made such predictions, but were unaware of how enormously valuable that ability was. That seems implausible, so let’s assume they knew the value of such predictions would be huge. But if you know the value of doing something is huge, why aren’t you doing it? Well, if you’re rational, there’s only one reason: you aren’t doing it because it’s too hard, or otherwise too expensive compared to your alternatives. So we are forced to conclude that this world-model, by its own implied self-assessment, has so far proved inadequate to generate predictions about the kinds of capabilities we really care about.
(Note: you could make the argument that OpenAI did make such a prediction, in the approximate yet very strong sense that they bet big on a meaningful increase in aggregate capabilities from scale, and won. You could also make the argument that Paul, having been at OpenAI during the critical period, deserves some credit for that decision. I’m not aware of Paul ever making this argument, but if made, it would be a point in favor of such a view and against my argument above.)
Yeah, these are interesting points.