Your graph shows “a small increase” that actually represents progress equal to advancing a third to a half of the time remaining until catastrophe on the default trajectory. That’s not small! That’s as much progress as everyone else combined achieves in a third of the time until catastrophic models! It feels like you’d have to discover some new, more efficient training method that gets GPT-3 levels of performance with GPT-2 levels of compute to have an effect plausibly that large.
In general I wish you would actually write down equations for your model and plug in actual numbers; I think it would be way more obvious that things like this are not actually reasonable models of what’s going on.
I’m not sure I get your point here: the point of the graph is to just illustrate that when effects compound, looking only at the short term difference is misleading. Short term differences lead to much larger long term effects due to compounding. The graph was just a quick digital rendition of what I previously drew on a whiteboard to illustrate the concept, and is meant to be an intuition pump.
The model is not implying any more sophisticated complex mathematical insight than just “remember that compounding effects exist”, and the graph is just for illustrative purposes.
Of course, if you had perfect information at the bottom of the curve, you would see that the effect of your “small” intervention is actually quite big: but that’s precisely the point of the post, it’s very hard to see this normally! We don’t have perfect information, and the post aims to make salient that what people perceive as a “small” action in the present moment will likely lead to a “big” impact later on.
To illustrate the point: if you make a discovery now worth an extra 2 billion dollars of investment in AI capabilities, and this compounds yearly at a 20% rate, you’ll get far more than +2 billion in the final total, e.g., 10 years later. If you make this 2 billion dollar discovery later instead, then after ten years you will not have as much money invested in capabilities as you would have in the other case!
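The arithmetic behind that example can be sketched in a few lines (the $2B figure and 20% rate are the hypothetical numbers from above, not real data):

```python
# Hypothetical illustration: a $2B boost to capabilities investment
# made now and compounding at 20%/year, vs. the same boost made at
# the end of the 10-year window.
boost = 2.0   # billions of dollars (hypothetical)
rate = 0.20   # yearly compounding rate (hypothetical)
years = 10

# Boost made now: it compounds for the full window.
compounded = boost * (1 + rate) ** years

# Boost made at year 10: no time left to compound.
late = boost

print(f"boost now, after {years}y: +{compounded:.1f}B")  # ≈ +12.4B
print(f"boost at year {years}:       +{late:.1f}B")      # just +2.0B
```

So on these (made-up) numbers, the early discovery ends up worth roughly six times the late one, which is the compounding gap the post is pointing at.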
Such effects might be obvious in retrospect with perfect information, but this is indeed the point of the post: when evaluating actions in our present moment it’s quite hard to foresee these things, and the post aims to raise these effects to salience!
We could spend time on more graphs, equations and numbers, but that wouldn’t be a great marginal use of our time. Feel free to spend more time on this if you find it worthwhile (it’s a pretty hard task, since no one has a sufficiently gears-level model of progress!).
I continue to think that if your model is that capabilities follow an exponential (i.e. dC/dt = kC), then there is nothing to be gained by thinking about compounding. You just estimate how much time it would have taken for the rest of the field to make an equal amount of capabilities progress now. That’s the amount you shortened timelines by; there’s no change from compounding effects.
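One way to check this claim numerically (all constants here are hypothetical, chosen only for illustration): under a pure exponential dC/dt = kC, a one-off multiplicative boost b to capabilities is equivalent to shifting the whole curve forward in time by ln(b)/k, and that shift is the same no matter when the boost lands.

```python
import math

k = 0.5      # growth rate in dC/dt = k*C (hypothetical)
boost = 1.3  # one-off multiplicative jump in capabilities (hypothetical)
C0 = 1.0     # initial capability level (hypothetical)

# Under C(t) = C0 * exp(k*t), multiplying C by `boost` at any time t0
# is identical to advancing the clock by dt = ln(boost) / k.
dt = math.log(boost) / k

t_final = 8.0
for t0 in [0.0, 2.0, 5.0]:
    # Capability at t_final if the boost is applied at time t0.
    boosted = C0 * math.exp(k * t0) * boost * math.exp(k * (t_final - t0))
    # Unboosted curve evaluated dt later.
    shifted = C0 * math.exp(k * (t_final + dt))
    assert abs(boosted - shifted) < 1e-9

print(f"timelines shortened by {dt:.3f} years, regardless of when the boost lands")
```

This is just the usual memorylessness of exponentials: the timeline shortening equals the boost's size converted into time, with no extra term from compounding, which is the point being made here.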
if you make a discovery now worth 2 billion dollars [...] If you make this 2 billion dollar discovery later
Two responses:
Why are you measuring value in dollars? That is both (a) a weird metric to use and (b) not the one you had on your graph.
Why does the discovery have the same value now vs later?