I saw a presentation covering a bunch of this back in February, and the graphs I found most informative were those showing the training FLOP distributions before updating against already-achievable levels. There is one graph along these lines on page 13 of Part 1 in the Google Docs, but it doesn’t show the combined distribution without the update against already-achievable FLOP.
Am I correct in remembering that the combined distribution before that update was distinctly bimodal? That was one of my main takeaways from the presentation, and I want to make sure I’m remembering it correctly.
Hi John, I think I remember that presentation. The graph there was quite bimodal because the Lifetime Anchor I was using at the time simply assumed ~1x human lifetime levels of computation. In the current model, I’m assuming ~1000x human lifetime levels of computation, because ~1x seemed like a much less likely version of that anchor. The code in the quantitative model will let you see the untruncated version of the distribution; it looks a lot smoother now, though there is still a modest bump.
Also, apologies for such a late reply; I don’t get email notifications for comments and haven’t been checking regularly!
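To illustrate why moving one anchor from ~1x to ~1000x would smooth the combined distribution, here is a toy sketch, not the quantitative model itself. It treats the combined distribution as a two-component Gaussian mixture over log10 training FLOP. Every weight, mean, and spread below is a made-up placeholder chosen only for illustration; the only thing taken from the discussion above is that a 1000x change shifts the lower component by 3 units on a log10 axis.

```python
import math

def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2 * math.pi))

def mixture_pdf(x, components):
    """Weighted sum of normal densities; components are (weight, mean, sd)."""
    return sum(w * normal_pdf(x, m, sd) for w, m, sd in components)

def valley_to_peak_ratio(components, lo=15.0, hi=50.0, steps=3500):
    """Lowest density between the two peaks, divided by the smaller peak.

    Values near 0 mean a sharply bimodal curve; values near 1 mean the
    lower peak is only a modest bump. Returns 1.0 if the curve is unimodal.
    """
    xs = [lo + (hi - lo) * i / steps for i in range(steps + 1)]
    ys = [mixture_pdf(x, components) for x in xs]
    peaks = [i for i in range(1, steps) if ys[i - 1] < ys[i] > ys[i + 1]]
    if len(peaks) < 2:
        return 1.0
    first, last = peaks[0], peaks[-1]
    valley = min(ys[first:last + 1])
    return valley / min(ys[first], ys[last])

# Hypothetical placeholders, NOT values from the actual model:
# (weight, mean in log10 FLOP, standard deviation in log10 FLOP)
OTHER_ANCHORS = (0.8, 34.0, 3.0)
LIFETIME_1X = (0.2, 24.0, 1.5)
# 1000x more computation is a shift of log10(1000) = 3 on this axis.
LIFETIME_1000X = (0.2, 24.0 + math.log10(1000), 1.5)

ratio_1x = valley_to_peak_ratio([LIFETIME_1X, OTHER_ANCHORS])
ratio_1000x = valley_to_peak_ratio([LIFETIME_1000X, OTHER_ANCHORS])
print(f"valley/peak with ~1x anchor:    {ratio_1x:.2f}")
print(f"valley/peak with ~1000x anchor: {ratio_1000x:.2f}")
```

With these placeholder numbers, the ~1000x version has a much shallower valley between the two peaks, while the lower peak remains a separate bump. The real distributions are not Gaussian mixtures with these parameters, so this only shows the qualitative effect.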
Thanks, that makes sense.