When trying to fit an exponential curve, don’t weight all the points equally. Or, if you’re using Excel and just want the easy way, take the log of your values and fit a straight line to the logs.
Um… why?
Because the noise usually grows as the signal does. Consider Moore’s law for transistors per chip. Back when that number was about 10^4, the standard deviation was also small, say 10^3. Now that density is around 10^8, no two chips are going to be within a thousand transistors of each other; the standard deviation is much bigger (~10^7).
This means that if you’re trying to fit the curve, being off by 10^5 is a small mistake when predicting the current transistor count, but a huge mistake when predicting past counts. It’s not rare or implausible now to find a chip with 10^5 more transistors, but back in the ’70s that difference would have been a huge error, impossible under an accurate model of reality.
A basic fitting function, like ordinary least squares, doesn’t take this into account. It will trade off errors in present counts against errors in past counts as if the mistakes were of exactly equal importance. To do better, use something like a chi-squared method, where you explicitly weight each point by its variance. Or fit on a log scale using the simple method, which effectively assumes the noise is proportional to the signal.
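To make that concrete, here is a minimal sketch in Python. The data is synthetic (the ~10% noise level and the growth rate are assumptions chosen to mimic the proportional noise described above); the chi-squared-style weighting uses scipy’s curve_fit, which accepts per-point sigmas.

```python
import numpy as np
from scipy.optimize import curve_fit

def exp_model(t, a, k):
    """Simple exponential growth: a * exp(k * t)."""
    return a * np.exp(k * t)

# Synthetic Moore's-law-like data (illustrative numbers, not real chip
# counts): ~10^4 transistors at t = 0 growing to ~10^8 at t = 50, with
# noise proportional to the signal, as described above.
rng = np.random.default_rng(0)
t = np.arange(50.0)
truth = 1e4 * np.exp(0.184 * t)
y = truth * rng.normal(1.0, 0.1, t.size)  # ~10% multiplicative noise

# Plain least squares: every point weighted equally, so the huge recent
# values dominate and the early decades barely constrain the fit.
p_plain, _ = curve_fit(exp_model, t, y, p0=(1e4, 0.2))

# Chi-squared-style fit: pass per-point sigmas proportional to the
# signal, so each point counts according to its own noise level.
p_chi2, _ = curve_fit(exp_model, t, y, p0=(1e4, 0.2), sigma=0.1 * y)

# The "easy way": a straight line through log(y), which implicitly
# assumes the same proportional-noise model.
k_log, log_a = np.polyfit(t, np.log(y), 1)

print("plain:", p_plain)
print("chi-squared:", p_chi2)
print("log-space:", np.exp(log_a), k_log)
```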
That makes perfect sense. Thanks.
When trying to fit an exponential curve, don’t weight all the points equally

We didn’t. We fit a line in log space, but weighted the points by sqrt(y). We did that because the data doesn’t actually look linear in log space.
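For anyone who wants to reproduce this kind of fit, np.polyfit accepts per-point weights. The sketch below uses made-up data and assumes only the method as described (a line in log space, weights of sqrt(y)):

```python
import numpy as np

# Made-up data for illustration; the exponential rate and noise level
# are assumptions, not the thread's actual dataset.
rng = np.random.default_rng(1)
t = np.arange(50.0)
y = 1e4 * np.exp(0.184 * t) * rng.normal(1.0, 0.1, t.size)

# Unweighted: every log-residual counts equally.
k_unw, a_unw = np.polyfit(t, np.log(y), 1)

# Weighted by sqrt(y): np.polyfit minimizes sum((w * residual)**2), so
# the large-y (recent) points pull the line much harder.
k_wt, a_wt = np.polyfit(t, np.log(y), 1, w=np.sqrt(y))

print(k_unw, k_wt)
```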
This is what it looks like if we don’t weight them. If you want to bite the bullet of this being a better fit, we can bet about it.
Interesting, thanks. This “unweighted” (on a log scale) graph looks a lot more like what I’d expect to be a good fit for a single-exponential model.
Of course, if you don’t like how an exponential curve fits the data, you can always change models—in this case, probably to a curve with 1 more free parameter (indicating a degree of slowdown of the exponential growth) or 2 more free parameters (to have 2 different exponentials stitched together at a specific point in time).
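A sketch of what those two extensions might look like, fit in log space so the proportional-noise issue from upthread is handled. These exact parameterizations (log_slowing, log_stitched, and their parameters) are one plausible reading of the suggestion, not a prescription:

```python
import numpy as np
from scipy.optimize import curve_fit

def log_slowing(t, log_a, k, d):
    """One extra parameter d: a quadratic term in log space, so the
    instantaneous growth rate k - 2*d*t declines over time.
    d = 0 recovers a straight line (a pure exponential)."""
    return log_a + k * t - d * t**2

def log_stitched(t, log_a, k1, k2, t0):
    """Two extra parameters: slope k1 before the breakpoint t0 and k2
    after it, with the line kept continuous at t0."""
    return log_a + np.where(t < t0, k1 * t, k1 * t0 + k2 * (t - t0))

# Illustrative data whose growth genuinely slows halfway through.
rng = np.random.default_rng(2)
t = np.linspace(0.0, 50.0, 100)
log_y = log_stitched(t, np.log(1e4), 0.25, 0.12, 25.0) \
    + rng.normal(0.0, 0.1, t.size)

# Fitting log(y) with equal weights assumes proportional noise; the two
# models can then be compared by residuals, AIC, and so on.
p1, _ = curve_fit(log_slowing, t, log_y, p0=(9.0, 0.2, 0.01))
p2, _ = curve_fit(log_stitched, t, log_y, p0=(9.0, 0.2, 0.1, 20.0))
print(p1, p2)
```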
Oh that’s actually a pretty good idea. Might redo some analysis we built on top of this model using that.