Thanks for writing this! This is definitely an important skill and it doesn’t seem like there was such a post on LW already.
Some mild theoretical justification: one reason to expect this procedure to be reliable, especially if you break up an estimate into many pieces and multiply them, is that you expect the errors in your pieces to be more or less independent. That means they’ll often more or less cancel out once you multiply them (e.g. one piece might be 4 times too large but another might be 5 times too small). More precisely, you can compute the variance of the logarithm of the final estimate and, as the number of pieces gets large, it will shrink compared to the expected value of the logarithm (and even more precisely, you can use something like Hoeffding’s inequality).
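A quick numerical sketch of that cancellation (Python; the per-piece error size is a made-up assumption, "each piece typically off by about 2x"): the log-error of the product is the sum of the per-piece log-errors, so with independent errors the typical total error grows much more slowly than the worst case where every error points the same way.

```python
import numpy as np

rng = np.random.default_rng(0)

# Each of k pieces is off by an independent lognormal factor,
# typically off by about 2x (sigma = ln 2).  The log-error of the
# final product is the sum of the per-piece log-errors.
sigma = np.log(2)
for k in (1, 4, 16):
    log_errors = rng.normal(0.0, sigma, size=(100_000, k)).sum(axis=1)
    typical = np.exp(np.sqrt(np.mean(log_errors ** 2)))  # RMS multiplicative error
    worst = np.exp(k * sigma)                             # all errors in the same direction
    print(f"k={k:2d}  typical error ~{typical:4.1f}x   worst case ~{worst:,.0f}x")
```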
Another mild justification is the notion of entangled truths. A lot of truths are entangled with the truth that there are about 300 million Americans and so on, so as long as you know a few relevant true facts about the world your estimates can’t be too far off (unless the model you put those facts into is bad).
> More precisely, you can compute the variance of the logarithm of the final estimate and, as the number of pieces gets large, it will shrink compared to the expected value of the logarithm (and even more precisely, you can use something like Hoeffding’s inequality).
If success of a Fermi estimate is defined as “within a factor of 10 of the correct answer”, then that’s a constant bound on the allowed error of the logarithm. No “compared to the expected value of the logarithm” involved. Besides, I wouldn’t expect the value of the logarithm to grow with the number of pieces either: the log of an individual piece can be negative, and the true answer doesn’t get bigger just because you split the problem into more pieces.
So, under the assumption of independent errors, either Hoeffding’s inequality or the central limit theorem, applied to the error of the result, says that you’re better off using as few inputs as possible. The reason Fermi estimates even involve more than one step is that you can make the per-step error smaller by choosing pieces you’re somewhat confident of.
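To put rough numbers on that (Python; the per-piece error bounds are made-up assumptions): Hoeffding’s bound on the probability of missing the factor-of-10 target gets worse as you add pieces with a fixed per-piece error, and only improves if splitting the problem lets you use pieces you know more tightly.

```python
import math

def hoeffding_failure_bound(k, c):
    """Hoeffding bound on P(|sum of k log10-errors| > 1), i.e. on missing the
    'within a factor of 10' target, when each piece's log10-error is
    independent, mean zero, and bounded by +/- c (piece within 10**c of truth)."""
    return min(1.0, 2 * math.exp(-2 * 1.0 ** 2 / (k * (2 * c) ** 2)))

# Fixed per-piece error: the bound degrades as you add pieces.
for k in (1, 2, 4, 8):
    print(f"k={k}, c=0.3 (each piece within ~2x):  bound {hoeffding_failure_bound(k, 0.3):.2f}")

# But if splitting lets you use pieces you know more precisely, it can pay off:
print(f"k=1, c=0.7 (one rough guess, ~5x):      bound {hoeffding_failure_bound(1, 0.7):.2f}")
print(f"k=4, c=0.2 (four pieces, each ~1.6x):   bound {hoeffding_failure_bound(4, 0.2):.2f}")
```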
Oops, you’re absolutely right. Thanks for the correction!