I usually don’t use paper or a spreadsheet for Fermi estimates; that would make them too expensive. Also, my Fermi estimates tend to overlap heavily with big-O estimates.
When programming, I tend to keep a big-O/Fermi estimate for the runtime and memory usage in the back of my head. The big-O part of it is usually just “linear-ish” (for most standard data structure operations and loops over nested data structures), “quadratic” (for looping over pairs), “cubic-ish” (matrix operations), or “exponential” (in which case I usually won’t bother doing it at all). The Fermi part of it is then, roughly: how big a data structure can I run this on while still getting reasonable runtime? Assume ~1B ops per second, so for linear-ish I can use a data structure with ~1B entries, for quadratic ~30k entries, for cubic-ish ~1k entries, for exponential ~30 entries.
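As a concrete version of that back-of-the-envelope table, here’s a minimal sketch, assuming ~1e9 simple operations per second and a budget of about one second (both numbers are just the rough defaults from above):

```python
import math

# Rough Fermi defaults from above: ~1e9 simple ops/second, ~1 second budget.
OPS_PER_SECOND = 1e9
BUDGET_SECONDS = 1.0
budget_ops = OPS_PER_SECOND * BUDGET_SECONDS

# Largest input size n that fits the budget for each complexity class.
feasible_n = {
    "linear-ish (n)":    budget_ops,               # ~1B entries
    "quadratic (n^2)":   budget_ops ** 0.5,         # ~30k entries
    "cubic-ish (n^3)":   budget_ops ** (1 / 3),     # ~1k entries
    "exponential (2^n)": math.log2(budget_ops),     # ~30 entries
}

for shape, n in feasible_n.items():
    print(f"{shape:<20} n ≈ {n:,.0f}")
```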
This obviously steers algorithm/design choice, but more importantly it steers debugging. If I’m doing a loop which should be linear-ish over a data structure with ~1M elements, and it’s taking more than a second, then something is wrong. Examples where this comes up:
scikit implementations of ML algorithms—twice I found that they were using quadratic algorithms for things which should have been linear. Eventually I gave up on scikit, since it was so consistently terrible.
SQL queries in large codebases. Often, some column needs an index, or the query optimizer fails to use an existing index for a complicated query, turning queries which should be linear into quadratic ones. In my experience, this is one of the most common causes of performance problems in day-to-day software engineering (see the sketch after these examples).
Aside from programming, it’s also useful when using other people’s software. If the software is taking a visible amount of time to do something which I know should be linear, then the software is buggy, and I should maybe look for a substitute or a setting which can fix the problem.
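To make the SQL example concrete, here’s a minimal sketch using sqlite3 from the Python standard library. The table, columns, and row counts are made up for illustration; the point is just that N lookups over M rows is quadratic-ish work without an index and linear-ish with one:

```python
import sqlite3
import time

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)")
conn.executemany(
    "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
    [(i % 10_000, float(i)) for i in range(200_000)],
)

def timed_lookups(n=200):
    """Time n aggregate lookups keyed on customer_id."""
    start = time.perf_counter()
    for cid in range(n):
        conn.execute(
            "SELECT SUM(total) FROM orders WHERE customer_id = ?", (cid,)
        ).fetchone()
    return time.perf_counter() - start

# Without an index, every lookup scans all 200k rows: quadratic-ish.
print("no index:  ", timed_lookups())

# With an index, each lookup is ~log(rows): the same workload is linear-ish.
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
print("with index:", timed_lookups())
```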
I also do a lot of Fermi estimates when researching a topic or making a model. Often these estimates calculate what a physicist would call “dimensionless quantities”—we take some number, and express it in terms of some related number with the same units. For instance:
If I’m reading about government expenditures or taxes, I usually want it as a fraction of GDP.
When looking at results from a linear regression, the coefficients aren’t very informative, but the correlation is. It’s essentially a dimensionless regression coefficient (for a single regressor, the correlation is just the slope times sd(x)/sd(y)), and gives a good idea of effect size.
Biological examples (the bionumbers book is great for this sort of thing):
When thinking about reaction rates or turnover of proteins/cells, it’s useful to calculate a half-life (sketched below). This is the rough timescale on which the reaction/cell count will equilibrate. (And when there are many steps in a pathway, the slowest half-life typically controls the timescale for the whole pathway, so this helps us narrow in on the most important part.)
When thinking about sizes or distances in a cell, it’s useful to compare them to the size of a typical cell.
When thinking about concentrations, it’s useful to calculate number of molecules per cell. In general, there’s noise of order sqrt(molecule count), which is a large fraction of the total count when the count is low.
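A rough sketch of the half-life and molecules-per-cell arithmetic above; the rate constant and concentration are made-up illustrative values, and the ~1 femtoliter volume is a typical bacterial cell size:

```python
import math

# Half-life of a first-order process: t_half = ln(2) / k.
k_per_hour = 0.1                    # hypothetical degradation rate constant
half_life_hours = math.log(2) / k_per_hour
print(f"half-life ≈ {half_life_hours:.1f} hours")

# Molecules per cell: concentration * Avogadro's number * cell volume.
AVOGADRO = 6.022e23
concentration_molar = 1e-9          # 1 nM, hypothetical
cell_volume_liters = 1e-15          # ~1 fL, typical bacterium
molecules = concentration_molar * AVOGADRO * cell_volume_liters

# Poisson-style counting noise: relative fluctuation ~ 1/sqrt(count).
relative_noise = 1 / math.sqrt(molecules)
print(f"≈ {molecules:.1f} molecules per cell, relative noise ≈ {relative_noise:.0%}")
```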
On the moon, you can get to orbit by building a maglev track and just accelerating up to orbital speed. How long does the track need to be, assuming we limit the acceleration (to avoid pancaking any passengers)? Turns out, if we limit the acceleration to n times the moon’s surface gravity, then the track needs to be about 1/(2n) times the radius of the moon. That’s the sort of clean intuitive result we hope for from dimensionless quantities.
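A quick numerical check of that estimate, using standard values for the moon’s radius and surface gravity; n is the chosen acceleration limit:

```python
import math

R_MOON = 1.737e6   # radius of the moon, meters
G_MOON = 1.62      # surface gravity of the moon, m/s^2

def track_length(n):
    """Track length needed to reach orbital speed at acceleration n * G_MOON."""
    v_orbit = math.sqrt(G_MOON * R_MOON)      # v^2 / R = g  =>  v = sqrt(g R)
    return v_orbit ** 2 / (2 * n * G_MOON)    # v^2 = 2 a d  =>  d = v^2 / (2a) = R / (2n)

for n in (1, 3, 10):
    print(f"n = {n:2d}: track ≈ {track_length(n) / 1000:,.0f} km "
          f"(= R/(2n) ≈ {R_MOON / (2 * n) / 1000:,.0f} km)")
```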
In general, the trigger for these is something like “see a quantity for which you have no intuition/poor intuition”, and the action is “express it relative to some characteristic parameter of the system”.