I’m tangentially reminded of professional modeler & health economist froolow’s refactoring of GiveWell’s cost-effectiveness models in his post “A critical review of GiveWell’s 2022 cost-effectiveness model” (sections 3 and 4), which I think of as complementary to your post in that it teaches, via case study, how to level up your spreadsheet modeling.
Here’s GiveWell’s model architecture:
And here’s froolow’s refactoring:
The difference in micro-level architecture is also quite large:
As someone who’s spent a lot of his (short) career building dashboards and models in Google Sheets, and having seen GiveWell’s CEAs, I empathized with froolow’s remarks here:
After the issue of uncertainty analysis, I’d say the model architecture is the second biggest issue I have with the GiveWell model, and really the closest thing to a genuine ‘error’ rather than a conceptual step which could be improved. Model architecture is how different elements of your model interact with each other, and how they are laid out to a user.
It is fairly clear that the GiveWell team are not professional modellers, in the same way it would be obvious to a professional programmer that I am not a coder (this will be obvious as soon as you check the code in my Refactored model!). That is to say, there’s a lot of wasted effort in the GiveWell model which is typical when intelligent people are concentrating on making something functional rather than using slick technique. A very common manifestation of the ‘intelligent people thinking very hard about things’ school of model design is extremely cramped and confusing model architecture. This is because you have to be a straight up genius to try and design a model as complex as the GiveWell model without using modern model planning methods, and people at that level of genius don’t need crutches the rest of us rely on like clear and straightforward model layout. However, bad architecture is technical debt that you are eventually going to have to service on your model; when you hand it over to a new member of staff it takes longer to get that member of staff up to speed and increases the probability of someone making an error when they update the model.
Thanks, I found this interesting! I remember reading that piece by Froolow but I didn’t realize the refactoring was such a big part of it (and that the GiveWell CEA was formatted in such a dense way, wow).
This resonates a lot with my experience auditing sprawling, messy Excel models back in my last job (my god are there so many shitty Excel models in the world writ large).
FWIW if I were building a model this complex, I’d personally pop it into Squiggle / Squiggle Hub, if only because at that point, properly multiplying probabilities together and keeping track of my confidence intervals starts to really matter to me :)
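To make the point about multiplying uncertain quantities concrete, here is a minimal Python sketch (not Squiggle, and not anything from froolow's or GiveWell's models). It assumes each input is lognormal and is specified by a 90% interval; the two example intervals and the helper names `lognormal_params`, `simulate_product`, and `percentile` are made up for illustration.

```python
import math
import random

# Standard normal quantile at 95%, so (low, high) is read as a 90% interval.
Z_90 = 1.6448536269514722


def lognormal_params(low, high):
    """Return (mu, sigma) of a lognormal whose 5th/95th percentiles are low/high."""
    mu = (math.log(low) + math.log(high)) / 2
    sigma = (math.log(high) - math.log(low)) / (2 * Z_90)
    return mu, sigma


def simulate_product(intervals, n_samples=100_000, seed=0):
    """Monte Carlo samples of the product of independent lognormal inputs."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    params = [lognormal_params(lo, hi) for lo, hi in intervals]
    samples = []
    for _ in range(n_samples):
        value = 1.0
        for mu, sigma in params:
            value *= rng.lognormvariate(mu, sigma)
        samples.append(value)
    samples.sort()
    return samples


def percentile(sorted_samples, q):
    """Simple empirical percentile from an already-sorted list."""
    index = min(len(sorted_samples) - 1, int(q * len(sorted_samples)))
    return sorted_samples[index]


# Hypothetical inputs: two uncertain factors, each "somewhere between 1 and 10".
intervals = [(1, 10), (1, 10)]
samples = simulate_product(intervals)
low = percentile(samples, 0.05)
high = percentile(samples, 0.95)

# Naively multiplying the interval endpoints overstates the spread.
naive_low = math.prod(lo for lo, _ in intervals)
naive_high = math.prod(hi for _, hi in intervals)

print(f"simulated 90% interval: {low:.2f} to {high:.2f}")
print(f"naive endpoint product: {naive_low} to {naive_high}")
```

Under these assumptions the simulated 90% interval comes out noticeably narrower than the naive "1 to 100" you would get by multiplying the endpoints in a spreadsheet, which is the kind of bookkeeping that gets painful to do by hand once a model has dozens of inputs.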
Great post, especially the companion piece :)