The even bigger danger is that at some point you risk crossing from Goodhart’s Law problems to 2nd level Goodhart’s Law problems, where you are effectively optimizing a metric that was originally predicting who would be able to successfully game the original metric, and so on.
This feels maybe worthy of a post.
Maybe. I am torn between ‘this does seem like an important thing no one has noticed or at least no one has pointed out explicitly, that I understand and could explain to them’ and ‘but that’s the whole post, the rest is obvious, no?’
I think that, in the same way that part of the point of books (even ones that could be distilled into one blog post) is to give you more time to meditate and reflect on them, a blog post that could be one sentence is useful because it gives that sentence more hooks into the rest of your thinking.
Like, maybe just listing a few different real examples.
Anyone have good real-life examples of 2nd Level Goodhart to throw out there? (E.g. where you have T as a measure of U, which is supposed to be a measure of V, but you can’t measure U directly either, so you end up optimizing for T.) They can be cases where it works out mostly OK or cases where it works out totally not OK.
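Not a real-life example, but a toy simulation may help pin down what the T → U → V chain does even before anyone starts gaming anything. This is a minimal sketch under invented assumptions: V is a standard normal "true value", U is V plus independent noise, and T is U plus more independent noise. The noise levels, sample size, and the 5% cutoff are arbitrary illustrative choices, and this only models the noise-attenuation side of Goodhart, not adversarial gaming.

```python
import random

def pearson(xs, ys):
    """Pearson correlation, written out by hand to stay stdlib-only."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

def mean_v_of_top(scores, v, frac=0.05):
    """Average true value V among the top `frac` of items ranked by `scores`."""
    k = max(1, int(len(scores) * frac))
    top = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    return sum(v[i] for i in top) / k

rng = random.Random(0)  # fixed seed so the run is repeatable
n = 5000

# Hypothetical proxy chain: each step adds independent unit-variance noise.
v = [rng.gauss(0, 1) for _ in range(n)]   # what we actually care about
u = [x + rng.gauss(0, 1) for x in v]      # first-level proxy for V
t = [x + rng.gauss(0, 1) for x in u]      # second-level proxy (proxy for U)

corr_uv = pearson(u, v)
corr_tv = pearson(t, v)
top_by_v = mean_v_of_top(v, v)
top_by_u = mean_v_of_top(u, v)
top_by_t = mean_v_of_top(t, v)

print(f"corr(U, V) = {corr_uv:.2f}, corr(T, V) = {corr_tv:.2f}")
print(f"mean V of top 5% selected by V: {top_by_v:.2f}")
print(f"mean V of top 5% selected by U: {top_by_u:.2f}")
print(f"mean V of top 5% selected by T: {top_by_t:.2f}")
```

Under these assumptions, selecting hard on T buys less V than selecting on U, which in turn buys less than selecting on V directly. Each extra proxy layer loosens the link to the thing you wanted, and that is before anyone has an incentive to game T.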
State average altitude as a proxy for individual altitude, as a proxy for altitude of water source, as a proxy for water contamination. Jury is out on correctness.
https://slimemoldtimemold.com/2021/07/13/a-chemical-hunger-part-iii-environmental-contaminants/
(I’m broadly on board with the contaminant theory of obesity, but state is never the right level of data to look at, except for laws.)
County-level obesity datasets are mostly based on educated guesses that vary widely rather than actual measurements. I have found several of those datasets that correlate very poorly with one another. Variables such as median household income often correlate more strongly with obesity in some of those datasets than different obesity estimates correlate with each other.
See this Google Colab notebook for a few comparisons.
AFAICT, state-level obesity estimates are way more reliable. The estimates generated with BRFSS data seem to be based on large sample sizes in each state, which is something that we do not have for each individual county. So I think it makes sense to look at obesity at the state level.
That’s only true if people within states are more similar to each other on the relevant axes than to people in other states, right? If the real divide is rural/urban or education, then comparing states isn’t very useful even if some states are more rural or educated than others.
The fact that the county-level data is bad is unfortunate and makes the county-level analysis less useful, but doesn’t fix any of the problems with state-level data.
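To make the "real divide is rural/urban" worry concrete, here is a rough sketch with entirely made-up numbers. It assumes 50 hypothetical states of equal size, a rural share per state drawn between 10% and 50%, and an outcome that depends only on whether a person is rural, plus noise. None of this is fitted to BRFSS or any real dataset. It just shows how little of the variation can end up sitting between states when the driver lives inside them.

```python
import random

rng = random.Random(1)  # fixed seed so the run is repeatable
N_STATES, PEOPLE_PER_STATE = 50, 200  # arbitrary illustrative sizes

def mean(values):
    values = list(values)
    return sum(values) / len(values)

# Build hypothetical states: each person is (is_rural, outcome).
states = []
for _ in range(N_STATES):
    rural_share = rng.uniform(0.1, 0.5)  # assumed spread of rural share
    people = []
    for _ in range(PEOPLE_PER_STATE):
        rural = rng.random() < rural_share
        # Assumption: the outcome depends only on rural/urban, not on state.
        outcome = (1.0 if rural else 0.0) + rng.gauss(0, 0.5)
        people.append((rural, outcome))
    states.append(people)

everyone = [person for state in states for person in state]
grand_mean = mean(o for _, o in everyone)

# Total individual-level variance vs. variance of the state averages.
total_var = mean((o - grand_mean) ** 2 for _, o in everyone)
state_means = [mean(o for _, o in state) for state in states]
between_var = mean((m - grand_mean) ** 2 for m in state_means)
between_share = between_var / total_var

# The gap a state-level comparison cannot see directly.
rural_gap = mean(o for r, o in everyone if r) - mean(o for r, o in everyone if not r)

print(f"share of variance between states: {between_share:.1%}")
print(f"rural minus urban mean outcome:   {rural_gap:.2f}")
```

In this toy setup the state averages do differ, since some states are more rural than others, but only a small slice of the total variance is between states while the rural/urban gap is large. Bigger state-level samples make the state averages more precise without changing that.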