I’m planning to write a post called “Heavy-tailed error implies hackable proxy”. The idea is that when you care about V and are optimizing for a proxy U=V+X, Goodhart’s Law sometimes implies that optimizing hard enough for U causes V to stop increasing.
A large part of the post would be proofs about what the distributions of $X$ and $V$ must be for $\lim_{t\to\infty} E[V \mid V+X>t] = 0$, where $X$ and $V$ are independent random variables with mean zero. It’s clear that
- $X$ must be heavy-tailed (or long-tailed or something)
- $X$ must have heavier tails than $V$
The proof seems messy though; Drake Thomas and I have spent ~5 person-days on it and we’re not quite done. Before I spend another few days proving this, is it a standard result in statistics? I looked through a textbook and none of the results were exactly what I wanted.
Note that a couple of people have already looked at it for ~5 minutes and found it non-obvious, but I suspect it might be a known result anyway on priors.
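To make the claim concrete, here is a quick Monte Carlo sketch (the distributions, sample size, and thresholds are illustrative choices, not part of the proofs): with a Gaussian error the conditional mean of $V$ keeps growing as the threshold $t$ increases, while with a heavy-tailed error such as Student-$t(2)$ it falls back toward zero.

```python
# Quick Monte Carlo sketch of E[V | V + X > t] for light- vs heavy-tailed error X.
# The distributions, sample size, and thresholds here are illustrative choices.
import numpy as np

rng = np.random.default_rng(0)
n = 2_000_000

V = rng.normal(0.0, 1.0, size=n)            # value V: standard normal, mean zero
X_light = rng.normal(0.0, 1.0, size=n)      # error X: standard normal (light-tailed)
X_heavy = rng.standard_t(df=2, size=n)      # error X: Student-t(2), heavy-tailed, mean zero

for name, X in [("light-tailed X", X_light), ("heavy-tailed X", X_heavy)]:
    for t in [0, 2, 4, 6, 8]:
        selected = V[V + X > t]
        if selected.size > 100:              # skip thresholds with too few surviving samples
            print(f"{name}, t={t}: E[V | V+X>t] ~ {selected.mean():.3f}")
```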
Doesn’t answer your question, but we also came across this effect in the RM Goodharting work, though instead of figuring out the details we only proved that when it’s definitely not heavy-tailed it’s monotonic, for Regressional Goodhart (https://arxiv.org/pdf/2210.10760.pdf#page=17). Jacob probably has more detailed takes on this than me.
In any event, my intuition is that this seems unlikely to be the main reason for overoptimization; I think it’s much more likely that it’s Extremal Goodhart or some other effect where the noise is not independent.
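For what it’s worth, the jointly Gaussian special case of that monotonicity claim is easy to write down (a sketch of the simplest case, not the general argument from the paper): if $V \sim N(0, \sigma_V^2)$ and $X \sim N(0, \sigma_X^2)$ are independent and $S = V + X$, then $(V, S)$ is jointly Gaussian with $\mathrm{Cov}(V, S) = \sigma_V^2$, so

$$E[V \mid S = s] = \frac{\sigma_V^2}{\sigma_V^2 + \sigma_X^2}\, s
\quad\Longrightarrow\quad
E[V \mid S > t] = \frac{\sigma_V^2}{\sigma_V^2 + \sigma_X^2}\, E[S \mid S > t],$$

which is increasing in $t$ and diverges, so $V$ keeps increasing under optimization in this case.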
Is bullet point one true, or is there a condition that I’m not assuming? E.g. if $V$ is the constant $0$ random variable and $X$ is $N(0, 1)$, then the limit result holds, but a Gaussian is neither heavy- nor long-tailed.
I’m also assuming $V$ is not bounded above.