Empirically I very commonly see people around me (and myself) trying to agree, which is one of the reasons I wrote this post.
And that’s not obviously wrong. I’m more in favour of sharing models and info than of trying to nail down perfect agreement on a variable, but it seems fairly sensible to me to treat disagreement as a sign of where model-sharing has high returns on investment. This post is necessary not because optimising for agreement is terrible, but because it’s useful. Even so, I think many people fall into the failure modes of causal goodharting.