I love this post. (Somehow only just read it.)
My fav part:
> In the context of quantilization, we apply limited steam to projects to protect ourselves from Goodhart. “Full steam” is classically rational, but we do not always want that. We might even conjecture that we never want that.
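For concreteness (my own gloss, not something from the post): a q-quantilizer samples from a base distribution restricted to the top q fraction of actions by the proxy utility, instead of taking the argmax. A minimal sketch, where `proxy_utility` and `base_probs` are hypothetical stand-ins:

```python
import numpy as np

def argmax_action(actions, proxy_utility):
    """'Full steam': pick the action that maximizes the proxy.
    Most exposed to Goodhart when the proxy diverges from true value."""
    return max(actions, key=proxy_utility)

def quantilize(actions, proxy_utility, base_probs, q=0.1, rng=None):
    """q-quantilizer: sample from the base distribution restricted to the
    top q fraction of actions by proxy utility. 'Limited steam': the
    optimization pressure applied on top of the base distribution is
    bounded by q."""
    rng = rng or np.random.default_rng()
    order = np.argsort([-proxy_utility(a) for a in actions])  # best first
    k = max(1, int(np.ceil(q * len(actions))))                # top-q cutoff
    top = order[:k]
    p = np.asarray([base_probs[i] for i in top], dtype=float)
    p /= p.sum()  # renormalize the base measure on the top-q set
    return actions[rng.choice(top, p=p)]
```

(Only a sketch of the general idea; the post's point is about how much pressure to apply, not this particular implementation.)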
To elaborate a bit:
It seems to me that when I let projects pull me only insofar as they pull me, and when I find a thing interesting enough that it naturally “gains steam” in my head, it somehow increases the extent to which I am locally immune to Goodhart (e.g., my actions and writing go deeper than I might’ve expected). OTOH, when I try hard on a thing despite losing steam as I go, I am more subject to Goodhart (e.g., I complete something with the same keywords and external checksums I thought I needed to hit, but it has less use and less depth than I might’ve expected).
I want better models of this.