Forcing Yourself is Self Harm, or Don’t Goodhart Yourself
I recently wrote about how forcing yourself to keep your identity small harms you in the same way suppressing emotions is self harm. I got two kinds of criticism on the piece. One was that people seemed to read “actively working to keep your identity small is bad” as “keeping your identity small is bad” and disliked what they perceived as a contrary position on advice they like. This was at least partially my fault because the original title of the post didn’t make this very clear. The other criticism was that I had made a general argument that “forcing yourself to do X is self harm” but had used it only to address keeping your identity small.
So, here is a better titled and generalized version of that post.
There are any number of virtuous things we might strive for. Here’s a small list of the kinds of things I have in mind:
productivity
being nice
eating healthy
exercising regularly
getting enough sleep
keeping your identity small
meditating
learning something you consider important, like a foreign language or math
writing more
keeping the house clean
A tempting strategy to achieve these things is some variant of forcing, striving, trying real hard, or otherwise just actively and directly pushing yourself to achieve the goal. This typically looks a few ways:
punishing yourself when you fail
creating rewards for yourself when you succeed
beating yourself up for not being better
creating incentives to “trick” yourself
exerting self control or willpower
The problem is that these methods all have a common fatal flaw: they require optimizing towards a measured target and so are subject to Goodhart effects.
Here’s an illustrative story of what can happen:
You want to be more productive, so you start measuring how productive you are. The metric used isn’t that important for this example, so let’s say it’s something like words written per hour. Using a spreadsheet and a simple word counting program, you capture this information and find that you can write about 500 words of useable content per hour at your most productive. Great, now you can start holding yourself accountable and expecting, let’s say, at least 400 words per hour.
Uh oh, it’s a Tuesday, you look at the clock, and you realize you’ve only written 100 words in the last hour. Well, that’s okay, you’ll do better next hour. But another hour passes and you only wrote 100 more words. This is looking bad. The day goes on like this. You go to bed thinking “well, I’ll do better tomorrow”.
The days pass and you keep having bouts where you don’t meet the target. You start to feel some mental anguish at the thought of sitting down to write because you reasonably predict that you might not meet your target. You feel less motivated to write, and find yourself in a positive feedback loop, writing less because you are distracted feeling bad about not writing enough. Pretty soon you rarely have productive periods of writing where you crank out more than 400 words in an hour.
Now you want to write, but hesitate to because of the pain of failing to meet your own expectations. You agonize over the cognitive dissonance of both wanting to write and wanting not to fail to meet your target. Absent some external motivation, you just stop writing, because the joy of producing words doesn’t compare with the distress of not producing enough words. You’ve forgotten you started out trying to be more productive and are now trapped doing less than nothing, because you spend time pining for the days when you would write anything at all.
Most people have been through something like this, but replace writing and words with learning and grades, work and money, love and dates, or health and calories.
A common way to think about what happened is to put it in terms of multi-agent models of mind. Internal Family Systems provides a good model here: some parts of your mind try to protect you by “exiling” other parts of your mind that want things you think are bad. In the story above it was the part that would notice you didn’t hit your writing target. If one of those exiles tries to “sneak back in” (e.g. you think you can write, which implicitly means letting back in the part that notices how much you wrote), the protecting part of you uses a host of techniques like guilt, forgetfulness, and anger to keep the exile out. Over time this exiling can lead to dissociation at best and cognitive fusion at worst, causing you to become confused about what’s happening in your own mind as a feature of protecting yourself from doing “bad” things (e.g. you want to write and don’t know why you don’t or how to change the situation).
As an alternative to a multi-agent model, you can also think of this simply in terms of competing desires and beliefs with some overpowering others because they are not in complete reflective equilibrium. There’s also probably a predictive processing take on what’s happening, but I’ve not worked out the details, so I’ll leave it as an exercise to the reader. The point is that, however you conceptualize what’s happening, by trying to force yourself to be a particular way you harm yourself: you engage in mental behaviors that make you less aware of yourself, and so ultimately less fluid and agenty, because that’s the most direct way to achieve what you want, i.e. you never fail if you never try.
This leaves us with two issues to address: what should you do if you’ve already harmed yourself in the way described, and what should you do instead of Goodharting?
To answer the first question, a few things seem helpful here. If you are out of touch with yourself, especially your emotions or feelings, focusing seems like a great choice. Even if you don’t think you’re cut off from your emotions, focusing can still be a good thing to try because sometimes a person can become so cut off they don’t realize they are, akin to not realizing you can be well rested because work, school, etc. schedules keep you sleep deprived for years and you forget that there’s another way. Other simple interventions you can try yourself include journaling and long, solitary walks that may help you discover what you’ve lost connection with in yourself.
If those don’t help, therapy may be necessary, as the therapist can act as a mirror to help you see yourself in ways you can’t on your own. Internal Family Systems, for example, is not just a model of the psyche but a theory of psychotherapy, and a therapist can help with reintegrating parts.
I’d also be remiss if I didn’t suggest that meditation practice can help, because it can, but it requires a large commitment to get results. Ten minutes a day of mindfulness meditation with an app is nice, but it’s unlikely to help you find a path out of the kind of self-harm discussed above, so when I suggest meditation I have in mind something more on the order of daily practice that involves a community and a teacher.
As to what you should do instead of Goodharting yourself by forcing yourself to do things, I’d generally describe the alternative strategy as nonmonotonically seeking Pareto improvements. In less jargony terms, that’s trying to find ways to make yourself broadly rather than narrowly better, accepting that you might get worse at some things for a while as you explore the space, and achieving virtuous things as a downstream consequence or side-effect of becoming generally more virtuous across many dimensions simultaneously. This can be frustrating because the feedback loop is often long and you have to trust in and have patience with yourself to keep going even when it’s not clear what to do next or when you’ll see results.
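To make the jargon a bit more concrete, here is a minimal sketch of what “nonmonotonically seeking Pareto improvements” could look like if you wrote it down. Everything in it is an assumption for illustration: the dimension names, the 0 to 10 self-ratings, and the function `is_pareto_improvement` are all made up, and I’m not suggesting you actually score yourself this way (that would just be another target to Goodhart on).

```python
from typing import Dict


def is_pareto_improvement(before: Dict[str, float], after: Dict[str, float]) -> bool:
    """Return True if no dimension got worse and at least one got better."""
    no_worse = all(after[k] >= before[k] for k in before)
    some_better = any(after[k] > before[k] for k in before)
    return no_worse and some_better


# Hypothetical self-ratings on a 0-10 scale (illustrative numbers only).
start = {"sleep": 5, "writing": 4, "mood": 5}
dip = {"sleep": 6, "writing": 3, "mood": 5}    # writing got worse for a while
later = {"sleep": 7, "writing": 5, "mood": 6}  # broadly better than the start

print(is_pareto_improvement(start, dip))    # False
print(is_pareto_improvement(start, later))  # True
```

The path from `start` to `later` passes through `dip`, which is not an improvement on every dimension. That temporary dip is the “nonmonotonic” part: what matters is ending up broadly better, not improving on every step.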
Ways you might carry such a program out include self-improvement through things like positive psychology, a mentorship relationship with someone who can help guide you, therapy, and, as you might expect me to suggest, Zen or another Buddhist practice. This is the path of becoming more virtuous overall so that necessarily you’re more virtuous along any particular dimension you choose to look.
The best endorsement of this approach I can offer is myself. In my teens and twenties I tried to narrowly optimize myself in various ways to correct what I felt were defects, like not reading enough, not learning math fast enough, not caring for other people enough, and spending too much time doing frivolous things. The result was that I was a mess of depression and anxiety and profoundly out of touch with my emotions and desires. Things got better only after I failed to force myself to achieve something for the Nth time and finally accepted that it was a failing strategy for living a life I would find fulfilling. At that point I stepped back and just started taking care of myself, and slowly discovered that focusing on general capacity building worked way better and undid much of the damage I had done to myself.
Now in my late 30s I’m living the best life I have so far: I’m doing things I find fulfilling, I’m happy, and I seem to keep getting better at things I was previously bad at. Past performance may be no guarantee of future gains, but given the natural experiment I carried out on myself it certainly seems that, at least for me, the theory that focusing on Pareto improvements over narrow improvements is better for mental health has proven correct. Maybe it will be for you, too.
The following comment isn’t exactly a criticism. It’s more just exploring the idea.
I still struggle to really get on board with the advice you offer here while at the same time thinking that the general idea has a lot of merit. I think that both making yourself broadly better and focusing on narrow areas is maybe the best approach.
Take your illustrative story. I’d say the problem here is not that the person is trying to focus on the narrow area of increasing productivity. It’s that they picked a bad metric and a bad way of continually measuring themselves against the metric. The story just kind of glosses over what I would say is the most important part!
I’d say that 65%-75% of the problem this person has is that they apparently didn’t seriously think about this stuff beforehand and pre-commit to a good strategy for measurement.
The person who looks and says “I only wrote 100 words last hour?!??!” kind of reminds me of the investor checking their stock prices every day.
For this person three months or six months or a year might be a better time frame for checking how they’re doing. Regardless, the main point I want to make is that how well this person would be able to improve themselves in this area while maintaining their well being is largely dependent upon making good decisions on this very important question.
On the other hand, making good decisions about this is also part of your advice...aka, keeping broad self-improvement in mind.
FWIW, I’ve lived most of my adult life (I’m in my mid 40s) basically with this sort of mindset...focusing on specific areas of self-improvement but also being well aware of how it might affect my broad well-being and taking that into account. I think everyone who knows me would tell you I’m a well-adjusted, friendly, and happy person.
That being said, I feel like a lot of that is inherent in my personality so I’m not sure how much weight to give my personal experience.
Also, I wanted to say that I know many people who really came into their own in their mid-to-late thirties. I think a lot of people just start getting their life into order by that time, so I’m also not sure how much weight to give your personal experiences in this area.
If this is like, established fact or something...I did not know this, and I understand why the hypothetical person was also unaware of this.
Yes. But since I don’t expect to see an RCT anytime soon*, if anyone, whether you (Dustin) or the OP (Gordon)**, wrote posts about ‘things that improved my life’, I’d be interested to see those posts, and read them while keeping in mind that they’re not (necessarily) literal laws of physics, and different things might work for different people, especially when things are as vague as ‘keep your identity small’ and ‘don’t force that’. (How small is small? I don’t think I’ve seen ‘Make your identity big’ (and I won’t write it because I don’t know how to make it bigger.))
*If you’ve heard of something let me know.
**or both
I have heard this advice repeatedly, but I guess it is quite easy to miss it. People probably either don’t know it, or consider it too obvious to mention. (And a few people benefit from you not knowing the advice, so they can profit from your fear, advising you to sell what you have and buy something else, charging you a commission, and you are so thankful that someone observes these changes 24/7 for you.)
On the other hand, many people seem unable to follow this advice even if they hear it. Like, a few of my colleagues who bought Bitcoins. Every day they were like “look, it’s $10 higher, we made a profit!” or “oh, it’s $10 lower, this sucks!”, and I was like “guys, calm down, it’s not important what happens in a day, the only thing that matters is what happens over years” and tried to explain that the $10 is not even worth the amount of stress they feel… and that the only thing they need to do is to simply stop watching the news, return a year later, with certain probability they will lose everything (if this is not acceptable for you, do not invest), and with certain probability they will make a nontrivial profit.
I guess following this advice is emotionally difficult. About a month later, some colleagues said they sold the Bitcoins, because it was “too stressful” for them to think about it. Seems like “simply do not think about it” is a difficult skill, almost like meditation. On the positive side, if you succeed in doing it for a few weeks, it becomes much easier, because now you are used to the situation and it is no longer exciting. (I have a similar experience with Facebook and other websites: the longer you are without them, the less you miss them.)
As Viliam says, it’s something I’ve heard constantly throughout my life. However, the hypothetical person not having heard of it relates to the points I’m trying to make. I’m saying that rather than telling them to focus on something other than performance, telling them how to better measure themselves might be the better course.
To be clear, this is exactly why I tried to couch all my language in this thread in “might”, “I think”, and other terms to indicate that not only am I not sure, but I’m not sure how anyone can be sure about this subject.
When I say to the OP, “I’m also not sure how much weight to give your personal experiences in this area.”, I think I’m saying the same thing you’re saying. I’m not trying to say in a roundabout way that I don’t believe the experiences of the OP. I’m saying my literal state of mind. I also want the OP to post posts like this one for the same reasons you describe.
This is one of the weird issues with what I see as the problem I’m trying to illustrate with the story and the limitations of telling a single story about it.
What you say is true, but it’s a reduction of the problem to be less bad by applying weaker optimization pressure rather than an actual elimination of the problem. Weak Goodharting is still Goodharting and it will still, eventually, subtly screw you up.
This post is also advice, and so aimed mostly at folks less like you and more like the kind of person who doesn’t realize they’re actively making their life worse rather than better by trying too hard.
I think all self improvement is subject to Goodharting, even the type you recommend.
The best things available to us to do about that:
Be nimble and self-aware. Adjust your processes to notice when you’re harming yourself.
Be thoughtful in how you measure success.
I do not think this is actually a contradiction to your post, but, at least for me, it seems like a more actionable framing of the issue.
In particular, I’d worry that “not Goodharting yourself” is Goodharting yourself. Dunno that I have this very coherently, but things that feel like hooks:
Selling nonapples.
Don’t try to “get better at rationality”; rationality needs to have a goal outside itself.
“How are you doing at your goals?” “Well, I stopped measuring things that weren’t perfect metrics for them.”
This describes me and my issues exactly! THANK YOU for verbalizing this and providing something to try that’s worked for you! This seems right to me and has given me some hope!! Much appreciated.