Hi Ben, thanks for commenting.

What I’d first like to say is that negative reinforcement and punishment are two different things. What you’re describing as “punishment” is actually just negative feedback: i.e., noticing that something you’re doing isn’t working. Punishment, by contrast, is something we do to raise someone’s costs for bad action, and it does not necessarily produce any reinforcement in the person being punished.
In “Ingvar’s” case, for example, he constantly punished himself for surfing the internet, but this was actually positively reinforcing for the behavior of self-punishment itself, and did nothing to discourage the internet surfing behavior!
Even within the technical context of behaviorist learning, “punish” and “negatively reinforce” are two different things… and punishment does not do what you seem to be thinking it does.
Technically, what happens when you punish an animal or person, is that you end up positively reinforcing whatever works quickest to stop the punishment. Punishment, in and of itself, does not actually alter behavior. The only thing it trains you (or any other animal) to do is to avoid the punishment.
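To make that concrete, here’s a toy simulation, entirely my own sketch (the action names, probabilities, and learning rule are invented for illustration): an agent under ongoing punishment learns whichever response ends the punishment fastest, not the lesson the punisher intended.

```python
import random

random.seed(0)  # reproducible run

actions = ["change_behavior", "escape"]             # possible responses to punishment
end_prob = {"change_behavior": 0.2, "escape": 0.9}  # chance each action ends the punishment
value = {a: 0.0 for a in actions}                   # learned value of each action
alpha = 0.1                                         # learning rate

for episode in range(2000):
    # Pick the currently higher-valued action, with a little exploration.
    if random.random() < 0.1:
        a = random.choice(actions)
    else:
        a = max(actions, key=value.get)
    # The only "reward" here is whether the punishment stopped this step.
    reward = 1.0 if random.random() < end_prob[a] else 0.0
    value[a] += alpha * (reward - value[a])

# The agent ends up valuing escape far above actually changing its behavior.
print(value)
```

Nothing in this loop ever reinforces the originally punished behavior going away; the learner simply converges on the cheapest way to switch the punishment off.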
And when you are applying social punishment of the type described in this article, the thing that stops it is (e.g. in Sara’s case) ideation. The thing that turns off self-punishment is imagining a future in which you are a better person and the bad thing can’t happen any more. So, in a behaviorist reinforcement sense, by punishing yourself in this fashion you are training yourself to imagine better futures, because that’s the fastest way to stop the pain.
IOW, properly understood, the only functional use of punishment is to raise the costs of bad behavior. But in the self-applied case, raising your own costs accomplishes nothing, especially once you factor in the moral licensing that comes from being virtuously self-punishing, and the fact that you’re effectively training yourself to imagine things being better instead of actually doing anything to make them better.
So in that sense, I will say, no, it’s not the case that punishing yourself (using either the social or behaviorist definition) is a useful strategy for anything other than convincing others not to punish you (worse) for the same thing. That is the one way in which punishing yourself is actually useful, and it’s often how we learned to do it. (That is, to punish ourselves for the same things our parents punished us for, to lessen their desire to punish us.)
That being said, we probably have different definitions of what “punishment” actually consists of. In this post, I mean in the sense of “attacking reputation to raise the target’s costs”, not “negative feedback to shape behavior”, which is something else altogether.
People routinely confuse these two things, because our moral bias tells us that we must not let wrongs go unpunished. So we distort what behaviorism actually says about learning into “reward and punishment”, when in fact neither reward nor punishment is a reliable reinforcement strategy! (For one thing, rewards and punishments usually arrive too long after the actual behavior to have any meaningful effect, though that’s not the only problem.)
The mindset of reinforcing actual behavior and the mindset of rewarding or punishing what we think should be done are very, very different in practice, but our brains are biased towards confusing the two.
As for Sara, I think perhaps you are overgeneralizing from Carlos’s example. I have different examples in the article because there are many different ways for “punishing based on counterfactuals” to manifest. What I did not cover in Sara’s case (or Ingvar’s for that matter) is that the surface-level “shoulds” being discussed were not the root issue. As I mention later in the article, one begins with whatever one is aware of, but working on these initial “should” statements then leads us deeper into the belief network.
For example, Ingvar believed he should have been working, and should have been able to finish in a certain amount of time. But the solution to this problem was not “grieve for not having worked”! It was discovering that the real issue was believing he was a bad person unless he was working. Removing that belief stopped him from generating counterfactuals about how he should have been working, which then led to him thinking of ways to actually get the work done.
IOW, it’s the deactivation of the punishment system that’s relevant here, because its activation blocked him from thinking about the actual process of work and the trade-offs involved, due to the “sacredness” of punishing himself for being a lazy evildoer who wasn’t working.
In the same way, Sara’s root issue isn’t that she’s punishing herself for her failed actions, it’s that she believes she needs to prove herself… or else she’s not a capable person. It’s that underlying belief which motivates the generation of the counterfactuals in the first place.
The full chain of events (for Sara and Ingvar) looks something like this:
Step 1: Learn that a personal quality or behavior is subject to punishment by others (e.g. badness, incompetence)
Step 2: Try to avoid feeling bad by creating an ideal of some kind (e.g. punish oneself for evil, seek recognition to prove competence) that will counteract this and avoid future punishment
Step 3: Encounter situations in life that remind one of the quality learned about in step 1
Step 4: Generate counterfactuals based on the ideal to stop the punishment (Sara) or punish oneself for failing to make the ideal happen (Ingvar)
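The four steps can be sketched as a loop, purely as a structural illustration (the function and variable names here are my own, not from the article):

```python
def respond_to_reminder(step1_belief_active: bool, reminder: str) -> str:
    """One pass through the cycle: a reminder (step 3) either triggers
    the ideal-driven counterfactual (step 4) or permits practical thought."""
    if step1_belief_active:
        # Steps 2 and 4: the learned ideal kicks in and generates "should"
        # ideation, whose real function is just to make the punishment stop.
        return f"I should never have let {reminder} happen"
    # With the step 1 learning removed, the same reminder can be met
    # with ordinary goal-directed reasoning about trade-offs.
    return f"given {reminder}, what's the best way to reach my goal?"

print(respond_to_reminder(True, "the missed deadline"))
print(respond_to_reminder(False, "the missed deadline"))
```

The point the sketch makes is structural: steps 2 through 4 are all downstream of the condition established in step 1, which is why that is the only real leverage point.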
Here’s the thing: the only part of this cycle that you can meaningfully change is the learning in step 1, because otherwise, every time the person encounters a reminder in the world, the punishment will be remembered, sustaining the motivation for avoidance. Without this punishment cycle in effect, the person can actually think about what would be a good way to reach their goals. But with the cycle in effect, all the person can think about when it comes up is the fastest way to make the hurting stop!
I covered this more with the Ingvar example than the Sara one, but knowing how to do something doesn’t help in this cycle, because it produces the “yeah, but...” response. From inside of this cycle, practical advice literally seems irrelevant or off-topic, or at best misguided. People inside the loop say things like, “yeah, but it’s not that simple” or “you just don’t understand”, when you try to give them practical advice.
ISTM that you have overgeneralized from Carlos’s example that this process is all about grief. But even in Sara’s case, it’s important to understand that she cannot actually accept or act on negative feedback without first acknowledging what actually happened. If a semantic stop sign pops up in her brain every time she tries to consider ways to behave (because in order to do that she has to think about what she actually did or might do), then she can’t really think about how to act differently, only ruminate about how she ought to have done something else.
So when we say “we should have done X” or “I should do Y”, we are not actually saying the full truth. What we are doing is denying the underlying reality that we did not do X, and we don’t want to do Y.
Sara actually knew, going into the conference, that she tended to be stubborn, and specifically thought ahead of time that she should not be. The problem is that “I should not do X” is an argument with reality: you know full well ahead of time that you probably will do X, but see this as wrong (in a moral sense, rather than a functional one). This motivates you to deflect the perception (and associated punishment) by asserting that you should do the right thing. (Like Ingvar asserting he should get the work done in an afternoon.)
I hope that the above explanation clarifies better what this article is driving at. The issue is that anytime we start thinking about what we or other people “ought” to do—as a moral judgment—we immediately “taboo tradeoffs” and disengage from practical reasoning. We’re no longer in a state of mind where feedback from what actually happened is even being taken into account, let alone learned from.
Finally, as for your comments on relationships, I’m just going to say that most of what you said has no real bearing on Carlos’s actual situation, which I will not comment further on as it would reduce his anonymity. But I do want to address this point:
“I read the section on Carlos, and it seems like the explicit content was that you should always give up on relationships when they’re making you angry, and while there’s a deep truth to that with long-term relationships, I don’t think it should be the standard solution. The standard solution to being angry at someone is to follow through and make sure the cause is resolved, such that your anger reaches its natural conclusion. This is true even when it’s built up for a while. Often there’s something important that’s been left unsaid, and needs communicating.”
So, this is an overgeneralization, again, because nothing in this post recommends any object-level behaviors. What the post discusses is the fact that, when you are counterfactualizing with moral judgment attached, you cannot reason properly. Your brain hijacks your reasoning in the service of your moral judgment, so you have literally no idea what actually should be done on the object level of the situation.
The solution to this problem, then, is to disable the hijacker so you can get back in the cockpit of the plane and figure out where you want to fly. In Ingvar’s case, he immediately began seeing other ways he could behave that would get to his goals better, and I had no need to advise him on the object level. The issue was that with his moral judgment system active, he literally could not even consider those options seriously, because they weren’t “punish someone” or “make the pain go away NOW”.
With regard to relationships, as with everything else this article talks about, the solution is to begin with whatever the actual ground truth of the situation is. If you are insisting that the other person in a relationship “should” be doing something, and that the only solution is to express anger in their direction, then you will miss the clue that sometimes, being angry at people doesn’t change them… but positive reinforcement might.
(But of course, when we’re thinking morally rather than strategically, we think it’s wrong to use positive reinforcement, because the other person doesn’t “deserve” it. They should just do the right thing without being rewarded, and they should be punished for not doing the right thing. So saith the moral judgment brain, so shall it be!)
Another problem is where you say, “make sure the cause is resolved, such that your anger reaches its natural conclusion”. The thing is, our anger’s “natural conclusion” is when somebody has suffered enough. (Notice, for example, how somebody who accedes to angry demands but does not appear remorseful will often just make the demander angrier. If anger were about resolving the actual issue, this would not make sense.) And “suffering enough” doesn’t always correspond with an actual solution, either: note how often people end up stuck in abusive relationships because the abuser is really good at appearing remorseful!
So, following anger to its “natural conclusion” can easily lead you astray, compared to clearing your head and acting strategically. It can be almost impossible to enact, say, “tough love” when you are stuck in your own moralizing about how someone ought to behave, both because you can’t think it through, and because it’s hard to do the “love” part while your brain is urging you to make someone suffer for their sins.
Anyway, in summary: if you are arguing object-level recommendations from this article, you’ve confused your inferences with my statements. The only advice this post actually gives is to disengage your moral judgment if you want to be able to actually solve your problems, instead of just ruminating about them or punishing yourself for them. (And I guess, to avoid recursively making a “should” out of this idea, since that’s just doing more of the problem!)
[Edit to add: I have added a new section to the article, called The Disclaimer, to clarify that none of the stories contain, nor are intended to imply, any object-level advice for the depicted situations, and that rather, the article’s focus is on the problem of moral judgment impairing our ability to reason about the truth, and even perceive what it is in the first place.]