Here’s my definitely-wrong-and-overly-precise model of productivity. I’d be happy if someone pointed out where it’s wrong.
It has three central premises: a) I have proximal (basal; hardcoded) and distal (PFC; flexible) rewards. b) Additionally, or perhaps for the same reasons, my brain uses temporal-difference learning, but I’m unclear on the details. c) Hebbian learning: neurons that fire together, wire together.
If I eat blueberry muffins, I feel good. That’s a proximal reward. So every time my brain produces a motivation to eat blueberry muffins, and I take steps that make me *predict* that I am closer to eating blueberry muffins, the synapses that produced *that particular motivation* get reinforced and are more likely to fire again next time.
The brain gets trained to produce the motivations that more reliably produce actions that lead to rewards.
If I get out of bed quickly after the alarm sounds, there are no hardcoded rewards for that. But after I get out of bed, I predict that I am better able to achieve my goals, and that prediction itself is the reward that reinforces the behaviour. It’s a distal reward. Every time the brain produces motivations that in fact get me to take actions that I in fact predict will make me more likely to achieve my goals, those motivations get reinforced.
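As a toy illustration of how a prediction can act as a reward, here is a minimal TD(0) sketch. Everything in it is an assumption made up for the example: the three-state chain (`in_bed`, `out_of_bed`, `goal_achieved`), the single reward of 1.0 on reaching the goal, and the values of `alpha` and `gamma`. It is not a claim about how the brain implements temporal-difference learning.

```python
def td0_chain(episodes=200, alpha=0.1, gamma=0.9):
    """Run TD(0) on a hypothetical chain: in_bed -> out_of_bed -> goal_achieved.

    Returns the learned state values and the TD error observed on the
    in_bed -> out_of_bed step in each episode.
    """
    states = ["in_bed", "out_of_bed", "goal_achieved"]
    # Assumption: only reaching the goal carries a "hardcoded" reward.
    reward_on_entering = {"out_of_bed": 0.0, "goal_achieved": 1.0}
    value = {s: 0.0 for s in states}  # the terminal state's value stays 0
    getting_up_errors = []

    for _ in range(episodes):
        for s, s_next in zip(states, states[1:]):
            r = reward_on_entering[s_next]
            # TD error: reward plus discounted prediction, minus old prediction.
            td_error = r + gamma * value[s_next] - value[s]
            if s == "in_bed":
                getting_up_errors.append(td_error)
            value[s] += alpha * td_error

    return value, getting_up_errors


if __name__ == "__main__":
    value, errors = td0_chain()
    print(value)
    print(errors[:3])
```

In the first episode the TD error for getting up is zero, because nothing has been learned yet. From the second episode on it is positive even though getting up carries no reward of its own: the learned value of `out_of_bed` is doing the reinforcing. Over many episodes the value of `out_of_bed` approaches 1.0 and the value of `in_bed` approaches `gamma * 1.0`.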
But I have some marginal control over *which motivations I choose to turn into action*, and some marginal control over *which predictions I make* about whether those actions take me closer to my goals. Those are the two levers with which I am able to gradually take control over which motivations my brain produces, as long as I’m strategic about it. I’m a fledgling mesa-optimiser inside my own brain, and I start out with the odds against me.
I can also set myself up for failure. If I commit to, say, studying math for 12 hours a day, then… at first I’m able to feel like I’ve committed to that, as long as I naively expect, right then and there, that the commitment takes me closer to my goals. But come the next day when I actually try to achieve this, I run out of steam, and it becomes harder and harder to resist the motivations to quit. And when I quit, *the motivations that led me to quit get reinforced because I feel relieved* (proximal reward). Trying-and-failing can build up quitting-muscles.
If you’re a sufficiently clever mesa-optimiser, you *can* make yourself study math for 12 hours a day or whatever, but you have to gradually build up to it. Never make a large ask of yourself before you’ve sufficiently starved the quitting-pathways to extinction. Seek to build up simple well-defined trigger-action rules that you know you can keep to every single time they’re triggered. If more and more of input-space gets gradually siphoned into those rules, you starve alternative pathways out of existence.
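The difference between ramping up and making a large ask on day one can be sketched with a deliberately crude toy model. All of it is hypothetical: the two pathway strengths, the rule that the pull to quit scales with the hours asked, the learning rate `lr`, and the multiplicative decay of an unused quitting pathway. It only shows the shape of the argument, not a real model of motivation.

```python
def simulate(asks, persist=1.0, quit_=1.0, lr=0.2):
    """Simulate two competing pathways over a list of daily asks (in hours).

    Returns (persist_strength, quit_strength, days_kept).
    """
    days_kept = 0
    for hours in asks:
        # Assumption: the urge to quit grows with the size of the ask.
        pull_to_quit = quit_ * hours
        if persist >= pull_to_quit:
            # Kept the rule: the persisting pathway is reinforced and
            # the unused quitting pathway is starved a little.
            persist += lr
            quit_ *= 1 - lr
            days_kept += 1
        else:
            # Quit: relief reinforces the quitting pathway.
            quit_ += lr
    return persist, quit_, days_kept


# Gradual build-up: 1 hour for two days, then 2 hours, up to 12 hours.
ramp = [1 + day // 2 for day in range(24)]
# Large ask from day one: 12 hours every day.
crash = [12] * 24

if __name__ == "__main__":
    print(simulate(ramp))
    print(simulate(crash))
```

Under these made-up numbers the ramp is kept on all 24 days and ends with a quitting pathway near zero, while the 12-hours-from-day-one plan is never kept and only strengthens the quitting pathway.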
Thus, we have one aspect of the maxim: “You never make decisions, you only ever decide between strategies.”