Your brain already has the ability to update its cognitive strategies (this is called “meta-cognitive reinforcement learning”). However, the usual mechanism works with unnecessary levels of indirection, as in:
Cognitive strategy → Thought → Action → Reward or punishment
You get rewarded or punished for what you do (as measured by your brain’s chemical responses). Good thoughts are more likely to be followed by good actions. Good cognitive strategies are more likely to generate good thoughts. On average, your brain will slowly update its cognitive strategies in the right direction.
Cognitive strategy → Thought → Reward or punishment
You have learned to be happy or unhappy about having certain ideas, even when you don’t yet know how they apply to the real world. Now your brain gets rewarded or punished for thoughts, and on average good thoughts are more likely to be generated by good cognitive strategies. Your brain can update cognitive strategies faster, according to heuristics about what makes ideas “good”.
However, by carefully looking at the “deltas” between conscious thoughts, we can get rid of the last remaining level of indirection (this is the key insight of this whole page!):
Cognitive strategy → Reward or punishment
You have learned to perceive your cognitive strategies as they happen, and developed some heuristics that tell you whether they are good or bad. Now your brain can update cognitive strategies immediately, and do it regardless of the topic of your thoughts.
Even when you generate a useless idea from another useless idea, you can still track whether the cognitive strategy behind it was sound, and learn from the experience.
</quote>
(It doesn’t look like it’s possible to quote bullet points, especially not nested bullet points, and I didn’t want to remove more than one layer of bullets because I thought they made the whole thing more clear.)
The rest of the linked post is mostly about how to actually go about implementing this. (And I feel like that probably deserves a book and regular practice, rather than just a short blog post. So, if you want to notice and learn better cognitive strategies, reading the full thing is well worth the time investment.)
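To make the "levels of indirection" idea in the quote a bit more concrete, here is a toy model. It is not taken from the linked post, and every name and number in it is an assumption made for illustration. It treats each arrow that is only "more likely" to preserve goodness (strategy to thought, thought to action) as a noisy step that passes the good/bad label along with some fixed probability and flips it otherwise. Under that assumption, each noisy step multiplies the correlation between "the strategy was good" and "the reward was positive" by the same factor, so the feedback about the strategy gets weaker with every extra step.

```python
"""Toy model: how noisy intermediate steps dilute a reward signal."""

# Hypothetical value: the chance that "good" stays "good" (and "bad" stays
# "bad") across one noisy step. Nothing here comes from the linked post.
DEFAULT_FIDELITY = 0.8


def signal_strength(noisy_steps: int, fidelity: float = DEFAULT_FIDELITY) -> float:
    """Correlation between strategy quality and the final reward.

    Each noisy step keeps the good/bad label with probability `fidelity`
    and flips it otherwise, which scales the correlation by (2*fidelity - 1).
    """
    if noisy_steps < 0:
        raise ValueError("noisy_steps must be non-negative")
    if not 0.5 < fidelity <= 1.0:
        raise ValueError("fidelity must be in (0.5, 1.0]")
    return (2.0 * fidelity - 1.0) ** noisy_steps


def relative_samples_needed(noisy_steps: int, fidelity: float = DEFAULT_FIDELITY) -> float:
    """Rough number of experiences needed, relative to direct feedback.

    Uses the common rule of thumb that the sample count needed to detect
    an effect grows like 1 / (effect size squared).
    """
    return 1.0 / signal_strength(noisy_steps, fidelity) ** 2


# The three chains from the quote, with the assumed number of noisy steps
# between the cognitive strategy and the reward.
CHAINS = [
    ("strategy -> thought -> action -> reward", 2),
    ("strategy -> thought -> reward", 1),
    ("strategy -> reward", 0),
]

if __name__ == "__main__":
    for name, steps in CHAINS:
        print(
            f"{name}: signal={signal_strength(steps):.2f}, "
            f"relative samples={relative_samples_needed(steps):.1f}"
        )
```

With the assumed fidelity of 0.8, the full chain needs roughly 7.7 times as many experiences as direct feedback on the strategy, and the thought-level chain roughly 2.8 times as many. The exact numbers mean nothing; the point is only that each removed step makes the same amount of experience more informative.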