Summary:
Sometimes, reinforcement learning goes wrong: how can this be prevented?
Example: math education
One student simply “learns to follow along”, while the other “learns to predict what comes next”.
The second student may gain the ability to solve math problems on their own, while the first plausibly won’t.
Turbocharging, general notes:
Idea: You get better at the things you practice, and it pays off to think about what, mechanistically, you want to learn.
You won’t just learn “what you intend”:
If you intend to learn to disarm an attacker but hand the weapon back to your training partner after each disarm, then handing the weapon back is part of what you learn.
Example of math student revisited: are they…
Actively thinking about the symbols?
Calling up related material from memory?
Generating hypotheses (instead of falling prey to hindsight bias)?
Thinking about the underlying structure of the problem?
← These questions determine what’s actually practiced.
The Turbocharging Algorithm:
Select a skill to be acquired/improved
Select a practice method (to be evaluated or to be strengthened/developed)
Evaluate the resemblance between method and skill:
Do the “practice triggers” resemble the real-world triggers, or at least plausibly generalize to them?
Do the “practice actions” resemble the real-world actions, or at least plausibly generalize to them?
Possibly adjust the practice method in response to the previous answers
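The steps above can be sketched as a small checklist in code. This is only an illustrative sketch, not something taken from the article: the class `PracticeMethod`, its fields, and the function `suggest_adjustments` are hypothetical names, and reducing each resemblance question to a yes/no answer is a simplifying assumption.

```python
from dataclasses import dataclass


@dataclass
class PracticeMethod:
    # All field names are hypothetical, chosen for this sketch.
    skill: str             # the skill to be acquired or improved
    description: str       # the practice method under evaluation
    trigger_matches: bool  # does the practice trigger resemble the real-world trigger?
    action_matches: bool   # does the practice action resemble the real-world action?


def suggest_adjustments(method: PracticeMethod) -> list[str]:
    """Return the adjustments implied by the two resemblance checks."""
    adjustments = []
    if not method.trigger_matches:
        adjustments.append("make the practice trigger resemble the real-world trigger")
    if not method.action_matches:
        adjustments.append("make the practice action resemble the real-world action")
    return adjustments


# The disarming example from the notes, with the answers assumed for illustration.
drill = PracticeMethod(
    skill="disarming an attacker",
    description="disarm a partner, then hand the weapon back",
    trigger_matches=True,   # assumed: the partner presents the weapon realistically
    action_matches=False,   # handing the weapon back is not the real-world action
)
print(suggest_adjustments(drill))
```

An empty list would mean that neither check calls for a change to the practice method; it says nothing about whether the method is actually effective.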
Further Notes
Declarative and procedural knowledge require different types of learning.
Turbocharging is for procedural learning, which is closer to what applied rationality is about.
The article lists many counterexamples to the claim that turbocharging is “the one and only” way to gain procedural knowledge.