Shard Theory notes thread
Values
Shaped by the reward system via RL mechanisms
Contextually activated heuristics shaped by the reward circuitry
Underlying Assumptions
The cortex is “basically locally randomly initialised”
The brain does self-supervised learning
The brain does reinforcement learning
Genetically hardcoded reward circuitry
Reinforces cognition that historically led to reward
RL is the mechanism by which shards form and are strengthened/weakened
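A toy sketch of the last point, to make "strengthened/weakened" concrete. Everything here is an assumption made for illustration: the `reinforce` function, the scalar "strength" per shard, the learning rate, and the shard names are hypothetical and are not part of shard theory itself. The only idea taken from the notes is that reward reinforces whatever cognition was active when it arrived.

```python
def reinforce(strengths, activations, reward, lr=0.1):
    """Return updated shard strengths after one reward event.

    strengths:   dict of shard name -> current strength (hypothetical scalar)
    activations: dict of shard name -> how active the shard was (0.0 to 1.0)
    reward:      signed output of the reward circuitry for this event
    lr:          learning rate (an arbitrary illustrative value)
    """
    updated = {}
    for name, strength in strengths.items():
        # Only shards that were active when the reward arrived get credit.
        delta = lr * reward * activations.get(name, 0.0)
        # Strength is floored at zero: a shard can fade out but not go negative.
        updated[name] = max(0.0, strength + delta)
    return updated


strengths = {"juice": 1.0, "play": 1.0}
# The juice shard was active and reward followed, so it is strengthened.
after_reward = reinforce(strengths, {"juice": 1.0}, reward=1.0)
# The same shard is weakened when its activation is followed by negative reward.
after_penalty = reinforce(strengths, {"juice": 1.0}, reward=-1.0)
```

Inactive shards are left untouched, which is the toy version of reinforcement being local to the cognition that actually ran.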
Shards and Bidding
“Shard of value”: “contextually active computations downstream of similar historical reinforcement events”
Shards activate more strongly in contexts similar to those where they were historically reinforced
“Subshard”: “contextually activated component of a shard”
Bidding
Shards bid for actions historically responsible for receiving reward (“reward circuit activation”) and not directly for reward
Credit assignment plays a role in all this that I don’t understand well yet
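A minimal sketch of contextual activation and bidding as described above. All specifics are assumptions for illustration only: contexts as small feature vectors, cosine similarity as the measure of "similar context", the `SHARDS` table, and the action names are hypothetical. Credit assignment is deliberately left out. Note that shards bid on actions, and no reward term appears anywhere in the choice.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical shards. "context" stands in for the situations in which the
# shard was historically reinforced; "bids" are the actions it pushes for.
SHARDS = [
    {"name": "juice", "context": [1.0, 0.0], "bids": {"reach_for_juice": 1.0}},
    {"name": "play", "context": [0.0, 1.0], "bids": {"grab_toy": 1.0}},
]


def choose_action(context, shards=SHARDS):
    """Sum every shard's bids, weighted by its activation, and pick the top action."""
    totals = {}
    for shard in shards:
        # A shard activates more strongly the closer the current context is
        # to the contexts where it was reinforced.
        activation = max(0.0, cosine(context, shard["context"]))
        for action, weight in shard["bids"].items():
            totals[action] = totals.get(action, 0.0) + activation * weight
    best = max(totals, key=totals.get)
    return best, totals


# A context that mostly resembles the juice shard's historical context.
action, totals = choose_action([0.9, 0.1])
```

In the juice-like context both shards still bid, but the juice shard's bid dominates because its activation is much higher.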