I’ve always (but not always consciously) been slightly confused about two aspects of shard theory:
1. The process by which weak reflex-agents amalgamate into more complicated contextually activated heuristics, and the process by which those heuristics in turn amalgamate into an agent that cares about worlds beyond its immediate actions.
2. Many illustrations of the feedback loop that develops shards in humans run into the problem that there is no clean intrinsic separation between the reinforcement machinery in humans and the world-modelling machinery. So why does shard theory lean so hard on the existence of a world model separate from the shard composition?
Both seem resolvable by applying the predictive processing theory of value. An example: if you are very confident that you will (say) be able to pay rent in a month, and then you don't pay rent, that is a negative update both on the generators of the belief and on the actions you performed leading up to the due date. If you do pay, it's a positive update on both.
This produces consequentialist behavior when the belief-values are unlikely to come true without significant action on your part (resolving the last transition in confusion (1) above), and it also produces capable agents whose beliefs and values are hopelessly entangled with each other, leaning into the confusion of (2).
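Here's a minimal toy sketch of that update rule in Python (all names and numbers are hypothetical illustrations I'm using to pin down the idea, not an implementation of any real predictive-processing model): a single "belief-value" object serves as a prediction, and its outcome reinforces or penalizes both the heuristics that generated it and the actions taken while holding it.

```python
# Toy sketch: a "belief-value" that doubles as a prediction and a goal.
# When the predicted outcome fails, we penalize both the belief's
# generators and the actions taken in the lead-up; when it succeeds,
# we reinforce both. All names below are hypothetical illustrations.

from dataclasses import dataclass, field


@dataclass
class BeliefValue:
    name: str
    confidence: float             # how strongly the outcome is predicted
    generators: dict[str, float]  # weights on the heuristics that produced the belief
    action_trace: list[str] = field(default_factory=list)  # actions taken while holding it


def update(belief: BeliefValue, action_weights: dict[str, float],
           outcome: bool, lr: float = 0.1) -> None:
    """Apply one prediction-error update to BOTH the belief's
    generators and the actions leading up to the outcome."""
    sign = 1.0 if outcome else -1.0
    # The error is large when a confident prediction fails,
    # or when an unlikely prediction succeeds.
    error = (1.0 - belief.confidence) if outcome else belief.confidence
    for g in belief.generators:
        belief.generators[g] += sign * lr * error
    for a in belief.action_trace:
        action_weights[a] = action_weights.get(a, 0.0) + sign * lr * error


# Example: the rent scenario from the text.
rent = BeliefValue(
    name="will pay rent next month",
    confidence=0.95,
    generators={"income-is-stable": 1.0, "I-budget-well": 0.8},
    action_trace=["took extra shift", "skipped luxury purchase"],
)
actions: dict[str, float] = {}
update(rent, actions, outcome=False)  # rent wasn't paid: both get downweighted
```

The point of the sketch is that one error signal flows into both the "belief" weights and the "action" weights, so there is no architectural line between world-modelling and reinforcement, which is exactly the entanglement in (2).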
h/t @Lucius Bushnaq for getting me to start thinking in this direction.
A confusion about predictive processing: where do the values come from?
lol, either this confusion has been resolved, or I have no clue what I was saying here.