I think the three sub-systems can be loosely mapped onto the structure discussed in "[Intro to brain-like-AGI safety] 3. Two subsystems: Learning & Steering", as follows:
The model-based system is the Learning System, except that the Learning System doesn't calculate value; it only learns to model better via reward prediction error.
The Pavlovian system is the Steering System, and it is the only system that provides ground-truth "value" (this value is low-level reward; abstract concepts of value are formed by the Learning System around this ground truth, but they exist only insofar as they are useful for predicting the ground truth).
The model-free system doesn't exist as a separate system; it sits in the shallower parts of the Learning System. I don't think it maps to the Thought Assessor, but I may be wrong.
In this framework, one could say, as Eliezer suspected, that the value originated outside the model-based system.
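
To make the division of labour above concrete, here is a minimal toy sketch in Python. Everything in it is an illustrative assumption rather than something taken from the linked post: the outcome names, the reward numbers, the learning rate, and the simple delta-rule update are all hypothetical. It only shows the shape of the idea: a fixed Steering System hands out ground-truth reward, and a Learning System that never defines reward itself gets better at predicting it by shrinking its reward prediction error.

```python
# Toy illustration only: all names and numbers are hypothetical assumptions.

# Ground-truth, low-level reward, fixed in advance (the "Steering System").
GROUND_TRUTH_REWARD = {"food": 1.0, "neutral": 0.0, "pain": -1.0}


def steering_system(outcome):
    """Return the ground-truth reward for an outcome."""
    return GROUND_TRUTH_REWARD[outcome]


class LearningSystem:
    """Learns to predict the Steering System's reward; it never sets reward."""

    def __init__(self, learning_rate=0.1):
        self.learning_rate = learning_rate
        self.predicted_reward = {}  # learned estimates, default 0.0 when unseen

    def update(self, outcome, reward):
        """Apply one delta-rule step and return the reward prediction error."""
        prediction = self.predicted_reward.get(outcome, 0.0)
        error = reward - prediction  # reward prediction error
        self.predicted_reward[outcome] = prediction + self.learning_rate * error
        return error


def train(steps=200):
    """Cycle through the outcomes, letting the learner track the reward."""
    learner = LearningSystem()
    outcomes = list(GROUND_TRUTH_REWARD)
    for step in range(steps):
        outcome = outcomes[step % len(outcomes)]
        learner.update(outcome, steering_system(outcome))
    return learner


if __name__ == "__main__":
    learner = train()
    for outcome, estimate in learner.predicted_reward.items():
        print(f"{outcome}: predicted reward {estimate:.3f}")
```

In this sketch the learner's estimates are only ever pulled toward whatever `steering_system` returns, which is the sense in which the value originates outside the model.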