All learning modes are variants of predicting Y from X; they differ only in where they draw the partition between Y and X.
UL/SSL: Y is just a subset of X (often the future of X).
RL: Y is a special sensory function, specifically crafted/optimized by evolution or human engineering.
SL: Y comes from some external model (human brains, or other internal submodules).
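To make the partition idea concrete, here is a minimal Python sketch that builds (X, Y) pairs from the same sensory stream in three ways. The function names, the `reward_fn` and `labeler` callables, and the toy stream are all hypothetical, chosen only for illustration.

```python
from typing import Callable, List, Sequence, Tuple

def ssl_pairs(stream: Sequence[float], context: int = 3) -> List[Tuple[tuple, float]]:
    """UL/SSL: Y is a subset of X, here the observation that follows a context window."""
    return [
        (tuple(stream[i:i + context]), stream[i + context])
        for i in range(len(stream) - context)
    ]

def rl_pairs(stream: Sequence[float],
             reward_fn: Callable[[Sequence[float]], float]) -> List[Tuple[tuple, float]]:
    """RL: Y is a special function of the sensory history (hypothetical reward_fn)."""
    return [
        (tuple(stream[:i + 1]), reward_fn(stream[:i + 1]))
        for i in range(len(stream))
    ]

def sl_pairs(stream: Sequence[float],
             labeler: Callable[[float], str]) -> List[Tuple[float, str]]:
    """SL: Y comes from an external model (hypothetical labeler, e.g. a human)."""
    return [(x, labeler(x)) for x in stream]

# Toy stream, assumed for illustration only.
stream = [1.0, 2.0, 3.0, 4.0, 5.0]
ssl = ssl_pairs(stream, context=3)
rl = rl_pairs(stream, reward_fn=lambda history: 1.0 if history[-1] > 3 else 0.0)
sl = sl_pairs(stream, labeler=lambda x: "big" if x >= 3 else "small")
```

In all three cases the learner sees the same kind of object, a list of (X, Y) pairs; only the source of Y changes.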
I disagree somewhat with the framing in the pic above, in that RL is not inherently more ‘cherry’ than SL. You could have RL setups that provide many bits per sample (atypical, but possible in theory), and SL setups that provide only a few bits for some samples (mostly in a mixed SL/UL scenario).
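One rough way to compare bits per sample: the information a target Y can carry per sample is bounded by its Shannon entropy, whichever learning mode produced it. The sketch below computes that bound for a few made-up target distributions; the distributions are assumptions for illustration, not measurements of any real setup.

```python
import math
from typing import Sequence

def entropy_bits(probs: Sequence[float]) -> float:
    """Shannon entropy in bits of a discrete distribution; zero-probability outcomes add nothing."""
    return sum(-p * math.log2(p) for p in probs if p > 0)

# Hypothetical target distributions (assumed, not taken from any benchmark).
sparse_reward = [0.99, 0.01]         # RL: a binary reward that rarely fires
rich_reward = [1 / 1024] * 1024      # RL: a reward signal with 1024 equally likely values
lopsided_label = [0.99, 0.01]        # SL: a binary label that is almost always the same class

sparse_bits = entropy_bits(sparse_reward)
rich_bits = entropy_bits(rich_reward)
label_bits = entropy_bits(lopsided_label)
```

Under these assumptions the rich reward carries far more bits per sample than the lopsided label, which is the sense in which the number of bits depends on how Y is constructed rather than on the RL or SL name.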
And where does empowerment/self-motivated learning fit in? It’s RL in the sense that Y is a special function of the sensory (and/or action) history, but it can potentially provide dense bit rates like UL.