“Supervised learning” means sensory inputs are presented and paired with indications of the desired associated motor outputs. Learning from a reward signal alone is usually classed as unsupervised: that is reinforcement learning.
http://en.wikipedia.org/wiki/Supervised_learning
The term “supervised learning” isn’t limited to tasks that have motor outputs. If you want to train a system to recognize numbers, and you provide it with 100,000 photographs of handwritten numbers, each labelled with the number it pictures, that’s supervised learning.
The reward signal is like a label. You need an oracle that provides the proper reward signal. Therefore, supervised learning.
You should treat “motor outputs” as a synonym for “actuator signals” in the above comment if it is causing confusion.
Your definition of supervised learning doesn’t seem to be the conventional one. Supervised learning is normally contrasted with reinforcement learning:
“Reinforcement learning differs from the supervised learning problem in that correct input/output pairs are never presented, nor sub-optimal actions explicitly corrected.”
http://en.wikipedia.org/wiki/Reinforcement_learning
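To make the distinction in that quote concrete, here is a minimal sketch of the two kinds of feedback. It is only an illustration under simplifying assumptions, not a definition from either Wikipedia article: the function names are hypothetical, the task is a single guess at a single answer, and the reward is a bare 1.0/0.0 for right/wrong.

```python
def supervised_feedback(correct_output, guess):
    """The supervisor ignores the guess and reveals the correct output itself."""
    return correct_output


def reward_feedback(correct_output, guess):
    """The environment reveals only a scalar reward for the guess that was made."""
    return 1.0 if guess == correct_output else 0.0


def candidates_after_supervision(outputs, correct_output, guess):
    """Outputs still consistent with the feedback after one labelled example."""
    label = supervised_feedback(correct_output, guess)
    return [o for o in outputs if o == label]


def candidates_after_reward(outputs, correct_output, guess):
    """Outputs still consistent with the feedback after one rewarded guess."""
    reward = reward_feedback(correct_output, guess)
    if reward == 1.0:
        return [guess]
    # A zero reward rules out only the guess that was tried.
    return [o for o in outputs if o != guess]


digits = list(range(10))
print(candidates_after_supervision(digits, correct_output=7, guess=3))  # [7]
print(candidates_after_reward(digits, correct_output=7, guess=3))       # nine digits remain
print(candidates_after_reward([0, 1], correct_output=1, guess=0))       # [1]
```

Under these assumptions, a label pins down the answer after one example, while a zero reward on a ten-way task only eliminates the guess that was tried. With just two possible outputs, the reward and the label carry the same information.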
As I tried to explain in the post, a complete system that uses some function to generate its own reward signal is unsupervised. If you don’t know how that reward signal is generated, and are just looking at the learning done with it, you’re looking at a supervised system, which is part of a more-mysterious unsupervised system.
‘Unsupervised’ is sexier, and people are motivated to bend the term to cover whatever they’re working on. But for the purposes of this post, it doesn’t matter one bit which term you use.
This all sounds very strange to me. If there is a supervisor—but all they do is use a carrot and a stick—then I think that would generally be classified as reinforcement learning. Supervised learning is where the learner gets given the correct outputs—or is told the right answers.
http://en.wikipedia.org/wiki/Supervised_learning
http://en.wikipedia.org/wiki/Unsupervised_learning
http://en.wikipedia.org/wiki/Semi-supervised_learning
I’m saying that applying carrot/stick is equivalent to saying yes/no.
I deleted the whole paragraph about supervised/unsupervised, since it contributed nothing and was obviously a distraction.