PhilGoetz comments on Backchaining causes wishful thinking

PhilGoetz 19 May 2010 21:22 UTC
0 points
As I tried to explain in the post, a complete system that uses some function to generate its own reward signal is unsupervised. If you don’t know how that reward signal is generated, and are just looking at the learning done with it, you’re looking at a supervised system, which is part of a more mysterious unsupervised system.

‘Unsupervised’ is sexier, and people are motivated to bend the term to cover whatever they’re working on. But for the purposes of this post, it doesn’t matter one bit which term you use.
- timtyler 19 May 2010 21:58 UTC
  0 points
  This all sounds very strange to me. If there is a supervisor, but all they do is use a carrot and a stick, then I think that would generally be classified as reinforcement learning. Supervised learning is where the learner is given the correct outputs, or is told the right answers.
  
  http://en.wikipedia.org/wiki/Supervised_learning
  
  http://en.wikipedia.org/wiki/Unsupervised_learning
  
  http://en.wikipedia.org/wiki/Semi-supervised_learning
  - PhilGoetz 19 May 2010 22:57 UTC
    0 points
    I’m saying that applying carrot/stick is equivalent to saying yes/no.
    
    I deleted the whole paragraph about supervised/unsupervised, since it contributed nothing and was obviously a distraction.
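
The carrot/stick versus yes/no point in the thread above can be made concrete with a minimal sketch. It assumes a learner with exactly two possible outputs (0 or 1); the names `correct_label`, `reward`, and `label_from_reward` are hypothetical and chosen only for illustration, as is the odd/even target function. Under that assumption, a reward for a guess carries the same information as being told the right answer, because a "no" leaves only one alternative.

```python
import random


def correct_label(x: int) -> int:
    """Hypothetical target function: 1 if x is odd, else 0."""
    return x % 2


def reward(guess: int, x: int) -> int:
    """Carrot (1) if the guess matches the target, stick (0) otherwise."""
    return 1 if guess == correct_label(x) else 0


def label_from_reward(guess: int, r: int) -> int:
    """Recover the correct binary label from a guess and its reward."""
    # A rewarded guess was right; a punished guess means the other label.
    return guess if r == 1 else 1 - guess


rng = random.Random(0)
# Pair each input with an arbitrary binary guess.
samples = [(x, rng.randint(0, 1)) for x in range(20)]
# Reconstruct the labels using only the guesses and the reward signal.
recovered = [label_from_reward(g, reward(g, x)) for x, g in samples]
```

With three or more possible outputs this reconstruction no longer works: a "no" rules out only the guessed answer, so the reward tells the learner less than the correct output would.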