Note that there’s a link to Joshua Tenenbaum’s “How to Grow a Mind,” which EVERYONE SHOULD READ.
It’s an accessibly written framework for how learning happens, from a very interesting cognitive scientist at MIT. And it gives one of the best explanations for Bayesian inference I’ve seen:
Why, given three examples of different kinds of horses, would a child generalize the word “horse” to all and only horses (h1)? Why not h2, “all horses except Clydesdales”; h3, “all animals”; or any other rule consistent with the data? Likelihoods favor the more specific patterns, h1 and h2; it would be a highly suspicious coincidence to draw three random examples that all fall within the smaller sets h1 or h2 if they were actually drawn from the much larger h3 (18). The prior favors h1 and h3, because as more coherent and distinctive categories, they are more likely to be the referents of common words in language (1). Only h1 scores highly on both terms.
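To make the arithmetic behind that passage concrete, here’s a minimal sketch of the “size principle” at work. The extension sizes and prior values below are invented for illustration (they’re not from the paper); the only real ingredients are the two formulas: likelihood P(data | h) = (1/|h|)^n for n independent uniform draws from h’s extension, and posterior ∝ prior × likelihood.

```python
# Sketch of the size-principle argument from Tenenbaum's horse example.
# Hypothesis sizes and priors are made-up illustrative numbers.

hypotheses = {
    "h1: all horses":                 {"size": 100,    "prior": 0.45},
    "h2: all horses but Clydesdales": {"size": 95,     "prior": 0.10},
    "h3: all animals":                {"size": 10_000, "prior": 0.45},
}

n_examples = 3  # the child sees three example horses


def posterior(hyps, n):
    # Size principle: each example is an independent uniform draw from the
    # hypothesis's extension, so P(data | h) = (1 / |h|) ** n.
    scores = {name: h["prior"] * (1.0 / h["size"]) ** n
              for name, h in hyps.items()}
    z = sum(scores.values())  # normalize to get P(h | data)
    return {name: s / z for name, s in scores.items()}


for name, p in posterior(hypotheses, n_examples).items():
    print(f"{name}: {p:.4f}")
```

With these (assumed) numbers, h1 dominates: h2’s slightly sharper likelihood can’t overcome its low prior, and h3’s high prior is swamped by a likelihood that shrinks by a factor of 10,000 per example.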
The paper itself is paywalled; there’s also a depaywalled version, a lecture version, and related papers.