I think LessWrong could use more posts on actual technical topics in machine learning, and this is a nice first step. It would be good if there was a sequence on it.
If you’re a smart Bayesian agent, then, you’ll pick p(theta) to be a conjugate prior
While conjugate priors can be very useful computationally, your data may not be well modeled by the conjugate prior. If you’re using the naive Bayes model then this might not seem like a huge problem, but once you start trying to build hierarchical models using conjugate priors, you have more potential to run into problems.
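To make the "useful computationally" part concrete, here is a minimal sketch of a conjugate update, assuming a Bernoulli likelihood with a Beta prior. The function names and the toy data are my own illustration, not something from the post.

```python
def beta_bernoulli_update(alpha, beta, observations):
    """Return the posterior Beta parameters after seeing 0/1 observations.

    Conjugacy means the posterior is again a Beta distribution, so the
    whole update is just adding counts to the prior parameters.
    """
    successes = sum(observations)
    failures = len(observations) - successes
    return alpha + successes, beta + failures


def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)


# Hypothetical example: uniform Beta(1, 1) prior and six coin flips.
prior = (1.0, 1.0)
data = [1, 0, 1, 1, 0, 1]
posterior = beta_bernoulli_update(*prior, data)
print(posterior, beta_mean(*posterior))
```

The convenience is that no integration is needed. The caveat above still applies: if a single Beta is a poor description of your actual prior beliefs, the easy update does not make the model any more accurate.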
I would love to see an LW sequence on machine learning! I imagine that LW would have a lot of interesting things to say about the philosophical aspects of ML in addition to the practical aspects.
I’m not sure I’d be qualified to contribute much to such a sequence, since I am just an undergrad, but I did have an outline in mind for an intuitive introduction to MLE and EM. If people would find that interesting, I could certainly post it on LW once it was written up!
I’m fairly inexperienced in ML, so all the models I’ve worked with are simple enough that they’ve had conjugate priors. (I think it’s really cool that Dirichlet priors can be used for something as complicated as an HMM, but I guess the HMM is still just a whole bunch of multinomials.) I’m less familiar with hierarchical models. What is an example of a model for which it is difficult to use conjugate priors? The only hierarchical process I’ve heard about is the Dirichlet process, and I was under the impression (based on the name) that it involved Dirichlet priors somewhere; is this incorrect? I have been meaning to read about hierarchical models, so if you know of any good tutorials or papers on them, I would very much appreciate a link!
Cyan’s observation about mixtures of conjugate priors being conjugate kills the example I had in mind. I’ll think for a bit and let you know if I come up with any examples. If I haven’t replied in a couple of weeks, remind me and I’ll make sure to reply.
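For anyone who wants to see why a mixture of conjugate priors stays conjugate, here is a rough sketch for the Beta/Bernoulli case, as I understand the observation. Each component is updated as usual, and its weight is rescaled by how well that component predicted the data (its marginal likelihood). The function names and the example numbers are assumptions of mine.

```python
import math


def log_beta_fn(a, b):
    """Log of the Beta function B(a, b), computed via log-gamma."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def mixture_beta_update(components, successes, failures):
    """Update a mixture-of-Betas prior on a Bernoulli parameter.

    components: list of (weight, alpha, beta) tuples.
    Returns the posterior mixture in the same format.
    """
    log_weights = []
    updated = []
    for w, a, b in components:
        # Marginal likelihood of the observed sequence under this component.
        log_evidence = log_beta_fn(a + successes, b + failures) - log_beta_fn(a, b)
        log_weights.append(math.log(w) + log_evidence)
        updated.append((a + successes, b + failures))
    # Normalize the weights in a numerically stable way.
    m = max(log_weights)
    unnorm = [math.exp(lw - m) for lw in log_weights]
    total = sum(unnorm)
    return [(u / total, a, b) for u, (a, b) in zip(unnorm, updated)]


# Hypothetical bimodal prior: the coin is biased one way or the other.
prior = [(0.5, 2.0, 8.0), (0.5, 8.0, 2.0)]
posterior = mixture_beta_update(prior, successes=7, failures=1)
for weight, a, b in posterior:
    print(round(weight, 4), a, b)
```

The posterior is again a mixture of Betas with the same number of components, which is why a multimodal prior of this form does not break conjugacy.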
Dirichlet processes aren’t inherently hierarchical; they are just self-conjugate, so you can make the output of one the input to another. If you connect them up in a tree structure, you get a hierarchical Dirichlet process.
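Written out, the two statements look roughly like this. It is the standard formulation as I understand it; the symbols (concentration parameters $\alpha$ and $\gamma$, base measure $H$, group index $j$) are my own notation rather than anything from the post.

```latex
% Conjugacy of a single Dirichlet process: the posterior is again a DP.
G \sim \mathrm{DP}(\alpha, H), \quad
\theta_1, \dots, \theta_n \mid G \overset{\text{iid}}{\sim} G
\;\Longrightarrow\;
G \mid \theta_{1:n} \sim \mathrm{DP}\!\left(\alpha + n,\;
  \frac{\alpha H + \sum_{i=1}^{n} \delta_{\theta_i}}{\alpha + n}\right)

% Two-level hierarchy: a draw from one DP is the base measure of the next.
G_0 \sim \mathrm{DP}(\gamma, H), \qquad
G_j \mid G_0 \sim \mathrm{DP}(\alpha, G_0) \quad \text{for each group } j
```

Here $\delta_{\theta_i}$ is a point mass at the observed value $\theta_i$. Deeper trees repeat the second line, with each child using its parent's draw as its base measure.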
You might want to include the link to the Wikipedia table of conjugate priors in your post, and at least a mention of exponential families.
Andrew Gelman wrote a comment on someone else’s paper that might prove to be a useful introduction to hierarchical models.