I propose making an analogy to Split & Merge Expectation Maximization (SMEM) instead.
It’s a very efficient algorithm for modelling that operates as follows:
1) Perform EM to find a local optimum.
2) Examine the clusters to determine which two are most similar, and combine them.
3) Examine the clusters to determine which one is least representative of the data it’s supposed to describe, and break it into two.
4) Perform EM on the three adjusted clusters.
5) Repeat steps 1-4 until the change in likelihood between iterations drops below some epsilon.
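The loop above can be sketched roughly as follows. This is a simplified illustration, not the canonical SMEM algorithm: it uses closeness of cluster means as a proxy for the merge criterion, average log-likelihood of assigned points as a proxy for the split criterion, and re-runs full EM (via scikit-learn's `GaussianMixture`) rather than the partial EM over the three adjusted clusters that step 4 describes. The data, `eps`, and `max_rounds` are illustrative choices.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Toy data: three well-separated 2-D Gaussian blobs.
X = np.vstack([rng.normal(m, 0.5, (100, 2))
               for m in ([0, 0], [4, 0], [0, 4])])

def smem(X, k=3, eps=1e-4, max_rounds=10):
    # 1) Ordinary EM to a local optimum.
    gm = GaussianMixture(n_components=k, random_state=0).fit(X)
    best = gm.score(X)  # mean log-likelihood per point
    for _ in range(max_rounds):
        means = gm.means_
        # 2) Merge candidates: the pair with the closest means
        #    (a crude proxy for posterior similarity).
        d = np.linalg.norm(means[:, None] - means[None, :], axis=-1)
        np.fill_diagonal(d, np.inf)
        i, j = np.unravel_index(np.argmin(d), d.shape)
        # 3) Split candidate: the cluster whose assigned points have the
        #    lowest responsibility-weighted log-likelihood under the model.
        resp = gm.predict_proba(X)
        ll = gm.score_samples(X)
        fit = np.array([(resp[:, c] * ll).sum() / resp[:, c].sum()
                        for c in range(k)])
        fit[[i, j]] = np.inf  # exclude the merge pair from splitting
        s = int(np.argmin(fit))
        # Build new initial means: one merged centre, two perturbed
        # copies of the split centre, everything else unchanged.
        merged = means[[i, j]].mean(axis=0)
        jitter = rng.normal(0, 0.1, means.shape[1])
        keep = [means[c] for c in range(k) if c not in (i, j, s)]
        new_means = np.vstack(keep + [merged,
                                      means[s] + jitter,
                                      means[s] - jitter])
        # 4) EM again, starting from the adjusted configuration.
        cand = GaussianMixture(n_components=k, means_init=new_means,
                               random_state=0).fit(X)
        # 5) Stop once the likelihood gain drops below epsilon.
        if cand.score(X) - best < eps:
            break
        gm, best = cand, cand.score(X)
    return gm

model = smem(X)
```

Note that because each round merges two clusters and splits one, the total number of components stays fixed at `k`; SMEM is rearranging the mixture, not growing or shrinking it.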
I think this is actually quite isomorphic to Goal Factoring (from Geoff Anders / CFAR) in that you’re trying to combine things that are similar and then break up things that are inefficient. At least, I spent an entire summer working on an SMEM clustering program (though some of that was UI) and they feel similar to me.