Thanks for posting this! :D I’m curious to see where you go next.
Whereas, when you’ve chosen one of the two green boxes at random, the curve looks like this:
It seems odd to me that the mode for the left mixture is to the right of 0. I would have put it at 0, and made that mixture twice as tall so the area underneath would still be the same.
Yup, it’s definitely wrong! I was hoping no one would notice. I thought it would be a distraction to explain why the two are different (if that’s not obvious), and also I didn’t want to figure out exactly what the right math was to feed to my plotting package for this case. (Is the correct form of the curve for the p=0 case obvious to you? It wasn’t obvious to me, but this isn’t my area of expertise...)
I thought it would be a distraction to explain why the two are different (if that’s not obvious)
I would have left it unexplained in the post, and then explained it in the comments when the first person asked about it. In my experience, casually remarked semi-obvious true facts like that (“why are these two not equally tall?” “Because the area underneath is what matters”) are useful for convincing people of technical ability.
Is the correct form of the curve for the p=0 case obvious to you? It wasn’t obvious to me, but this isn’t my area of expertise...
I probably would have gone with the point-mass approximation, i.e. a big circle at (0,.5), a line down to (0,0), a line over to (.9,0), and then a line up to a big circle at (.9,.5), then also a line from (.9,0) to (1,0). Using the Gaussian mixtures, though, I’d probably give them the same variance and just give the left one twice the weight of the right one, center them at 0 and .9, and then display only between 0 and 1. Half of the left bump falls below 0 and is cut off, so doubling its weight keeps the visible area of the two bumps about equal. Using the pure functional form, that would look something like 2*exp(-x^2/v) + exp(-(x-.9)^2/v).
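To make the Gaussian-mixture version concrete, here is a rough sketch of that functional form in Python. The width v = 0.002, the function names, and the midpoint-rule integrator are my own illustrative choices, not anything from the post or its plotting code. It just checks that, once you clip to [0, 1], the doubled left bump covers about the same area as the right one.

```python
import math

V = 0.002  # assumed width parameter; any small value behaves similarly


def left_bump(x, v=V):
    """Left component: centered at 0, with twice the weight."""
    return 2.0 * math.exp(-x ** 2 / v)


def right_bump(x, v=V):
    """Right component: centered at 0.9, with unit weight."""
    return math.exp(-(x - 0.9) ** 2 / v)


def mixture_curve(x, v=V):
    """Unnormalized curve 2*exp(-x^2/v) + exp(-(x-.9)^2/v)."""
    return left_bump(x, v) + right_bump(x, v)


def area(f, lo=0.0, hi=1.0, n=10_000):
    """Midpoint-rule estimate of the integral of f over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))


# Only the part of each bump inside [0, 1] is displayed, so compare those areas.
left_area = area(left_bump)
right_area = area(right_bump)
area_ratio = left_area / right_area
```

The left peak is twice as tall as the right one, while `area_ratio` comes out close to 1. It is not exactly 1, because a sliver of the right bump also spills past x = 1.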
Now, this is assuming we have some sort of Gaussian prior. We could also have a beta prior, which is conjugate to the binomial distribution, which is nice because that fits our testbed. Gaussian might be appropriate because we’ve actually opened the system up and we think the measurement system it uses has Gaussian noise.
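For anyone who hasn’t seen why conjugacy is convenient, a minimal sketch: with a Beta(alpha, beta) prior on the payout probability and binomial data, the posterior is again a beta distribution with the counts added to the parameters. The uniform Beta(1, 1) prior and the 9-wins-in-10-tries data below are made-up numbers for illustration only, and the helper names are hypothetical.

```python
def beta_update(alpha, beta, successes, failures):
    """Conjugate update: Beta(alpha, beta) prior + binomial data -> Beta posterior."""
    return alpha + successes, beta + failures


def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)


# Assumed example: uniform prior, then 9 payouts and 1 non-payout observed.
prior = (1.0, 1.0)
posterior = beta_update(*prior, successes=9, failures=1)
posterior_mean = beta_mean(*posterior)
```

No integration or curve fitting is needed, which is the sense in which the beta prior “fits the testbed” of repeated pay/no-pay trials.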
I’m not sure I agree with the claim that the variance is the same; you could probably assert that the chance the left one will pay out is 0 to arbitrarily high precision, and it seems likely the variance would depend on the number of plugs filled. That said, this doesn’t have much impact, and saying “we’ll approximate away the meta-meta-probability to simplify this example” seems like it goes against your general point, and is thus inadvisable.