Consider the great anti-Bayesian Cosma Shalizi. He’s shown that the use of a prior is really equivalent to a method of smoothing, of regularization on your hypothesis space, trading off (frequentist) bias and variance.
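A textbook instance of this equivalence, offered as a sketch and not as Shalizi's own construction: for a one-parameter linear model with Gaussian noise, the ridge estimate with penalty lambda = sigma^2 / tau^2 coincides with the posterior mode under a zero-mean Gaussian prior of standard deviation tau. The data, the seed, and the names `ridge_slope`, `log_posterior` and `map_slope_by_search` below are all made up for illustration.

```python
import random

def ridge_slope(xs, ys, lam):
    """Closed-form minimiser of sum((y - w*x)^2) + lam * w^2 over the slope w."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + lam)

def log_posterior(w, xs, ys, sigma, tau):
    """Unnormalised log posterior: Gaussian noise (sd sigma), prior w ~ N(0, tau^2)."""
    log_lik = -sum((y - w * x) ** 2 for x, y in zip(xs, ys)) / (2 * sigma ** 2)
    log_prior = -w ** 2 / (2 * tau ** 2)
    return log_lik + log_prior

def map_slope_by_search(xs, ys, sigma, tau, lo=-10.0, hi=10.0, iters=200):
    """Locate the posterior mode by ternary search (the log posterior is concave in w)."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if log_posterior(m1, xs, ys, sigma, tau) < log_posterior(m2, xs, ys, sigma, tau):
            lo = m1
        else:
            hi = m2
    return (lo + hi) / 2

# Hypothetical toy data: true slope 2.0, noise sd 0.5.
rng = random.Random(0)
xs = [rng.uniform(-1, 1) for _ in range(20)]
ys = [2.0 * x + rng.gauss(0, 0.5) for x in xs]

sigma, tau = 0.5, 1.0
w_ridge = ridge_slope(xs, ys, lam=sigma ** 2 / tau ** 2)
w_map = map_slope_by_search(xs, ys, sigma, tau)
print("ridge:", w_ridge, "MAP:", w_map)

# A heavier penalty (equivalently, a tighter prior) shrinks the estimate further.
w_heavy = ridge_slope(xs, ys, lam=5.0)
print("ridge with lam=5.0:", w_heavy)
```

The two printed estimates agree up to the tolerance of the search, while changing the penalty changes the answer on the same data, which is the sense in which picking a regularizer and picking a prior are the same choice here.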
It seems odd to interpret this point as anti-Bayesian. To me it seems pro-Bayesian: it means that whenever you use a regularizer you’re actually doing Bayesian inference. Any method that depends on a regularizer is open to the same critique of subjectivity to which Bayesian methods are vulnerable. Two frequentists using different regularizers will come to different conclusions based on the same evidence, and the choice of a regularizer is hardly inevitable or dictated by the problem.
If you have a link to a paper that contains anti-Bayesian arguments by Shalizi, I would be interested in reading it.
Well, it seems odd to me too. He has another rant up comparing Bayesian updating to evolution, saying “okay, that’s why Bayesian updating seems to actually work OK in many cases”, whereas I see that as explaining why evolution works...
http://cscs.umich.edu/~crshalizi/weblog/cat_bayes.html
Is a good start. He also has a paper on the arXiv that is flat-out wrong, so ignore “The Backwards Arrow of Time of the Coherently Bayesian Statistical Mechanic”, though showing how it goes wrong takes a fair bit of explaining of fairly subtle points.
I’ve tried reading it before—for me to understand just the paper itself would also take a fair bit of explaining of fairly subtle points! I understand Shalizi’s sketch of his argument in words:
Observe your system at time 0, and invoke your favorite way of going from an observation to a distribution over the system’s states—say the maximum entropy principle. This distribution will have some Shannon entropy, which by hypothesis is also the system’s thermodynamic entropy. Assume the system’s dynamics are invertible, so that the state at time t determines the states at times t+1 and t-1. This will be the case if the system obeys the usual laws of classical mechanics, for example. Now let your system evolve forward in time for one time-step. It’s a basic fact about invertible dynamics that they leave Shannon entropy invariant, so it’s still got whatever entropy it had when you started. Now make a new observation. If you update your probability distribution using Bayes’s rule, a basic result in information theory shows that the Shannon entropy of the posterior distribution is, on average, no more than that of the prior distribution. There’s no way an observation can make you more uncertain about the state on average, though particular observations may be very ambiguous. (Noise-free measurements would let us drop the “on average” qualifier.) Repeating this, we see that entropy decreases over time (on average).
My problem is that I know a plausible-looking argument expressed in words can still quite easily be utterly wrong in some subtle way, so I don’t know how much credence to give Shalizi’s argument.
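The two information-theoretic facts the sketch leans on can at least be checked numerically on a toy system. The following is a minimal illustration under made-up assumptions: four discrete states, a permutation standing in for invertible dynamics, and a noisy binary sensor. None of these specifics come from Shalizi's paper. It only confirms the individual steps; it says nothing about whether equating this entropy with thermodynamic entropy is legitimate.

```python
import math

def entropy(p):
    """Shannon entropy in bits of a discrete distribution."""
    return -sum(q * math.log2(q) for q in p if q > 0)

def evolve(p, perm):
    """Invertible dynamics on a finite state space: state s moves to perm[s]."""
    out = [0.0] * len(p)
    for s, q in enumerate(p):
        out[perm[s]] = q
    return out

def bayes_update(p, lik_o):
    """Bayes' rule for one observation; lik_o[s] = P(observation | state s)."""
    joint = [q * l for q, l in zip(p, lik_o)]
    evidence = sum(joint)  # P(observation) under the current distribution
    return [j / evidence for j in joint], evidence

def expected_posterior_entropy(p, lik):
    """Posterior entropy averaged over observations; lik[o][s] = P(o | s)."""
    total = 0.0
    for lik_o in lik:
        post, evidence = bayes_update(p, lik_o)
        total += evidence * entropy(post)
    return total

# Hypothetical setup: 4 states, a permutation as the dynamics, and a sensor
# that reports "state is in {0, 1}" versus "state is in {2, 3}" with 80% accuracy.
prior = [0.4, 0.3, 0.2, 0.1]
perm = [2, 0, 3, 1]
lik = [
    [0.8, 0.8, 0.2, 0.2],
    [0.2, 0.2, 0.8, 0.8],
]

h_start = entropy(prior)
evolved = evolve(prior, perm)
h_evolved = entropy(evolved)
h_after_obs = expected_posterior_entropy(evolved, lik)

print("entropy at time 0:          ", h_start)
print("entropy after one time-step:", h_evolved)
print("expected entropy after obs: ", h_after_obs)
```

The permutation only relabels which state carries which probability, so the first two numbers match, and the third is no larger than the second, in line with the result that conditioning cannot increase entropy on average.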
The problem with the quoted argument from Shalizi is that it is describing a decrease in entropy over time of an open system. To track a closed system, you have to include the brain that is making observations and updating its beliefs. Making the observations requires thermodynamic work that can transfer entropy.
D’oh! Why didn’t I think of that?!
If you write such a post, I’ll almost certainly upvote it.
Just Google him—his website is full of tons of interesting stuff.