I was reflecting on this, and considering how statistics might look to a pure mathematician:
“Probability distribution, I know. Real number, I know. But what is this ‘rolling a die’/‘sampling’ that you are speaking about?”
Reflecting some more here (I hope this schizophrenic little monologue doesn’t bother anyone), I notice that none of this would trouble a pure computer scientist / reductionist:
“Probability? Yeah, well, I’ve got pseudo-random number generators. Are they ‘random’? No, of course not; there’s a seed that maintains the state. They’re just really hard to predict if you don’t know the seed, but if there aren’t too many bits in the seed, you can crack them. That’s happened to casino slot machines before; now they have more bits.”
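[A toy sketch of that seed-cracking point, since it’s easy to show: a hypothetical 16-bit linear congruential generator of my own invention, not any real slot-machine RNG. With so few bits of state, brute force over every possible seed is instant.]

```python
# Toy illustration (not any real casino RNG): a linear congruential
# generator whose entire state is a 16-bit seed. Given a short run of
# outputs, exhaustive search over all 2**16 seeds recovers the state,
# and with it every future "random" number.

M, A, C = 2**16, 75, 74  # small modulus means a small state space (toy constants)

def lcg(seed, n):
    """Return n successive outputs of the toy LCG starting from seed."""
    out = []
    for _ in range(n):
        seed = (A * seed + C) % M
        out.append(seed)
    return out

def crack(observed):
    """Recover the seed by brute force over the 16-bit state space."""
    return [s for s in range(M) if lcg(s, len(observed)) == observed]

secret_seed = 31337
observed = lcg(secret_seed, 4)   # the attacker sees a few outputs...
print(crack(observed))           # ...and prints [31337]; the stream is now predictable
```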
“Philosophy of statistics? Well, I’ve got two software packages here: one of them fits a penalized regression and tunes the penalty parameter by cross-validation. The other one runs an MCMC. They both give pretty similarly useful answers most of the time [on some particular problem]. You can’t set the penalty on the first one to 0, though, unless n >> log(p), and I’ve got a pretty large number of parameters. The regression code is faster [on some problem], but the MCMC lets me answer more subtle questions about the posterior.
“Have you seen the Church language or Infer.NET? They’re pretty expressive, although the MCMC algorithms need some tuning.”
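[A minimal sketch of the first package’s half of that comparison, assuming scikit-learn; the synthetic data and sizes are my illustration, not his. With p > n, ordinary least squares (penalty set to 0) has no unique solution, but a cross-validated lasso penalty recovers the sparse signal. The MCMC half would be a few more lines in something like PyMC or Stan; the point is that both are concrete programs, not philosophical positions.]

```python
# Sketch of the penalized-regression package: p > n, sparse truth,
# penalty tuned by 5-fold cross-validation. Data and sizes are
# illustrative assumptions, not from the original discussion.
import numpy as np
from sklearn.linear_model import LassoCV

rng = np.random.default_rng(0)
n, p, k = 100, 500, 5                     # n << p, only k true nonzero coefficients
X = rng.standard_normal((n, p))
beta = np.zeros(p)
beta[:k] = 3.0
y = X @ beta + rng.standard_normal(n)

fit = LassoCV(cv=5).fit(X, y)             # the cross-validated penalty choice
print("chosen penalty:", fit.alpha_)
print("nonzero coefficients found:", int(np.sum(fit.coef_ != 0)))
```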
Ah, but what does it mean when you run those algorithms?
“Mean? Eh? They just work. There are some probability bounds in the machine learning community, but usually they’re not tight enough to use.”
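[To make “not tight enough” concrete, here is my own illustration, not his: a standard Hoeffding-style PAC bound for a finite hypothesis class, evaluated at realistic sample sizes.]

```python
# A typical PAC/Hoeffding-style bound: with probability 1 - delta,
#   true error <= empirical error + sqrt((ln|H| + ln(2/delta)) / (2n)).
# Even for a hypothesis class as modest as a 30-parameter linear model
# quantized to 8 bits per weight, the slack swamps realistic accuracy
# differences. (My illustration of the "not tight enough" complaint.)
import math

def hoeffding_slack(n, log_H, delta=0.05):
    """Additive slack in the uniform-convergence bound at confidence 1 - delta."""
    return math.sqrt((log_H + math.log(2 / delta)) / (2 * n))

log_H = 30 * 8 * math.log(2)   # ln|H| for 30 weights at 8 bits each
for n in (1_000, 10_000, 100_000):
    print(n, round(hoeffding_slack(n, log_H), 3))
# Prints slack of about 0.29 at n = 1,000 and still about 0.03 at
# n = 100,000: vacuous or nearly so for most applications.
```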
[He had me until that last bit, but I can’t fault his reasoning. Probably Savage or de Finetti could make him squirm, but who needs philosophy when you’re getting things done?]
Well, among others, someone who wonders whether the things I’m doing are the right things to do.
Fair point. Thanks, that hyperbole was ill-advised.