Ah, good point. It’s like the prior, considered as a regularizer, is too “soft” to encode the constraint we want.
A Bayesian could respond that we rarely actually want sparse solutions (in what situation is a physical parameter identically zero?) but rather solutions that have many near-zeroes with high probability. The posterior would satisfy this, I think. In this sense a Bayesian could justify the Laplace prior as approximating a so-called "spike-and-slab" prior (which I believe leads to combinatorial intractability similar to the full L0 problem).
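For reference, the usual spike-and-slab form (notation mine, not from the original comment) puts, independently on each coefficient,

\pi(\beta_j) = \pi_0 \, \delta_0(\beta_j) + (1 - \pi_0) \, \mathcal{N}(\beta_j \mid 0, \tau^2),

where \pi_0 is the prior probability that \beta_j is exactly zero: a point mass ("spike") at zero mixed with a diffuse "slab". Exact posterior computation has to weigh all 2^p spike/slab configurations, which is the combinatorial blow-up alluded to above.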
Also, without an L0 penalty the frequentist doesn't get fully sparse solutions either. The shrinkage is gradual; sometimes there are many tiny coefficients along the regularization path.
[FWIW I like the logical view of probability, but don’t hold a strong Bayesian position. What seems most important to me is getting the semantics of both Bayesian (= conditional on the data) and frequentist (= unconditional, and dealing with the unknowns in some potentially nonprobabilistic way) statements right. Maybe there’d be less confusion—and more use of Bayes in science—if “inference” were reserved for the former and “estimation” for the latter.]
See this comment. You actually do get sparse solutions in the scenario I proposed.
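To make that concrete, here is a minimal sketch (my own synthetic data and alpha value, purely illustrative) showing that an L1 (lasso) fit sets many coefficients exactly to zero, not just near zero:

# Minimal check that an L1 (lasso) fit yields exactly-zero coefficients.
# The synthetic data and regularization strength are illustrative choices.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
n, p, k = 100, 50, 5                  # samples, features, true nonzeros
X = rng.normal(size=(n, p))
beta_true = np.zeros(p)
beta_true[:k] = rng.normal(size=k)    # sparse ground truth
y = X @ beta_true + 0.1 * rng.normal(size=n)

fit = Lasso(alpha=0.1).fit(X, y)
print("coefficients exactly zero:", int(np.sum(fit.coef_ == 0.0)), "of", p)

The soft-thresholding step in the coordinate-descent solver is what produces exact zeros, which is the point being made here.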
Cool; I take that back. Sorry for not reading closely enough.