I too would like to see a good explanation of frequentist techniques, especially one that also explains their relationships (if any) to Bayesian techniques.
Based on the tiny bit I know of both approaches, I think one appealing feature of frequentist techniques (which may or may not make up for their drawbacks) is that your initial assumptions are easier to dislodge the more wrong they are.
It seems to be the other way around with Bayesian techniques because of a stronger built-in assumption that your assumptions are justified. You can immunize yourself against any particular evidence by having a sufficiently wrong prior.
But you won’t be able to convince other Bayesians who don’t share that radically wrong prior. Similarly, there doesn’t seem to be anything intrinsic to frequentism that keeps you from being persistently wrong. Rather, frequentists are kept in line because, as Cyan said, they have to persuade each other. Fortunately, for Bayesians and frequentists alike, a technique’s being persuasive to the community correlates with its being liable to produce less wrong answers.
The ability to get a bad result because of a sufficiently wrong prior is not a flaw in Bayesian statistics; it is a flaw in our ability to perform Bayesian statistics. Humans tend to be overconfident when assigning probabilities very close to 0 or 1. As such, the proper way to formulate a prior is to imagine hypothetical results that would bring the probability into a manageable range, ask yourself what you would want your posterior to be in such cases, and build your prior from that. These hypothetical results must be constructed and analyzed before the actual result is obtained, to eliminate bias.

As Tyrrell said, the ability of a wrong prior to result in a bad conclusion is a strength, because other Bayesians will be able to see where you went wrong by disputing the prior.
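As a rough illustration of that elicitation procedure (a minimal sketch with made-up numbers, not anything from the thread): pick a hypothetical result E, assign it likelihoods under H and ~H, decide what posterior you would want if E actually occurred, and invert Bayes’ rule to see what prior that commits you to.

```python
# Sketch of the prior-elicitation procedure described above; all numbers
# and names here are hypothetical, chosen only for illustration.

def implied_prior(desired_posterior, p_e_given_h, p_e_given_not_h):
    """Invert Bayes' rule: the prior P(H) that would yield the desired
    posterior P(H|E) for a hypothetical result E with these likelihoods."""
    likelihood_ratio = p_e_given_h / p_e_given_not_h
    posterior_odds = desired_posterior / (1 - desired_posterior)
    prior_odds = posterior_odds / likelihood_ratio
    return prior_odds / (1 + prior_odds)

# Hypothetical result E is 20x more likely under H than under ~H.
# If I saw E, I would want to end up around 50/50, so my prior should be:
print(implied_prior(0.5, p_e_given_h=0.20, p_e_given_not_h=0.01))  # ~0.048
```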
Someone correct me if I’m wrong here, but I don’t think even having a strong prior P(H) against the evidence is much help, because that makes your likelihood ratio on the evidence P(E|H)/P(E|~H) that much stronger.
(This issue is one of my stumbling blocks in Bayescraft.)
The likelihood ratio P(E|H)/P(E|~H) is entirely independent of the prior P(H).
Or did I misunderstand what you said?
In theory, yes, but we’re talking about a purported “unswayable Bayesian”. If someone strongly believes leprechauns don’t exist (low P(H), where H is “leprechauns exist”), they should strongly expect not to see evidence of leprechauns (low P(E|~H), where E is direct evidence of leprechauns, like finding one in the forest), which suggests a high likelihood ratio P(E|H)/P(E|~H).
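To put rough numbers on that intuition (invented for illustration, not from the thread): give H a very small prior and make P(E|~H) tiny as well, and a single observation of E produces a large update.

```python
# Hypothetical numbers for the leprechaun example above: an extreme prior
# against H combined with a tiny P(E|~H) yields a huge likelihood ratio,
# so actually observing E forces a large update.

p_h = 1e-6              # prior on H = "leprechauns exist"
p_e_given_h = 0.5       # if they exist, finding one in the forest is plausible
p_e_given_not_h = 1e-6  # if they don't, such direct evidence is nearly impossible

likelihood_ratio = p_e_given_h / p_e_given_not_h          # 500,000
posterior = (p_e_given_h * p_h) / (
    p_e_given_h * p_h + p_e_given_not_h * (1 - p_h))      # ~0.33 after one E
print(likelihood_ratio, posterior)
```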
I remember Eliezer Yudkowsky referring to typical conversations that go like:
Non-rationalist: “I don’t think there will ever be an artificial general intelligence, because my religion says that can’t happen.”
EY: “So if I showed you one, that means you’d leave your religion?”
He did mention pulling that off once, but I don’t believe he said it was typical.
Thanks, that was what I had in mind.
I’m not entirely sure I understand your point. The example you’re citing is more the guy saying “I believe X, and X implies ~Y, therefore ~Y”, so Eliezer is saying “So Y implies ~X then?”
But the “X implies ~Y” belief can happen when one has low belief in X or high belief in X.
Or are you saying “the likelihoods assigned led to the past interpretation of analogous (lack of) evidence, and that’s why the current prior is what it is?”
komponisto nailed the intuition I was going from: the likelihood ratio is independent of the prior, but an unswayable Bayesian fixes P(E), forcing extreme priors to have extreme likelihood ratios.
*blinks* I think I’m extra confused. The law of conservation of probability is basically just saying that the change in belief may be large or small, so evidence may be strong or weak in that sense. But that doesn’t leave the likelihoods up for grabs (well, okay, P(E|~H) could depend on how you distribute your belief over the space of hypotheses other than H, but… I’m not sure that was your point).
Okay, point conceded … that still doesn’t generate a result that matches the intuition I had. I need to spend more time on this to figure out what assumptions I’m relying on to claim that “extremely wrong beliefs force quick updates”.
Remember, though, that even fixing both P(E) and P(H), you can still make the ratio P(E|H)/P(E|~H) anything you want; the equation

P(E) = P(E|H) P(H) + P(E|~H) P(~H),

which becomes

a = bx + (1-b)(cx)

once you write a = P(E), b = P(H), x = P(E|H), and cx = P(E|~H), is guaranteed to have a solution for any a, b, c.
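A quick numerical check of that algebra (the numbers below are arbitrary): hold P(E) and P(H) fixed, pick several very different likelihood ratios, and solve for P(E|H) each time.

```python
# With P(E) = a and P(H) = b fixed, write P(E|H) = x and P(E|~H) = c*x
# (so the likelihood ratio is 1/c) and solve a = b*x + (1-b)*(c*x) for x.
# Arbitrary illustrative numbers; each choice of c still reproduces P(E).

a, b = 0.01, 0.01                 # fixed P(E) and P(H)
for c in (0.001, 1.0, 1000.0):    # c = P(E|~H)/P(E|H), the inverse likelihood ratio
    x = a / (b + (1 - b) * c)     # solution for P(E|H)
    p_e = b * x + (1 - b) * (c * x)
    print(f"likelihood ratio {1/c:>8}: P(E|H)={x:.6f}, P(E|~H)={c*x:.6f}, P(E)={p_e:.6f}")
```

For these particular numbers every solved quantity stays in [0, 1], so the ratio really is unconstrained by P(E) and P(H) alone.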
The quantities P(H), P(E|H), and P(E|~H) are in general independent of each other, in the sense that you can move any one of them without changing the others—provided you adjust P(E) accordingly.
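The same relation read the other way (again with arbitrary numbers): move P(E|~H) while holding P(H) and P(E|H) fixed, and only P(E) shifts to compensate.

```python
# Total probability: P(E) = P(E|H) P(H) + P(E|~H) P(~H).
# Holding P(H) and P(E|H) fixed and varying P(E|~H) only changes P(E).

p_h, p_e_given_h = 0.2, 0.7
for p_e_given_not_h in (0.01, 0.3, 0.9):
    p_e = p_e_given_h * p_h + p_e_given_not_h * (1 - p_h)
    print(p_e_given_not_h, round(p_e, 3))
```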
Thanks, that helps. See how I apply that point in my reply to Psy-Kosh here.