I am having trouble parsing this section:
However, the probability for the well-known observation was already at 100%. How can a previously-known statement provide new support for the hypothesis, as if we are re-updating on evidence we’ve already updated on?
If it is a new theory, why is the evidence against which it is tested considered old? Further, how would this be different from using the theory to predict the precession of Mercury if you were to test it deliberately? Intuitively, this feels to me like privileging where the evidence lies in the time domain over anything else about it.
I agree that those are reasons to not treat old evidence differently.
In terms of the problem of old evidence as usually presented (to the best of my understanding), the idea is: if you already know a thing, how can you update on it? This can be formalized further, as an objection to Bayesianism (though not a very good one, I think).
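As a minimal sketch of that formalization (nothing beyond standard conditionalization): if E is already known, so that P(E) = 1, then conditioning on E cannot move the probability of H at all:

$$P(H \mid E) = \frac{P(H \wedge E)}{P(E)} = P(H \wedge E) = P(H),$$

where the last step uses P(H ∧ ¬E) ≤ P(¬E) = 0. Formally, then, old evidence confirms nothing, and that is the puzzle.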
At one point I had a link to further explanation of this in SEP, but it looks like I accidentally removed all links from my post at some point during editing.
Thank you for the link; reading it only increased my confusion. Forgive me if the question seems stupid, but doesn't this prevent us from ever contradicting a theory?

Suppose we have an evidence statement E, known for some time, and a theory statement H, considered for some time. If we then discover that H implies ¬E, how does this not run into the same problem of old evidence? I would expect confidence in H to be reduced, both in Bayesian confirmation theory and in scientific practice, but that seems to be the same operation as conditionalizing on E.

Come to think of it, how can we assert that our current confidence in H is correct if we discover anything new about its relationship to E? My gut rebels at the idea; intuitively it seems like we should conclude that our previous update of H was an error, undo the previous conditionalization on E, and re-conditionalize on what we now know to be correct.
In Bayesian confirmation theory, you have to have already considered all the implications of a hypothesis. You can't be entertaining a hypothesis H without knowing H → ¬E from the beginning. Discovering implications of a theory means you have logical uncertainty. Our best theory of logical uncertainty at the moment seems to be logical induction, and it behaves somewhat counterintuitively: noticing the implication H → ¬E would indeed disprove H, but if H merely confers high probability on ¬E, noticing this doesn't necessarily drive belief in H down. This is actually an important feature, because if it always counted against H, you could always drive belief in H down by biasing the order in which you prove things.
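To spell out the strict-implication case (a minimal sketch): once H → ¬E is proved, any coherent credence must satisfy

$$P(H) \le P(\neg E) = 1 - P(E),$$

so with E already known (P(E) = 1) this forces P(H) = 0, which is the sense in which noticing the implication disproves H. The "merely confers high probability" case is the subtler one, and it is exactly where logical induction departs from this naive picture.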
Yeah, if I didn't know about logical induction (LI), I would agree with that intuition: "Bayes isn't wrong; you're specifying a wrong way for a being with finite computational resources to approximate Bayes! You re-do the calculations when you notice new hypotheses or new implications of hypotheses! Your old probability estimate was just poor; you don't have to explain how your re-calculation changed things within the Bayesian framework, so there's no problem of old evidence."
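A toy illustration of that "re-do the calculations" move (a hypothetical sketch with made-up numbers, not anything from the post or from LI): the bounded agent stores its priors and likelihoods explicitly and simply recomputes the posterior from scratch whenever it notices a new hypothesis or a new implication, e.g. by zeroing out a likelihood once it learns H → ¬E.

```python
# Toy "recompute from scratch" Bayesian: not a scheme anyone endorses,
# just an illustration of the defense quoted above.

def posterior(priors, likelihoods):
    """Compute P(H | E) for each hypothesis H from P(H) and P(E | H)."""
    joint = {h: priors[h] * likelihoods[h] for h in priors}
    total = sum(joint.values())
    return {h: p / total for h, p in joint.items()}

# First pass: two hypotheses, evidence E already observed.
priors = {"H1": 0.5, "H2": 0.5}
likelihoods = {"H1": 0.9, "H2": 0.3}   # P(E | H)
print(posterior(priors, likelihoods))

# Later we notice a third hypothesis H3, and also that H1 implies not-E,
# i.e. P(E | H1) should have been 0 all along. The defense is simply to
# redo the whole calculation with corrected inputs; the old numbers are
# discarded as a poor approximation rather than updated within the framework.
priors = {"H1": 1/3, "H2": 1/3, "H3": 1/3}
likelihoods = {"H1": 0.0, "H2": 0.3, "H3": 0.7}
print(posterior(priors, likelihoods))
```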
However, given LI, the picture is more complicated. We can now say a lot more about what it means for an agent with bounded computational resources to reason approximately about a computationally intractable structure, and it does seem like there’s a problem of old evidence.
I keep getting hung up on the old/new terminology, and you have provided enough resolution that I suspect my problem lies beneath the level we're discussing. So while I go do some review, I have a tangentially related question:

Are you familiar with the work of Glenn Shafer and Vladimir Vovk on a game-theoretic treatment of probability? I mention it because of the prediction-market comment about LI and the post about logical Dutch books; their core mechanisms appear to have similar intuitions at work, so I thought it might be of interest.
Here’s an example where you OBVIOUSLY don’t want to award points for old evidence: every time the stock market goes up or down, your friend says “I saw that coming”. When you ask how, they give a semi-plausible story of how recent news made them suspect the stock market would move in that direction.
I've heard of Shafer & Vovk's work! Haven't looked into it yet, but Sam Eisenstat was reading it.
That much makes intuitive sense to me; I might go so far as to say that when we cherry-pick, we are deliberately trolling ourselves with old evidence. I keep expecting that many of these problems are resolved by considering the details of how we, the agent, actually carry out the procedure. For example, say you have a Bayesian-confirmation-theoretic treatment of a hypothesis and then you learn about LI: does re-interpreting the evidence with LI still count as the problem of old evidence? Do we have a formal account of how to transition from one interpretation to the other, something like a gauge theory of decisions (I expect not)?

I wrote a partial review of Shafer & Vovk's book on the subject here. I am still reading it, and since the book was published in 2001 it doesn't reflect the current state of scholarship, but if you'll take a lay opinion, I recommend it.
Maybe Shafer & Vovk would like to hear about logical induction.