One of these does log(prob / (1 - prob)), the other does log(prob) …
I get your point about orders of magnitude difference, but for me this ends up more confusing than anything.
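For concreteness, a minimal sketch of the two quantities (Python is my choice here, base-10 logs assumed), showing how they pull apart as prob approaches 1:

```python
import math

# Log-odds (logit) versus plain log-probability for a few values of p.
# Near p = 1, log(p) flattens toward 0 while log(p / (1 - p)) keeps
# growing, which is the "orders of magnitude" point.
for p in (0.5, 0.9, 0.99, 0.999):
    log_odds = math.log10(p / (1 - p))
    log_prob = math.log10(p)
    print(f"p={p}: log-odds={log_odds:+.3f}, log-prob={log_prob:+.4f}")
```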
Ordered Probabilistic Reasoning in Intelligent Systems. Looking forward to reading it.
I wrote this out for myself in an attempt to fully grasp this, and maybe someone else will find it useful:
You have two theories, A and B. A is more complex than B, but makes sharper/more precise predictions for its observables. I.e., given a test whose outcome is either +ve or -ve (true or false), we require that P(+ | A) > P(+ | B).
Say that P(+ | A) : P(+ | B) = 10 : 1, a favorable likelihood ratio.
Then each successful +ve test gives 10 : 1 odds for theory A over theory B. You can penalize A initially for its algorithmic complexity and assign it prior odds of 1 : 10^5; i.e., you think it is borderline absurd.
But if you get 5 consecutive +ve tests, then your posterior odds become 1 : 1, meaning your initial odds estimate was grossly wrong. In fact, given 5 more consecutive +ve tests, it is theory B which should at that point be considered absurd.
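A minimal sketch of that bookkeeping in log-odds (Python, numbers taken from the example above; each +ve test just adds log10(10) = 1 to the running total):

```python
import math

# Prior odds 1 : 10^5 against theory A, likelihood ratio 10 : 1 per +ve test.
prior_log_odds = -5.0        # log10 of odds(A : B) = 1 : 100000
log_lr = math.log10(10.0)    # evidence per +ve test, exactly 1 decade

log_odds = prior_log_odds
for test in range(1, 11):
    log_odds += log_lr
    print(f"after {test:2d} consecutive +ve tests: log10 odds(A : B) = {log_odds:+.0f}")
```

After 5 tests the running log-odds hits 0 (even odds); after 10 it sits at +5, i.e. 10^5 : 1 in favor of A.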
Of course, in real problems the favorable likelihood ratio could be as low as 1.1 : 1, and your prior odds are not as ridiculous; maybe 1 : 100 against. Then you'd need about 50 updates before you get posterior odds of about 1 : 1, at which point you should seriously question the validity of your prior odds. After another 50 updates, you're essentially fully convinced that the new contender is much better than the original theory.
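And the same arithmetic for the weak-evidence case (a sketch; rounding up to whole tests is my choice):

```python
import math

# Likelihood ratio 1.1 : 1 per +ve test, prior odds 1 : 100 against.
log_lr = math.log10(1.1)           # ~0.0414 decades of evidence per test
prior_log_odds = -2.0              # 1 : 100 against the new theory

# +ve tests needed to reach even odds (log-odds = 0), then 100 : 1 in favor:
to_even = math.ceil(-prior_log_odds / log_lr)
to_convinced = math.ceil((2.0 - prior_log_odds) / log_lr)
print(to_even, to_convinced)       # 49 and 97: roughly 50, then another ~50
```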