Firstly, I’d love to see the counterexample to the two distributions being the same.
Secondly, are you sure that \mu being nowhere zero is essential? Intuitively, your uniqueness result should work whenever, for every two models M1 and M2, there is a sentence \phi separating them with \mu(\phi) nonzero. But I haven’t checked this formally.
At the very least, my conjecture is not true if \mu is not nowhere zero, which was enough for me to ignore that case: what I actually believe (see my response to cousin_it) is that there are three very different definitions that all give the same distribution, which I think makes the distribution stand out a lot more as a good idea. Also, if \mu is sometimes zero, we lose uniqueness, because we don’t know what to do with sets that our Bayes score does not care about. The fact that we can do whatever we want with these sets also takes away coherence. We could perhaps artificially require coherence, but I really don’t want to do that: the whole reason I like this approach so much is that it didn’t require coherence, and coherence came out for free.
Well, for example, if Si only has one set A, Abram will think we are in that set with probability 1, while I will think we are in that set with probability 1/2. Now, you could require that every sentence has the same \mu as its negation (corresponding to putting the sentence or its negation in with probability 1/2 in Abram’s procedure). In that case, partition X into three sets A, B, and C, where the "in A or not in A" question is given weight \mu_A, and similarly define \mu_B and \mu_C.
Let \mu_A = 1/2, \mu_B = 1/4, and \mu_C = 1/4.
Abram’s procedure will:

- with probability 1/4, choose A first;
- with probability 1/8, choose B first;
- with probability 1/8, choose C first;
- with probability 1/8, choose not-A first and then choose B;
- with probability 1/8, choose not-A first and then choose C;
- with probability 1/16, choose not-C first and end up with B;
- with probability 1/16, choose not-B first and end up with C;
- with probability 1/8, choose not-B or not-C first and end up with A.

In the end, P(A) = 1/4 + 1/8 = 0.375.
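Here is a quick Monte Carlo sketch of Abram’s procedure as I am reading it (draw one of the three membership sentences with probability proportional to \mu, assert it or its negation with probability 1/2 each, and throw away assertions that contradict what is already asserted). The names and code structure are mine; it reproduces the 0.375:

```python
import random
from collections import Counter

MU = {"A": 0.5, "B": 0.25, "C": 0.25}  # weight of each "in S or not in S" question

def sample_model(rng):
    # Repeatedly draw a membership sentence "x in S" with probability
    # proportional to MU[S], then assert it or its negation with
    # probability 1/2 each, discarding assertions inconsistent with
    # what is already asserted (x lies in exactly one of A, B, C).
    candidates = set(MU)  # sets x could still belong to
    while len(candidates) > 1:
        s = rng.choices(list(MU), weights=list(MU.values()))[0]
        if rng.random() < 0.5:      # assert "x in s"
            if s in candidates:
                return s            # consistent, and it pins down the model
            # contradicts an earlier "x not in s": discard and redraw
        else:                       # assert "x not in s"
            candidates.discard(s)   # always consistent while >1 candidates remain
    return candidates.pop()

rng = random.Random(0)
n = 200_000
counts = Counter(sample_model(rng) for _ in range(n))
print({s: counts[s] / n for s in "ABC"})
# roughly {'A': 0.375, 'B': 0.3125, 'C': 0.3125}
```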
Notice that Abram’s solution gives a different Bayes score when we are in set A than when we are in the other two sets. Mine will not: my Bayes score gives A probability p, where p is chosen so that the Bayes score is the same in every model:
1/2 log p + 1/4 log(1 - (1-p)/2) + 1/4 log(1 - (1-p)/2) = 1/2 log(1-p) + 1/4 log((1-p)/2) + 1/4 log(1 - (1-p)/2)
2 log p + log(1 - (1-p)/2) = 2 log(1-p) + log((1-p)/2)
p^2 (1 - (1-p)/2) = (1-p)^2 ((1-p)/2)
p^2 (1+p) = (1-p)^3
Expanding gives 2p^3 - 2p^2 + 3p - 1 = 0, whose unique root in (0,1) is p ≈ 0.39661.
If you check this value of p, you should see that the Bayes score is indeed independent of the model.
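Here is a quick numeric check (just a sketch; the weights and the score definition are the ones above, and the set labels are only for illustration):

```python
import math

MU = {"A": 0.5, "B": 0.25, "C": 0.25}

def bayes_score(true_set, p):
    # Weighted Bayes score of the assignment P(A) = p, P(B) = P(C) = (1-p)/2
    # when the true model lies in true_set: for each question "x in S?",
    # add MU[S] times the log of the probability given to the correct answer.
    prob = {"A": p, "B": (1 - p) / 2, "C": (1 - p) / 2}
    return sum(
        MU[s] * math.log(prob[s] if s == true_set else 1 - prob[s])
        for s in MU
    )

# Solve p^2 (1+p) = (1-p)^3, i.e. 2p^3 - 2p^2 + 3p - 1 = 0, by bisection;
# the cubic is increasing in p, so the root in (0, 1) is unique.
lo, hi = 0.0, 1.0
for _ in range(60):
    mid = (lo + hi) / 2
    if 2 * mid**3 - 2 * mid**2 + 3 * mid - 1 < 0:
        lo = mid
    else:
        hi = mid
p = (lo + hi) / 2

print(p)                                   # ~0.39661
print([bayes_score(s, p) for s in "ABC"])  # all three scores are equal
```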