Yeah, Thompson sampling and Nash bargaining differ in that the Thompson sampling proposal has two argmaxes, whereas Nash bargaining has only one. There are really two things being brought in with Thompson sampling: Plurality is what you get if you only add the inner argmax, and something like Nash bargaining is what you get if you only add the geometric part. There is no reason you have to add the two things in the same place. All I know is that Thompson sampling has some pretty nice asymptotic guarantees.
You could just Nash bargain between your hypotheses directly, but then you depend on where the 0 point is. One nice thing about Thompson sampling is that it gives you a semi-principled place to put the 0, because the inner argmax means we convert everything to probabilities, and probabilities come with a natural 0.
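To make the three options concrete, here is a minimal sketch on a toy problem. Everything in it is an assumption for illustration: the prior, the utility table, and the function names (`thompson_policy`, `plurality_action`, `nash_policy`, `shift`) are all hypothetical, and the Nash part is a crude grid search that only handles two actions. It treats Thompson sampling as "sample a hypothesis, play its best action", which is also the policy that maximizes the prior-weighted geometric mean of the probability given to each hypothesis's best action.

```python
import math

# Hypothetical toy problem: three hypotheses, two actions.
PRIOR = [0.4, 0.35, 0.25]   # assumed credence in each hypothesis
UTILITY = [                 # UTILITY[h][a]: utility of action a under hypothesis h
    [1.0, 0.0],             # h0 prefers action 0
    [0.0, 1.0],             # h1 prefers action 1
    [0.2, 1.0],             # h2 prefers action 1
]


def best_action(utilities):
    """Inner argmax: the action a single hypothesis likes best."""
    return max(range(len(utilities)), key=lambda a: utilities[a])


def thompson_policy(prior, utility):
    """Both pieces: each action's probability is the prior mass of the
    hypotheses whose best action it is."""
    policy = [0.0] * len(utility[0])
    for p, u in zip(prior, utility):
        policy[best_action(u)] += p
    return policy


def plurality_action(prior, utility):
    """Inner argmax only: the single action most likely to be best."""
    policy = thompson_policy(prior, utility)
    return max(range(len(policy)), key=lambda a: policy[a])


def nash_policy(prior, utility, zero=0.0, steps=1000):
    """Geometric part only (two actions): grid search over q = P(action 0)
    for the mix maximizing the prior-weighted geometric mean of each
    hypothesis's expected utility above the chosen zero point."""
    best_q, best_score = None, -math.inf
    for i in range(steps + 1):
        q = i / steps
        gains = [q * u[0] + (1 - q) * u[1] - zero for u in utility]
        if min(gains) <= 0:
            continue  # the log is undefined at or below the zero point
        score = sum(p * math.log(g) for p, g in zip(prior, gains))
        if score > best_score:
            best_q, best_score = q, score
    return [best_q, 1 - best_q]


def shift(utility, c):
    """Add a constant to every utility, i.e. move the zero point."""
    return [[x + c for x in row] for row in utility]


if __name__ == "__main__":
    for label, table in [("original", UTILITY), ("shifted +1", shift(UTILITY, 1.0))]:
        print(label)
        print("  thompson :", thompson_policy(PRIOR, table))
        print("  plurality:", plurality_action(PRIOR, table))
        print("  nash     :", nash_policy(PRIOR, table))
```

Adding a constant to every utility leaves the Thompson and Plurality answers alone, since the inner argmax only looks at which action is best, while the Nash mixture moves with the zero point.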