I’m not sure what the “margin of error” is supposed to be here. This is just rounding, is it not? It’s not like the website is adding random epsilon numbers to screw with you: it is simply rounding off percentages, so they are exact and deterministic up to rounding.
The bigger issue, though, is that the number of players can’t strictly be computed from the percentages alone.
Since the inverse problem doesn’t have a unique answer, you’d represent all possible answers, in ascending order, as a lazy generator. You can then filter it by any known constraints or requirements (maybe you know players have to be paired, so the total is always even, and you can filter out all odd values). Since the number of possible valid values grows quickly as the total N increases, this might get slow and require some sort of sieve. (At a guess, since it’s rounding, the range of possible values presumably widens rapidly, so it might make more sense to return pairs of (lower, upper) bounds instead.)
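A minimal sketch of that generator in Python, assuming the site rounds each percentage to one decimal place with `round()` semantics (the actual site’s rounding rule is a guess), might look like:

```python
import math
from itertools import count
from typing import Iterator, Optional, Sequence

def rounded_percentages(counts: Sequence[int]) -> list[float]:
    # Forward problem: what the site would display for these per-option counts,
    # assuming it rounds each percentage to one decimal place with round().
    total = sum(counts)
    return [round(100 * c / total, 1) for c in counts]

def count_range(p: float, n: int) -> Optional[tuple[int, int]]:
    # All integer counts c in [0, n] that would display as p out of n players, if any.
    # Arithmetic bounds first, then verify with the exact same rounding rule.
    lo = max(0, math.floor(n * (p - 0.05) / 100))
    hi = min(n, math.ceil(n * (p + 0.05) / 100))
    valid = [c for c in range(lo, hi + 1) if round(100 * c / n, 1) == p]
    return (valid[0], valid[-1]) if valid else None

def candidate_totals(observed: Sequence[float],
                     constraint=lambda n: True) -> Iterator[int]:
    # Lazily yield, in ascending order, every total n for which some integer
    # split reproduces the observed rounded percentages; `constraint` filters
    # out known-impossible totals (e.g. lambda n: n % 2 == 0 for paired play).
    for n in count(1):
        if not constraint(n):
            continue
        ranges = [count_range(p, n) for p in observed]
        if any(r is None for r in ranges):
            continue
        # Each option's count can be any value in its range, so the achievable
        # totals form a contiguous interval; n just has to fall inside it.
        if sum(lo for lo, _ in ranges) <= n <= sum(hi for _, hi in ranges):
            yield n
```

For example, if the site shows 33.3% / 66.7%, `list(itertools.islice(candidate_totals([33.3, 66.7]), 5))` yields the first few consistent totals, which for small N are just the multiples of 3.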
Personally, I would first start with the forward problem, since it is so simple. Then I could test any algorithms or tweaks by generating a random number of players N and checking that all of the generator’s values ≤ N are correct.
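A sketch of that round-trip test, reusing the `rounded_percentages` and `candidate_totals` helpers above (the 4 options and the uniform random split are just placeholder choices):

```python
import random
from itertools import takewhile

def test_inverse(trials: int = 50, max_players: int = 500, options: int = 4) -> None:
    # Pick a hidden total N and a random split, compute what the site would
    # display, and confirm the true N appears among the candidates <= N.
    for _ in range(trials):
        n = random.randint(options, max_players)
        cuts = sorted(random.choices(range(n + 1), k=options - 1))
        counts = [b - a for a, b in zip([0] + cuts, cuts + [n])]
        shown = rounded_percentages(counts)
        candidates = takewhile(lambda m: m <= n, candidate_totals(shown))
        assert n in candidates, (n, counts, shown)
```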
This has the benefit that you can also easily do simple Bayesian updates by ABC (approximate Bayesian computation) without the rigmarole of PyMC or Stan etc.: just draw a sample from the prior over the number of players n, feed it into the forward model, see if you replicate the exact observed percentages, and if not, delete the sample; the samples you keep are the new posterior.
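A rejection-ABC sketch along those lines, assuming a discrete-uniform prior over the total and players picking options uniformly at random in the forward simulation (both priors are placeholders, and exact matching means acceptance gets slow as the prior range widens):

```python
import random
from collections import Counter
from typing import Callable, Sequence

def abc_posterior(observed: Sequence[float],
                  prior_draw: Callable[[], int],
                  trials: int = 100_000) -> list[int]:
    # Rejection ABC: draw a candidate total from the prior, simulate the
    # forward model (random split + rounding), and keep the draw only if it
    # reproduces the displayed percentages exactly; the kept draws are
    # samples from the posterior over the number of players.
    k = len(observed)
    kept = []
    for _ in range(trials):
        n = prior_draw()
        picks = Counter(random.choices(range(k), k=n))  # uniform split: an assumed prior
        shown = [round(100 * picks[i] / n, 1) for i in range(k)]
        if shown == list(observed):
            kept.append(n)
    return kept

# e.g. posterior = abc_posterior([33.3, 66.7], lambda: random.randint(1, 200))
```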