This is a really interesting topic; there are heaps of things I want to say about it. I was initially holding off until I saw your results, so my guesses wouldn’t be spoilers, but that’s no way to have a conversation.
First—I think there’s an error in the program:
When you compute p[i][j] you take a sum and then divide by N, but it looks like you should divide by the number of guesses you are summing, which can be more than N, since it includes multiple rounds of guesses.
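In case it helps, here’s roughly what I mean. All the names here (`p`, `guess_history`, `estimate_agreement`, `N`) are my guesses at your code’s structure, not the actual program:

```python
def estimate_agreement(guess_history, N):
    """Sketch of the fix. guess_history[i][j] is a list of
    (my_guess, their_guess) pairs accumulated over every round,
    so it can hold more than N entries per pair of guessers."""
    p = [[0.0] * N for _ in range(N)]
    for i in range(N):
        for j in range(N):
            pairs = guess_history[i][j]
            agree = sum(1 for mine, theirs in pairs if mine == theirs)
            # Divide by how many guesses were actually summed, not by N:
            # the list grows with each round of updates.
            p[i][j] = agree / len(pairs) if pairs else 0.5
    return p
```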
My (inconsistent) thoughts about how the model would behave:
They’d quickly learn the fraction of correct initial guesses each guesser had, and make near-perfect use of that information. But they don’t distinguish between the initial guesses and later updates, so that can’t be right.
Even the bad guessers will get most of their updated estimates right by the end, so their opinions will be assumed to correlate with the truth. If you then went back and posed everyone a new question, all the bad guessers could significantly mislead everyone. That’s not the procedure in your code, but you could try it.
At the start of the simulation, all the guessers are simply seeing who else agrees with them. The good guessers might converge to a correct consensus, while the bad guessers could converge to the opposite. But as the simulation progresses and answers are revealed, the bad guessers will lose confidence in their whole subgroup, including themselves, and follow the good-guessing group.
Ideas for variants:
Make the initial guess accuracy depend on both guesser accuracy and problem difficulty/deceptiveness. I proposed a formula for this in my previous comment. In this case, the best way to update from the initial guesses would seem to be to follow the average opinion of a few of the best guessers, and maybe the reverse of the worst few, but I’m not sure how it would play out in the simulation, where you don’t know who they are and you have to update on each other’s updated guesses.
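To make the “follow the best, invert the worst” rule concrete, here’s a toy sketch that assumes you magically know everyone’s accuracy (which, as noted, the agents in the simulation don’t):

```python
def combined_guess(guesses, accuracies, k=3):
    """Hypothetical updating rule: follow the k best guessers and
    invert the k worst. guesses[i] in {0, 1}; accuracies[i] in (0, 1).
    Assumes at least 2*k guessers; ties break toward 0."""
    ranked = sorted(range(len(guesses)), key=lambda i: accuracies[i])
    worst, best = ranked[:k], ranked[-k:]
    # The best guessers' votes count as-is; the worst are anti-correlated
    # with the truth, so their votes count inverted.
    votes = [guesses[i] for i in best] + [1 - guesses[i] for i in worst]
    return 1 if sum(votes) * 2 > len(votes) else 0
```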
Make the initial guess accuracy depend on both the skill of the guesser and the difficulty of the question, but vary what weight is given to skill—some questions can be just as hard for skilled guessers as for everyone else. In this case, a way to update from an initial guess would be to look at enough of the best guessers that you’re confident which way they guess on average (you’d need to sample more of them if their accuracy is near 50%).
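A back-of-the-envelope way to see how the sample size scales: under a one-sided normal approximation, the number of accuracy-p guessers you’d need before the majority direction is trustworthy blows up as p approaches 50%. The formula and confidence thresholds here are just illustrative:

```python
import math

def guessers_needed(p, conf=0.95):
    """Rough sample-size sketch: how many independent guessers with
    accuracy p before the majority points the right way with
    probability >= conf? Normal approximation; illustrative only."""
    if p <= 0.5:
        return math.inf  # the majority carries no signal at or below 50%
    z = 1.645 if conf == 0.95 else 2.326  # one-sided z for 95% / 99%
    n = (z ** 2) * p * (1 - p) / (p - 0.5) ** 2
    return math.ceil(n)
```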
Repeat the exercise—after the first set of N answers are revealed, continue with N more questions. This time the guessers start with data about each other’s accuracy. Then after they are done, N more, etc.
Instead of everyone getting the same number of updates, let some update more often.
Instead of updating everyone and revealing one answer each round, randomly pick between updating a random person and revealing a correct answer to just one person, who will then be certain of it for the rest of the game. You could give different people different chances of updating from group opinions and of getting a correct answer revealed. Since people don’t know who has had which answers revealed, they don’t stop counting those answers when evaluating each other’s accuracy.
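Here’s a rough sketch of that round structure; everything in it (the function, `update_from_group`, the weight lists) is a hypothetical placeholder, not your code:

```python
import random

def run_round(people, update_probs, reveal_probs, answers, known):
    """Each round, pick one event: either one random person updates
    from group opinion, or one random person privately learns one
    correct answer (fixed for the rest of the game). The weight lists
    let different people have different chances of each event."""
    if random.random() < 0.5:
        i = random.choices(range(len(people)), weights=update_probs)[0]
        people[i].update_from_group(people)  # assumed method on the agent
    else:
        i = random.choices(range(len(people)), weights=reveal_probs)[0]
        q = random.randrange(len(answers))
        known[i][q] = answers[q]  # only person i learns this answer
```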