After thinking a bit about your proposed scheme, I see three non-negligible drawbacks. They don't make it useless, but (at least in my opinion) they significantly reduce the cases in which it can be safely used, at least without modifications.
The first issue is the one I already mentioned (about prediction markets) in the teaser: your scheme works well if, as in the case of a charity board deciding which projects to finance, the people making the decision don't have much involvement in the project afterwards. But if you try to apply it to situations like a group of engineers deciding which technical solution to use for their own project, the incentive effect becomes very dangerous: if I voted against the solution that was finally chosen, it is now in my interest to make sure the project fails, so that I will have been right.
The second issue is easier to explain with an example. Imagine a charity board of 10 people who have to approve or refuse projects. One of the 10, person O, is very optimistic. Of the last 100 proposals, he approved 80 and disapproved only 20. Of the 80 he approved, 35 were later judged to be bad projects and 45 good ones. Of the 20 he refused, only one was in fact a good project (the other 19 were bad). Another member, person N, is normally optimistic. He approved 50 projects and disapproved 50. Of the 50 he approved, 10 were bad and 40 were good. Of the 50 he disapproved, 40 were bad and 10 were good (they didn't always vote on the same proposals, so the numbers don't have to match exactly). If you compute the ratios, O was right only about 56% of the time when he approved, but he was right 95% of the time when he disapproved. N was right 80% of the time in both cases. With such a record, I would give O's vote more weight when he opposes a project, and less when he approves one.
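To make the ratio argument concrete, here is a minimal sketch (just an illustration, not part of your scheme; the function name is mine, the counts are the ones from the example above) computing a separate credibility for approvals and refusals instead of a single averaged score:

```python
# Sketch: per-direction accuracy from a vote history.
# Counts are the hypothetical ones from the example above.

def direction_accuracy(approved_good, approved_bad, refused_good, refused_bad):
    """Return (accuracy when approving, accuracy when refusing)."""
    approve_acc = approved_good / (approved_good + approved_bad)
    refuse_acc = refused_bad / (refused_good + refused_bad)
    return approve_acc, refuse_acc

# Person O: approved 80 (45 good, 35 bad), refused 20 (1 good, 19 bad)
o_approve, o_refuse = direction_accuracy(45, 35, 1, 19)
# Person N: approved 50 (40 good, 10 bad), refused 50 (10 good, 40 bad)
n_approve, n_refuse = direction_accuracy(40, 10, 10, 40)

print(f"O: right {o_approve:.0%} when approving, {o_refuse:.0%} when refusing")
print(f"N: right {n_approve:.0%} when approving, {n_refuse:.0%} when refusing")
# A single averaged score would hide the fact that O's "no" is far more
# informative than O's "yes".
```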
The third issue is about risk taking. Consider the board of directors of a research agency (for-profit or not) who have to approve funding for research projects. Two projects arrive on the table, both requiring 100 units of financing. One is a low-risk, low-gain project A, which is 90% likely to succeed and will yield 150 units of gain if it succeeds. The other is a high-risk, high-gain project B, which is only 10% likely to succeed but will yield 2,000 units of gain if it succeeds. In expectation, project A is worth 150 × 0.9 − 100 = 35, while project B is worth 2000 × 0.1 − 100 = 100. There are cases in which it's better to choose project A, but most of the time it would be better to choose B. Yet if you choose B, you're very likely to be found wrong in hindsight. So with a scheme like the one you propose, decision-makers would favour project A over project B, even though A's expected net gain is only about a third of B's.
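Just to spell out the arithmetic, here is a tiny sketch (the costs, probabilities and payoffs are the hypothetical ones above) checking the two expected values:

```python
# Expected net gain for the two hypothetical projects from the example.

cost = 100

# Project A: low risk, low gain
p_a, gain_a = 0.9, 150
ev_a = p_a * gain_a - cost      # 0.9 * 150 - 100 = 35

# Project B: high risk, high gain
p_b, gain_b = 0.1, 2000
ev_b = p_b * gain_b - cost      # 0.1 * 2000 - 100 = 100

print(f"Expected net gain: A = {ev_a:.0f}, B = {ev_b:.0f}")
# B is worth roughly three times A in expectation, yet a voter who backs B
# will be judged "wrong" in hindsight about 90% of the time.
```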
The last two issues are really two sides of the same problem: your scheme is interesting, but it seems too “binary” (you were right or you were wrong, and we average how often you were right or wrong into your global credibility), and therefore it doesn't handle some of the complexity of decision making well (optimism vs pessimism, low-risk low-gain vs high-risk high-gain, or motivational/incentive issues). But if you're in a situation where those issues don't matter much, then it sounds very promising.