Only about 10 percent of new social programs in fields like education, criminology and social welfare demonstrate statistically significant benefits in RCTs.
This is a higher rate than I’d expected. It implies that current policies in these three fields are not really thoroughly thought out, or at least not to the extent that I had expected. It seems that there is substantial room for improvement.
Remember that programs will not even be tested unless there are good reasons to expect improvement over current protocol. Most programs that are explicitly considered are worse than those that are tested, and most possible programs are worse than those that are explicitly considered. Therefore we can expect that far, far fewer than ten percent of possible programs would yield significant improvements.
That is true. However, there is a second filtering process after filtering by experts, which I will refer to as filtering by experiment (i.e. we’ll try this, and if it works we keep doing it, and if it doesn’t we stop). Evolution is basically a mix of random mutation and filtering by experiment, and it shows that, given enough time, such a filter can be astonishingly effective. (That time can be drastically reduced by adding another filter, such as filtering by experts, before the filtering-by-experiment step.)
My one-to-two percent expectation was a subconscious estimate of how effective filtering by experts is compared with filtering by experiment over time. Investigating my reasoning more thoroughly, I think what I had failed to appreciate is that there really hasn’t been enough time for filtering by experiment to have as drastic an effect as I’d assumed: societies change enough over time that what was a good idea a thousand years ago is probably not going to be a good idea now. (Added to this, it likely takes more than a month to see whether such a social program is actually effective, so there hasn’t been time for all that many consecutive experiments, and there hasn’t been a properly designed worldwide experimental test model either.)
It implies that current policies in these three fields are not really thoroughly thought out, or at least not to the extent that I had expected.
That’s one possible explanation.
Another possible explanation is that there is a variety of powerful stakeholders in these fields and the new social programs are actually designed to benefit them and not whoever the programs claim to help.
This is a higher rate than I’d expected. It implies that current policies in these three fields are not really thoroughly thought out, or at least not to the extent that I had expected. It seems that there is substantial room for improvement.
I would have expected perhaps one or two percent.
Remember, you expect 5% to give a statistically significant result just by chance...
That’s only true of the programs which can be expected to produce no detriments, surely?
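For anyone who wants to check the 5% figure and the caveat about detriments, here is a rough simulation sketch. It is not taken from any of the studies mentioned; it assumes programs with exactly zero true effect, normally distributed outcomes with known unit variance, a two-sided z-test at the 0.05 level, and made-up sample sizes. The function name `simulate_null_trials` and all its parameters are hypothetical. On one reading of the objection above, the point is that the 5% covers both directions, so roughly half of those chance results would show up as significant harm rather than significant benefit.

```python
import math
import random
from statistics import NormalDist


def simulate_null_trials(num_trials=20000, n_per_arm=20, alpha=0.05, seed=0):
    """Simulate RCTs of programs with zero true effect (an assumption).

    Returns (fraction significant in either direction,
             fraction significant in the beneficial direction).
    """
    rng = random.Random(seed)
    # Two-sided critical value for a standard normal test statistic.
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    # Standard error of a difference in means with unit variance per subject.
    std_err = math.sqrt(2 / n_per_arm)

    significant = 0
    significant_benefit = 0
    for _ in range(num_trials):
        # Both arms are drawn from the same distribution: no real effect.
        treated = sum(rng.gauss(0, 1) for _ in range(n_per_arm)) / n_per_arm
        control = sum(rng.gauss(0, 1) for _ in range(n_per_arm)) / n_per_arm
        z = (treated - control) / std_err
        if abs(z) > z_crit:
            significant += 1
            if z > 0:  # treat a positive difference as a "benefit"
                significant_benefit += 1
    return significant / num_trials, significant_benefit / num_trials


if __name__ == "__main__":
    frac_sig, frac_benefit = simulate_null_trials()
    print(f"significant either way: {frac_sig:.3f}")
    print(f"significant benefit:    {frac_benefit:.3f}")
```

Under these assumptions the first number should land near 0.05 and the second near 0.025. Programs that actually do harm would show a significant benefit even less often, so the share of the reported 10 percent that is pure chance depends on how the tested programs split between useless and harmful, which this sketch does not try to estimate.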