Note that 8 is pretty small, so if the true base rate was 32%, you’d still expect to see 2 discontinuities (8 choose 2) * 0.32^2 * 0.68^6 = 0.283 = 28% of the time, vs (8 choose 3) * 0.32^3 * 0.68^5 = 0.266 = 27% of the time for 3 discontinuities, so I take the 3⁄8 as indicative that I’m on the right ballpark, rather than sweating too much about it.
The broad vs specific categories is an interesting hypothesis, but note that it could also be cofounded with many other factors. For example, broader fields could attract more talent and funds, and it might be harder to achieve proportional improvement in a bigger domain (e.g., it might be easier to improve making candles by 1% than to improve aviation by 1%). That is, you can have the effect that you have more points of leverage (more subfields, as you point out), but that each point of leverage affects the result less (an x% discontinuity in a subfield corresponds to much less than x% in the super-field).
Then broad categories get 24% big discontinuities, 29% medium discontinuities, 14% small discontinuities and 33% no discontinuities. In comparison, less broad categories get 24% big discontinuities, 10% medium discontinuities, 28% small discontinuities and 38% no discontinuities, i.e., no effect at the big discontinuity level but a noticeable difference at the medium level, which is somewhat consistent with your hypothesis. Data available here, the criteria used was “does this have more than one broad subcategory” combined my own intuition as to whether the field feels “broad”.
Note that 8 is pretty small, so if the true base rate was 32%, you’d still expect to see 2 discontinuities (8 choose 2) * 0.32^2 * 0.68^6 = 0.283 = 28% of the time, vs (8 choose 3) * 0.32^3 * 0.68^5 = 0.266 = 27% of the time for 3 discontinuities, so I take the 3⁄8 as indicative that I’m on the right ballpark, rather than sweating too much about it.
The broad vs specific categories is an interesting hypothesis, but note that it could also be cofounded with many other factors. For example, broader fields could attract more talent and funds, and it might be harder to achieve proportional improvement in a bigger domain (e.g., it might be easier to improve making candles by 1% than to improve aviation by 1%). That is, you can have the effect that you have more points of leverage (more subfields, as you point out), but that each point of leverage affects the result less (an x% discontinuity in a subfield corresponds to much less than x% in the super-field).
If I look at it in my database, categorizing:
Broad: aviation, ceramics, cryptography, film, furniture, glass, nuclear, petroleum, photography, printing, rail transport, robotics, water supply and sanitation, artificial life, bladed weapons, aluminium, automation, perpetual motion machines, nanotechnology, timekeeping devices, wind power
Not broad: motorcycle, multitrack recording, Oscilloscope history, paper, polymerase chain reaction, portable gas stove, roller coaster, steam engine, telescope, cycling, spaceflight, rockets, calendars, candle making, chromatography, condoms, diesel car, hearing aids, radar, radio, sound recording, submarines, television, automobile, battery, telephone, transistor, internal combustion engine, manufactured fuel gases.
Then broad categories get 24% big discontinuities, 29% medium discontinuities, 14% small discontinuities and 33% no discontinuities. In comparison, less broad categories get 24% big discontinuities, 10% medium discontinuities, 28% small discontinuities and 38% no discontinuities, i.e., no effect at the big discontinuity level but a noticeable difference at the medium level, which is somewhat consistent with your hypothesis. Data available here, the criteria used was “does this have more than one broad subcategory” combined my own intuition as to whether the field feels “broad”.