After removing the individuals that I couldn’t find data for, we had a sample size of 86. A paired t-test, comparing the number of first-borns with the expected number of first-borns (one data point for each of the 86 mathematicians), was statistically significant, t(85) = 2.86, p = 0.00529.
Wouldn’t this be a chi-squared/proportion test? Or a binomial regression? (What would you be comparing means of, taking birth category as an integer and averaging them?)
For each mathematician, actual firstbornness was coded as 0 or 1, and expected firstbornness as 1/n (where n is the number of children that their parents had). Then we just did a paired t-test, which is equivalent to subtracting expected from actual for each data point and then doing a one-sample t-test against a mean of 0. You can see this all in Eli’s spreadsheet here; the data are also all there for you to try other statistical tests if you want to.
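For anyone who wants to see the coding scheme concretely, here is a minimal sketch of it in Python. The data below are made-up toy values, not the real sample of 86, and the names `toy` and `one_sample_t` are just for illustration; the real numbers are in the spreadsheet.

```python
import math
from statistics import mean, stdev

# HYPOTHETICAL toy data, not the real sample:
# (is_firstborn as 0/1, number of children in the family)
toy = [(1, 2), (0, 3), (1, 1), (1, 4), (0, 2), (1, 3), (0, 5), (1, 2)]

actual = [float(fb) for fb, n in toy]    # observed firstbornness, 0 or 1
expected = [1.0 / n for fb, n in toy]    # chance of being firstborn, 1/n

def one_sample_t(xs, mu0=0.0):
    """t statistic for testing whether the mean of xs equals mu0."""
    n = len(xs)
    return (mean(xs) - mu0) / (stdev(xs) / math.sqrt(n))

# A paired t-test is a one-sample t-test on the per-person differences.
diffs = [a - e for a, e in zip(actual, expected)]
t_stat = one_sample_t(diffs)
df = len(diffs) - 1

print(f"t({df}) = {t_stat:.3f}")
```

The p-value then comes from a t distribution with `df` degrees of freedom, which any stats package or online calculator will give you.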
I’m not sure t-tests are the best approach to take compared to something non-parametric, given smallish sample, considerable skew, etc. (this paper’s statistical methods section is pretty handy). Nonetheless I’m confident the considerable effect size (in relative terms, almost a doubling) is not an artefact of statistical technique: when I plugged the numbers into a chi-squared calculator I got P < 0.001, and I’m confident a permutation technique or similar would find much the same.
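As one example of the "or similar" option, here is a rough sketch of a Monte Carlo simulation test. Note this is not a permutation test in the strict sense: it simulates the null hypothesis directly, assuming each person is firstborn with probability 1/n independently of everyone else. The family sizes and observed count below are made up for illustration, and `simulate_p_value` is a hypothetical helper name.

```python
import random

def simulate_p_value(family_sizes, observed_firstborns, sims=10_000, seed=0):
    """One-sided Monte Carlo p-value for 'at least this many firstborns'.

    Null hypothesis (an assumption of this sketch): each person is
    firstborn with probability 1/n, independently of the others.
    """
    rng = random.Random(seed)
    at_least_as_extreme = 0
    for _ in range(sims):
        # Draw one simulated sample and count its firstborns.
        count = sum(1 for n in family_sizes if rng.random() < 1.0 / n)
        if count >= observed_firstborns:
            at_least_as_extreme += 1
    # Add-one correction so the estimate is never exactly zero.
    return (at_least_as_extreme + 1) / (sims + 1)

# HYPOTHETICAL toy data, not the real sample.
toy_family_sizes = [2, 3, 1, 4, 2, 3, 5, 2]
toy_observed = 5

p = simulate_p_value(toy_family_sizes, toy_observed)
print(f"simulated one-sided p = {p:.4f}")
```

This avoids the normality assumption behind the t-test, at the cost of leaning entirely on the 1/n null model being right.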