A1987dM comments on 2012 Survey Results

A1987dM 1 Dec 2012 22:16 UTC

9 points

Error bars, please!

gwern 1 Dec 2012 23:24 UTC

5 points

The summary data, given as n and mean(standard deviation):

2009: n=67; 145.88(14.02)
2011: n=331; 140.10(13.07)
2012: n=346; 138.30(12.58)

The basic formula for a confidence interval for a population mean is: mean ± (z-score for the confidence level × (standard deviation / √n)). For 95% confidence the z-score is 1.96, so:

$145.88 \pm 1.96 \times \frac{14.02}{\sqrt{67}}$
= the range 142.5-149.2
$140.10 \pm 1.96 \times \frac{13.07}{\sqrt{331}}$
= the range 138.7-141.5
$138.30 \pm 1.96 \times \frac{12.58}{\sqrt{346}}$
= the range 137.0-139.6
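
As a rough check on the arithmetic above, the same intervals can be recomputed from the summary data. This is only a minimal sketch: the helper `ci_mean` and the data frame `surveys` are names made up for illustration, and it uses the normal approximation via `qnorm` (about 1.96 at 95%) rather than a t quantile.

```r
# Summary data as quoted above: sample size, mean IQ, standard deviation.
surveys <- data.frame(
  year = c(2009, 2011, 2012),
  n    = c(67, 331, 346),
  mean = c(145.88, 140.10, 138.30),
  sd   = c(14.02, 13.07, 12.58)
)

# Hypothetical helper: normal-approximation CI for a mean from summary stats.
ci_mean <- function(mean, sd, n, level = 0.95) {
  z <- qnorm(1 - (1 - level) / 2)   # about 1.96 for a 95% interval
  half <- z * sd / sqrt(n)          # half-width = z * standard error
  data.frame(lower = mean - half, upper = mean + half)
}

intervals <- cbind(surveys["year"],
                   ci_mean(surveys$mean, surveys$sd, surveys$n))
print(round(intervals, 1))
```

The half-widths should come out close to 3.4, 1.4 and 1.3 points, so the shrinking intervals mostly reflect the larger samples in 2011 and 2012.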

Alternatively, run the usual t-tests and look at the confidence interval they calculate for the difference. For 2009 & 2012, the 95% CI for the difference in mean IQ is 3.563-10.578:

R> lw2009 <- read.csv("lw-2009.csv")
R> lw2011 <- read.csv("lw-2011.csv")
R> lw2012 <- read.csv("lw-2012.csv")

R> # lwi2009 <- lw2009$IQ[!is.na(lw2009$IQ)]
R> # hand-cleaned:
R> lwi2009 <- c(120,125,128,129,130,130,130,130,130,130,130,130,130,131,132,132,133,134,136,138,138,139,139,140,
                140,140,140,140,140,140,140,140,140,141,142,144,145,145,145,148,148,150,150,150,150,152,154,154,
                155,155,155,155,156,158,158,160,160,160,160,162,163,164,165,166,170,171,173,180)
R> lwi2011 <- lw2011$IQ[!is.na(lw2011$IQ)]
R> lwi2012 <- lw2012$IQ[!is.na(lw2012$IQ)]
R>
R> t.test(lwi2009, lwi2012)

    Welch Two Sample t-test

data:  lwi2009 and lwi2012
t = 4.004, df = 91.49, p-value = 0.0001264
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
  3.563 10.578
sample estimates:
mean of x mean of y
    145.4     138.3
R> t.test(lwi2009, lwi2011)

    Welch Two Sample t-test

data:  lwi2009 and lwi2011
t = 2.968, df = 94.8, p-value = 0.003791
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 1.752 8.830
sample estimates:
mean of x mean of y
    145.4     140.1
R> t.test(lwi2011, lwi2012)

    Welch Two Sample t-test

data:  lwi2011 and lwi2012
t = 1.804, df = 670.4, p-value = 0.07174
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -0.1578  3.7174
sample estimates:
mean of x mean of y
    140.1     138.3
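
For readers without the CSV files, the Welch interval can be approximated from summary statistics alone. The sketch below is an illustration, not the calculation that produced the output above: `welch_ci` is a made-up helper, and it is fed the summary figures quoted at the top of the comment (mean 145.88 for 2009), whereas `t.test` above ran on the hand-cleaned responses (mean 145.4), so the numbers will differ somewhat.

```r
# Hypothetical helper: Welch two-sample CI for a difference in means,
# computed from summary statistics instead of raw responses.
welch_ci <- function(m1, s1, n1, m2, s2, n2, level = 0.95) {
  v1 <- s1^2 / n1                    # squared standard error, sample 1
  v2 <- s2^2 / n2                    # squared standard error, sample 2
  se <- sqrt(v1 + v2)                # standard error of the difference
  # Welch-Satterthwaite approximation to the degrees of freedom
  df <- (v1 + v2)^2 / (v1^2 / (n1 - 1) + v2^2 / (n2 - 1))
  tcrit <- qt(1 - (1 - level) / 2, df)
  d <- m1 - m2
  list(t = d / se, df = df, lower = d - tcrit * se, upper = d + tcrit * se)
}

# 2009 vs 2012, using the summary data quoted earlier in the thread.
res <- welch_ci(145.88, 14.02, 67, 138.30, 12.58, 346)
print(res)
```

Because the two samples have unequal sizes and variances, the degrees of freedom land much closer to the smaller sample's n than to the pooled total, which matches the pattern in the `t.test` output.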

gwern 2 Dec 2012 0:56 UTC

3 points

Here is a linear model as well (for those unfamiliar, see my HPMoR examples); it really just recapitulates the simple averages calculation:

R> lw2009 <- read.csv("lw-2009.csv")
R> lw2011 <- read.csv("lw-2011.csv")
R> lw2012 <- read.csv("lw-2012.csv")
R>
R> # lwi2009 <- lw2009$IQ[!is.na(lw2009$IQ)]
R> # hand-cleaned:
R> lwi2009 <- c(120,125,128,129,130,130,130,130,130,130,130,130,130,131,132,132,133,134,136,138,138,139,139,140,
                140,140,140,140,140,140,140,140,141,142,144,145,145,145,148,148,150,150,150,150,152,154,154,
                155,155,155,156,158,158,160,160,160,160,162,163,164,165,166,170,171,173,180)
R> lwi2011 <- lw2011$IQ[!is.na(lw2011$IQ)]
R> lwi2012 <- lw2012$IQ[!is.na(lw2012$IQ)]
R>
R> xs <- c(rep(as.Date("2009-03-01"), length(lwi2009)), rep(as.Date("2011-11-01"), length(lwi2011)), rep(as.Date("2012-11-01"), length(lwi2012)))
R> ys <- c(lwi2009, lwi2011, lwi2012)
R> model <- lm(ys ~ xs)
R> summary(model)

Call:
lm(formula = ys ~ xs)

Residuals:
   Min     1Q Median     3Q    Max
-38.29  -8.29  -0.29   6.73  63.81

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept) 219.49064   19.42751   11.30  < 2e-16
xs           -0.00519    0.00126   -4.11  4.5e-05

Residual standard error: 12.9 on 741 degrees of freedom
Multiple R-squared: 0.0222,    Adjusted R-squared: 0.0209
F-statistic: 16.9 on 1 and 741 DF,  p-value: 4.48e-05
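
To see that the model recapitulates the averages, the reported coefficients can be plugged back in at the three survey dates. This is a sketch under one assumption: that the `xs` coefficient is in IQ points per day, which follows from R storing a `Date` as a count of days since 1970-01-01. The variable names are illustrative.

```r
# Coefficients copied from the summary(model) output above.
intercept <- 219.49064
slope     <- -0.00519   # assumed unit: IQ points per day

survey_dates <- as.Date(c("2009-03-01", "2011-11-01", "2012-11-01"))
days <- as.numeric(survey_dates)   # days since 1970-01-01

# Fitted mean IQ at each survey date.
fitted_iq <- intercept + slope * days
print(data.frame(date = survey_dates, days = days,
                 fitted = round(fitted_iq, 1)))
```

The fitted values should sit within a fraction of a point of the three survey means, which is why the large intercept (the extrapolated value at 1970-01-01) is not meaningful by itself.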

satt 2 Dec 2012 10:21 UTC
9 points
Note that Epiphany dates the 2009 survey to around March, while the other two surveys happened around November, so inputting the survey dates just as years lowballs the time gap between the first & second surveys. Your linear trend’ll be a bit exaggerated.
- gwern 2 Dec 2012 18:55 UTC
  5 points
  I’ve fixed it as appropriate.
  
  “Your linear trend’ll be a bit exaggerated.”
  
  Before, the slope was −2.24 per year (a loss of roughly 2.25 points a year). Now the slope comes out as −0.00519, but if I’m understanding my changes right, the unit has switched from per year to per day, and 365.25 × −0.00519 IQ points per day is −1.896 per year.
  
  2.25 vs 1.9 is fairly different.
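
The unit conversion in the reply above, and the size of the gap that coding the surveys by year alone would miss, can be checked in a few lines. This is a small sketch with illustrative variable names; the survey dates are the approximate ones used in the model (March 2009, November 2011).

```r
slope_per_day <- -0.00519    # coefficient from the refit model
days_per_year <- 365.25      # average year length, as used in the comment

slope_per_year <- slope_per_day * days_per_year
print(round(slope_per_year, 3))

# Gap between the first two surveys: by calendar date vs. by year label.
gap_days  <- as.numeric(as.Date("2011-11-01") - as.Date("2009-03-01"))
gap_years <- gap_days / days_per_year
print(c(by_date = round(gap_years, 2), by_year_label = 2011 - 2009))
```

Measured by date the first gap is about two thirds of a year longer than the year labels suggest, so the same drop in mean IQ is spread over more time and the per-year slope shrinks.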

Kindly 1 Dec 2012 22:53 UTC
1 point
I was lazy and ignored all non-numerical IQ comments, so I got slightly different numbers. But my 95% confidence intervals are:
- 145.18±3.27 in 2009
- 140.12±1.41 in 2011
- 138.42±1.33 in 2012