For the last 30+ days, I’ve been randomly taking vitamin D before bed and recording sleep data with Zeo as well as my best guess whether it was D or placebo: http://www.gwern.net/Zeo#vitamin-d
But I’m going to finish within the next <10 days, and then it’s time for analysis. I would like to do a real statistical analysis, like a ‘one-tailed T test’, on the ZQ and components like length of REM sleep, but I don’t actually know how—I’ve never used the standard statistical programs or packages like R. (Yes, yes, I know, I should learn real statistics. I haven’t yet, though.)
My needs aren’t too complex: I’m just asking whether the vitamin D did anything (bad), so maybe there is an easy-to-use online tool or something that would handle it?
If you have prior programming experience, R is simple for basic analysis. I don’t know of any online tools. The following would be close to sufficient for what you seem to want:
mydata <- read.table("c:/sleepdata.csv", header=TRUE, sep=",") # Read in data w/ variable names in header
fit <- lm(ZQ ~ guess + placebo, data=mydata) # Compute linear regression
summary(fit) # Get estimated coefficients with t-statistics and p-values
For cookbook code, see Quick R. If you do use R, the RStudio IDE is very useful.
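If a plain two-sample t-test is preferred to the regression, a minimal sketch might look like the following. The ZQ values are invented purely for illustration, the vector names `zq_placebo` and `zq_vitd` are hypothetical, and the one-tailed direction assumes the question is whether vitamin D lowers ZQ.

```r
# Made-up ZQ scores for illustration only (not real data).
zq_placebo <- c(95, 88, 102, 91, 97, 99)
zq_vitd    <- c(90, 85, 93, 96, 84, 89)

# One-tailed Welch t-test: is mean ZQ on vitamin D nights
# lower than on placebo nights?
one_tailed <- t.test(zq_vitd, zq_placebo, alternative = "less")

# Two-tailed version of the same test, for comparison.
two_tailed <- t.test(zq_vitd, zq_placebo)

print(one_tailed$p.value)
print(two_tailed$p.value)
```

`t.test` defaults to Welch's unequal-variance version; passing `var.equal = TRUE` would give the classic Student's t-test instead.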
My alma mater hosts a widely used web-based tool for statistical procedures that includes t-tests. Find it here. Click on “t-Tests & procedures”, then go to “Two-Sample t-Test for Independent or Correlated Samples”, then click on “independent samples” (given the description of your set-up above), then enter the values for each condition, click “calculate”, and voilà, you’ll have your results.
Cool experiment, by the way. Note that if you do multiple tests (like on both ZQ and length of REM), you might want to do a multiple comparisons correction to control the familywise error rate.
Thanks for the link! That seems like what I want; for example, I didn’t have any problem plugging in my placebo/D ZQ scores to get a one-tailed p = 0.078395.
“you might want to do a multiple comparisons correction”
The only one I know is the Bonferroni one, but that’s for independent tests, IIRC, while I strongly expect correlations among the results (ZQ is made partially out of things like REM and deep sleep length, so there’d be correlations by definition, and one would expect my sleep quality rating to correlate with ZQ even assuming that’s not being factored into ZQ already).
Reading Wikipedia, I get the impression that using Bonferroni when I know the tests to not be independent would result in fewer false positives, but many many more false negatives. Since my data has so little power as it is...
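For what it is worth, Bonferroni does not actually need independent tests: it stays valid under any dependence and is simply conservative, which matches the worry about false negatives. A minimal sketch with base R's `p.adjust`, using made-up p-values (the vector `p` is hypothetical), also shows Holm's method, which makes no independence assumption either and never rejects fewer hypotheses than Bonferroni:

```r
# Made-up p-values for illustration only (not real results).
p <- c(0.004, 0.020, 0.030, 0.200)

# Bonferroni: multiply each p-value by the number of tests (capped at 1).
bonf <- p.adjust(p, method = "bonferroni")

# Holm: a step-down variant that is uniformly at least as powerful.
holm <- p.adjust(p, method = "holm")

print(data.frame(p = p, bonferroni = bonf, holm = holm))

# Number of hypotheses rejected at a familywise error rate of 0.05.
n_bonf <- sum(bonf <= 0.05)
n_holm <- sum(holm <= 0.05)
```

Adjusted p-values are compared directly against the chosen error rate, so no separate threshold arithmetic is needed.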
Yes, that’s a good point. I suggest that if you are testing many hypotheses, you control the false discovery rate instead (here’s the useful, original pdf, cited 10,000+ times).
As an example, let’s say that you test 6 hypotheses, corresponding to different features of your zeo data. You could use a t-test for each, as above. Then aggregate and sort all the p-values in ascending order. Let’s say that they are 0.001, 0.013, 0.021, 0.030, 0.067, and 0.134.
Assume, arbitrarily, that you want the overall false discovery rate to be 0.05; in this context that level is usually written q. Working from the largest p-value down to the smallest, you check whether the current p-value is less than ((its rank in the sorted list * the false discovery rate) / the overall number of hypotheses). You stop at the first inequality that holds, and call that hypothesis and every hypothesis with a smaller p-value significant.
So in this example, you would stop when you reach 0.030 < ((4 * 0.05) / 6) ≈ 0.033, and the hypotheses corresponding to the first four p-values would be called significant.
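The procedure above can be sketched in a few lines of R using the six example p-values. The variable names are illustrative, and the comparison uses `<=`, as is standard for this step-up rule; the built-in `p.adjust(..., method = "BH")` is included as a cross-check.

```r
# Example p-values from the text, and the target false discovery rate.
p <- c(0.001, 0.013, 0.021, 0.030, 0.067, 0.134)
q <- 0.05
m <- length(p)

# Sort ascending and compute the per-rank thresholds (rank * q / m).
p_sorted   <- sort(p)
thresholds <- seq_len(m) * q / m

# Largest rank whose p-value falls under its threshold (0 if none do).
passing <- which(p_sorted <= thresholds)
k <- if (length(passing) > 0) max(passing) else 0

# Everything up to and including rank k is called significant.
significant <- p_sorted[seq_len(k)]
print(k)  # 4

# Cross-check with base R: adjusted p-values at or below q are significant.
bh_adjusted <- p.adjust(p, method = "BH")
print(sum(bh_adjusted <= q))  # 4
```

In practice the `p.adjust` call alone is enough; the manual version is only there to mirror the step-by-step description.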
Interesting procedure. I tried it out on my melatonin and one-legged standing data, putting the results in the same footnotes as the R sessions, and no surprise, nothing survives. (A little depressing, but it’s not like there were very many p-values in the 0.01 or lower range.)
EDIT: however, one result from my vitamin D experiment did survive multiple correction!
The Khan Academy lectures (and the worked examples in them) are pretty awesome at showing you how to do relatively simple things like t-tests. I should add that I’ve taken stats in the past, so it wasn’t completely new to me; YMMV. But since each lecture is all of 10 minutes long, I don’t think the costs are too high.
Badger’s solution works, but an alternative is to post a link to the data here and let LW take a crack at it. I can fit a wide variety of models in a matter of minutes in R, and wouldn’t mind spending an hour or two doing so and writing up the results.
Well, I usually include an export of my Zeo CSV data for anyone to look at (not that anyone apparently has for the earlier bigger melatonin data), and I was going to link the results here in an open thread or something.
I really only know Haskell, but to my surprise, R wasn’t too hard to work with. I didn’t go with your linear regression code but just some straight t-tests on the variables of interest. I worked the averages and p-values into the text, and put the full R session output into 2 footnotes:
http://www.gwern.net/Zeo#fn4
http://www.gwern.net/Zeo#fn7
I’ve finished my experiment; if you want to check my R interpreter usage, I put it in the footnotes in http://www.gwern.net/Zeo#vitamin-d-analysis
Got a link to your melatonin data?
First sentence in http://www.gwern.net/Zeo#melatonin-analysis
If you have a graphing calculator, it can probably do it for you. If you have a TI-84, it’s under Stats->Tests; I think you’d want “2-SampTTest”, but it’s been a while since I’ve used that function. Their website gives more info. (Edited to add detail/clarification.)
I don’t, no.