what kinds of statistical tests would be most appropriate for measuring the results?
What question about your game and learning math/probability are you trying to answer?
If you want “an effect,” you want a comparison of two arms. But only one arm needs to receive an intervention; the other can just be the baseline arm with no treatment at all (or just the ‘background treatment’ of being a college undergraduate). For example, you can take a set of undergrads, advertise that you are testing probability aptitude or something, and then the control arm just gets the test, while the test arm gets your game and then the test afterwards.
I don’t know about your advisor, but I would accept a study like that.
I always found it slightly puzzling that LW folks who get into practical data analysis start with F methods, and not B. Isn’t B kind of a LW “thing?”
Starting out by thinking about measuring results via ANOVA and the like is, to me, starting at the wrong level of abstraction (I realize I may differ on this from a lot of statisticians). For example, ANOVA can test a null hypothesis. What does that null mean? Well, you are interested in some causal effect. Maybe this: E[test result | assigned to game] - E[test result | baseline undergrad].
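To make that contrast concrete, here is a minimal sketch of estimating it as a plain difference in arm means with a rough interval. The function name, the scores, and the normal-approximation multiplier are all assumptions for illustration, not anything from an actual study.

```python
import math
import statistics

def effect_estimate(game_scores, baseline_scores):
    """Estimate E[score | game] - E[score | baseline] with a rough 95% interval."""
    diff = statistics.mean(game_scores) - statistics.mean(baseline_scores)
    # Welch-style standard error: does not assume equal variances across arms.
    se = math.sqrt(
        statistics.variance(game_scores) / len(game_scores)
        + statistics.variance(baseline_scores) / len(baseline_scores)
    )
    # 1.96 is the normal-approximation multiplier (an assumption here);
    # with arms this small a t quantile would give a wider interval.
    return diff, (diff - 1.96 * se, diff + 1.96 * se)

# Made-up test scores, purely illustrative.
game = [14, 17, 15, 18, 16, 19]
baseline = [12, 15, 13, 14, 16, 14]

diff, (low, high) = effect_estimate(game, baseline)
print(f"estimated effect: {diff:.2f} (approx. 95% interval {low:.2f} to {high:.2f})")
```

The point of writing it this way is that the quantity of interest (the difference) comes first, and the choice of test or interval is a detail layered on top.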
Or maybe you give them a questionnaire first, and learn how much math they have had (or even what particular classes). Maybe you want to actually look at an effect conditional on math preparation level. Does your game possibly have an ‘interaction’ with background math sophistication level? Then you need to model that. Then, once you decide on the model, you decide how to test the null. Or maybe you don’t want the null, but the size of the effect itself. etc. etc.
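As a sketch of what "effect conditional on math preparation" could look like, assume preparation is coded into just two levels. The records, the level names, and the helper function below are hypothetical. With two arms and two levels, the difference between the per-level effects is the same number as the interaction coefficient in a linear model with a treatment-by-preparation term.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (arm, math_prep, test_score). All values are made up.
records = [
    ("game", "low", 12), ("game", "low", 14), ("game", "low", 13),
    ("game", "high", 18), ("game", "high", 17), ("game", "high", 19),
    ("baseline", "low", 9), ("baseline", "low", 10), ("baseline", "low", 11),
    ("baseline", "high", 16), ("baseline", "high", 17), ("baseline", "high", 18),
]

def conditional_effects(records):
    """Return {prep_level: mean(game) - mean(baseline)} within each prep level."""
    cells = defaultdict(list)
    for arm, prep, score in records:
        cells[(arm, prep)].append(score)
    levels = sorted({prep for _, prep, _ in records})
    return {
        prep: mean(cells[("game", prep)]) - mean(cells[("baseline", prep)])
        for prep in levels
    }

effects = conditional_effects(records)
# Interaction contrast: how much the game effect differs between prep levels.
interaction = effects["high"] - effects["low"]
print(effects, interaction)
```

With a finer-grained or continuous preparation measure you would need an actual regression model instead of cell means, which is exactly the modelling decision to make before picking a test.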
You think about what you want first, the stats technique afterwards.
What question about your game and learning math/probability are you trying to answer?
Mostly: 1) do the players actually learn anything that would transfer outside the immediate game, and 2) how much (if at all) do things like their enjoyment affect whether they learn?
If you want “an effect,” you want a comparison of two arms. But only one arm needs to receive an intervention; the other can just be the baseline arm with no treatment at all (or just the ‘background treatment’ of being a college undergraduate). For example, you can take a set of undergrads, advertise that you are testing probability aptitude or something, and then the control arm just gets the test, while the test arm gets your game and then the test afterwards.
Thanks! Isn’t “undergrads with only the test vs. undergrads with the game and then the test” kinda the same as “undergrads with only test vs. undergrads after the pretest and the game”, though?
I always found it slightly puzzling that LW folks who get into practical data analysis start with F methods, and not B. Isn’t B kind of a LW “thing?”
F is what we’ve been taught, and what most of our supervisors understand. I’m not really familiar with B stats.