I’ve often wondered if a data-collecting website with a large userbase could help solve problems like this by looking for very weak statistical correlations among coinciding events over large data sets. I.e. see how often people self-report eating X, see how often people self-report feeling Y, and see how often one precedes the other versus how often they happen independently. The function for users would be letting them track their own actions (e.g. diet, health, etc.) according to preset (or high-karma member-submitted) sets of input fields. I should think that with members in the thousands such a thing would become useful, especially if the service were tied to some social app to attract users and backed by very good statistical processing to get results. Does anything like this exist at all? (Any obvious ideas why it doesn’t, beyond a possible lack of incentive to use it or for a company/person to program it?)
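For concreteness, the counting described above might look roughly like the sketch below. It is only an illustration: the log layout (user, then day number, then a set of reported events), the function name `followup_rates`, the one-day window, and the toy entries are all assumptions made up for the example, not anything an existing site is known to do.

```python
def followup_rates(logs, cause, effect, window=1):
    """Compare how often `effect` is logged after a recent `cause` vs. otherwise.

    logs maps user -> {day_number: set_of_reported_events} (hypothetical layout).
    Returns (rate_after_cause, rate_otherwise) over all logged user-days.
    """
    exposed = exposed_hits = unexposed = unexposed_hits = 0
    for days in logs.values():
        for day, events in days.items():
            # Did this same user report `cause` on any of the previous `window` days?
            recent_cause = any(
                cause in days.get(day - k, ()) for k in range(1, window + 1)
            )
            hit = effect in events
            if recent_cause:
                exposed += 1
                exposed_hits += hit
            else:
                unexposed += 1
                unexposed_hits += hit

    def rate(hits, n):
        return hits / n if n else float("nan")

    return rate(exposed_hits, exposed), rate(unexposed_hits, unexposed)


# Made-up toy logs, purely for illustration.
logs = {
    "alice": {1: {"gluten"}, 2: {"bloated"}, 3: set(), 4: {"gluten"}, 5: {"bloated"}},
    "bob": {1: set(), 2: {"bloated"}, 3: {"gluten"}, 4: set()},
}

after, otherwise = followup_rates(logs, "gluten", "bloated")
print(f"bloated after gluten: {after:.2f}, otherwise: {otherwise:.2f}")
```

A gap between the two rates is only a hint, of course. With self-selected users and self-reported data it says nothing about causation, and a real version would need far more careful statistics than a pair of raw proportions.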
Yes, it exists: http://genomera.com
They’re actively running experiments and collecting data, but they are in “beta testing” and are very selective about whom they allow to join. I’m disappointed they didn’t choose me when I filled out their request for a beta invite.
A huge problem with collecting data like this in the US population is that everyone has a similar diet. There are so few people totally excluding gluten that you can’t expect to measure its effects with epidemiological diet surveys: you need to actually do a controlled trial where you tell people to avoid it.
In China, where only about half of people eat foods with gluten, the biggest epidemiological study ever performed (the China Study) did find that wheat intake was independently correlated with overall mortality (http://rawfoodsos.com/the-china-study/). They never published this finding themselves, but the correlation is clearly there in the data.
There are a lot of questions about their methodology: they didn’t keep or report data on individuals, but lumped whole communities together as single data points. There are likely a lot of highly correlated regional habits that weren’t on the questionnaire, and I tend to find the whole study pretty questionable. For the most part, it’s just comparing the health of rural farmers with that of wealthier urban Chinese. The two groups have radically different health, lifestyles, and diets, and we can only control for the few questions they actually asked.
Perhaps now that gluten avoidance seems to be becoming a “fad diet” in western countries, suddenly it will be possible to actually collect good data on this.
That looks like it could prove really useful / interesting; thanks for linking.
I guess the entry requirements for the beta are strict because they’re trying to keep to a small set of variables for people to check? It would have been really interesting to peek in on, though. Regarding the China Study, it sounds either like there was no effort to control for other obvious, statistically real correlates, or like there is no overlap at all between the groups from which to abstract a controlled comparison. A fraction of that data might be useful (all data is useful! …yum!). I think that with a sufficient (though perhaps improbably large) sample size, even user-submitted data with large amounts of noise becomes useful. Any empirical paradigm more open and faster than the current one is bound to be a good thing, even despite inaccuracy, for reasons of sheer brute force.
At least with user-submitted noisy data you have individual data points, and the potential to track individuals over time… unlike the China Study, where entire communities were just averaged into a single point.
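To show why the averaging matters, here is a small sketch with completely made-up numbers (nothing here comes from the China Study; the community names and values are hypothetical). Within each toy community, the more of food x a person eats, the lower their score y, yet once each community is collapsed to its average the relationship appears to run the other way.

```python
from math import sqrt


def pearson(xs, ys):
    """Plain Pearson correlation coefficient for two equal-length sequences."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical (x, y) pairs for individuals in three invented communities.
communities = {
    "A": [(1, 3), (2, 2), (3, 1)],
    "B": [(4, 6), (5, 5), (6, 4)],
    "C": [(7, 9), (8, 8), (9, 7)],
}

# Correlation among individuals inside each community.
within = {
    name: pearson([x for x, _ in pts], [y for _, y in pts])
    for name, pts in communities.items()
}

# Correlation after averaging each community into a single data point.
mean_x = [sum(x for x, _ in pts) / len(pts) for pts in communities.values()]
mean_y = [sum(y for _, y in pts) / len(pts) for pts in communities.values()]
averaged = pearson(mean_x, mean_y)

print("within each community:", within)
print("community averages:", averaged)
```

In this toy data every within-community correlation is negative while the correlation across community averages is positive. Real data will rarely be this clean, but it illustrates how community-level points can point in a direction that no individual in the data actually follows.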
There’s some usable information in the China Study, but not as much as people think… it’s being touted as “proof” that all animal-based foods cause cancer (in a popular diet book by the primary investigator Dr. Campbell) because the two were well correlated in the data, when it’s nothing of the sort.