There’s an easier study I’d like to do before the lithium experiment: compare water contamination to obesity rates. I have two decent but not amazing datasets for this (the water one tracks 18 contaminants, but not lithium, by zip code; the weight one tracks % obese, not BMI, by county) and a statistician who will analyze the results if I get the data into a single spreadsheet. I expect there are APIs that would make this fairly easy, but I haven’t had time to dig into it myself. If someone gets the 18 contaminants plus % obese into a single spreadsheet and fixes the county vs. zip code aggregation mismatch, I can get this analysis done fairly quickly. Bonus points if you find better DBs than I did (I wish it were BMI by zip code, not % obese by county) or incorporate additional data, like income, age, well water usage, and density.
I’ve discussed this with SMTM previously; they agree it’s worth doing, although they’re less excited than I am because the water quality database doesn’t include lithium. So additional bonus points if someone finds a database of lithium concentration in drinking water by county.
I’d happily pay $100 + credit on the eventual blog post for the spreadsheet combining the two DBs listed, and am open to negotiation on larger amounts if it’s more difficult than I think it is or additional features are included.
I emailed both sources, and County Health Rankings got back to me! They offer a spreadsheet download here.
I’ve copied the data to this Google Sheet (under tab “Ranked Measure Data”, column BN) for easier access. What remains before we can get it to the statistician:
Get access to the water database (they might charge for this? not super sure, I just pinged them again)
Line up counties to zip codes (I think this link should suffice)
Write a script to combine these into a single spreadsheet keyed by zip code (I could probably do this)
Happy for anyone else to jump in too!
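For anyone picking up the last two steps, here is a rough sketch of one way the join could work, in plain Python. It is only an illustration under assumptions: the column names (`zip`, `county_fips`, `pop`, `pct_obese`, `arsenic`, `nitrate`) are hypothetical, since I don't know the real headers in either dataset, and the rows are made-up toy values. It also goes in the opposite direction from the to-do list: because % obese is only known per county, it rolls the zip-level contaminants up to counties with a population-weighted mean, instead of copying one county's obesity figure onto every zip code inside it.

```python
import csv
import sys
from collections import defaultdict


def aggregate_zip_to_county(water_rows, crosswalk_rows, contaminants):
    """Population-weighted mean of each contaminant per county.

    water_rows: dicts with "zip" plus one key per contaminant (hypothetical names).
    crosswalk_rows: dicts with "zip", "county_fips", "pop", where "pop" is the
        number of people living in that zip/county overlap (hypothetical column).
    """
    water_by_zip = {row["zip"]: row for row in water_rows}
    sums = defaultdict(lambda: defaultdict(float))
    weights = defaultdict(lambda: defaultdict(float))
    for link in crosswalk_rows:
        water = water_by_zip.get(link["zip"])
        if water is None:
            continue  # no water data for this zip
        pop = float(link["pop"])
        for name in contaminants:
            value = water.get(name, "")
            if value == "":
                continue  # missing measurement: skip it, don't count it as zero
            sums[link["county_fips"]][name] += pop * float(value)
            weights[link["county_fips"]][name] += pop
    return {
        county: {
            name: sums[county][name] / weights[county][name]
            for name in contaminants
            if weights[county][name] > 0
        }
        for county in sums
    }


def join_obesity(county_water, obesity_rows, contaminants):
    """One output row per county that has both obesity and water data."""
    merged = []
    for row in obesity_rows:
        water = county_water.get(row["county_fips"])
        if water is None:
            continue
        out = {"county_fips": row["county_fips"],
               "pct_obese": float(row["pct_obese"])}
        for name in contaminants:
            out[name] = water.get(name)  # None if no zip reported it
        merged.append(out)
    return merged


# Toy rows, invented purely to exercise the code (not real measurements).
contaminants = ["arsenic", "nitrate"]
water = [
    {"zip": "00001", "arsenic": "2.0", "nitrate": "1.0"},
    {"zip": "00002", "arsenic": "4.0", "nitrate": ""},
    {"zip": "00003", "arsenic": "10.0", "nitrate": "3.0"},
]
crosswalk = [
    {"zip": "00001", "county_fips": "99001", "pop": "100"},
    {"zip": "00002", "county_fips": "99001", "pop": "300"},
    {"zip": "00002", "county_fips": "99002", "pop": "100"},  # zip spans two counties
    {"zip": "00003", "county_fips": "99002", "pop": "100"},
]
obesity = [
    {"county_fips": "99001", "pct_obese": "30.0"},
    {"county_fips": "99002", "pct_obese": "25.0"},
]

county_water = aggregate_zip_to_county(water, crosswalk, contaminants)
merged = join_obesity(county_water, obesity, contaminants)

# Write the combined table as CSV (to stdout here; a file works the same way).
writer = csv.DictWriter(sys.stdout, fieldnames=["county_fips", "pct_obese"] + contaminants)
writer.writeheader()
writer.writerows(merged)
```

With real data, the three lists would come from `csv.DictReader` over the downloaded files. Zip codes and FIPS codes are kept as strings on purpose, so leading zeros survive the round trip.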
The Tap Water Database seems to be less forthcoming with their data. Their response: “We don’t share the back end of the database with anyone.… I’m happy to run your proposal by the science team. Just send me a few detailed sentences about your research, whom you’re affiliated with, and where you’re going to publish it.”
I have no credentials in this space (my background is in software dev); would anyone with a relevant background be willing to help compose a reply + lend their affiliation?
Hi Austin, I keep holding off responding until I have a new plan, but now I’m swamped so that’s going to take a bit. Thank you for trying!