I always had fairly good mathematical thinking (I think) and loved learning about beautiful concepts in math, but I didn’t learn much at all in school (because I had the choice). You could say I was “utilitarian” about learning math: I didn’t do it if I didn’t see how it could enrich my life.
So my knowledge of math is quite disorganized. I know more about Bayes’ theorem than about many much simpler concepts (I know, it really shouldn’t be that way).
Now I want to be able to analyze data, but I don’t want to learn math that I won’t use for it, if possible.
So here’s my question: what basic stuff do I need to learn in order to calculate probabilities and statistics, do Bayesian math, and overall do things within data analysis that I may not yet be aware of?
If you also have suggestions for how to learn those things after I learn the basics, they will be much appreciated.
thank you :)
I personally found the Udacity course helpful but I see that someone has done a comparison of all the online data science courses they could find here. Hopefully one of those might be what you’re looking for.
Perhaps check out dataquest.io, which teaches the data scientist’s basic skillset.
This is really cool! I might go for it. Though looking at the subjects there, I understand that what I meant by “data analysis” was something much more basic than what it actually means :)
Check out John Hopcroft’s Foundations of Data Science
Thanks. But already in the introduction (2.1) I got lost; it’s beyond the mathematical basis I have now. What do I need to learn in order to learn that? Or even just probability and statistics, as a start. It seems I didn’t know what I was asking for when I said “data analysis”.
You want basic undergraduate probability and linear algebra, with some calculus on the side; you should get along fine with those. Some practice with reading academic texts also helps, so that you can extract useful meaning from them without understanding every part. You also need some general familiarity with how academic math papers are written. The concepts in 2.1 aren’t complex (in high-dimensional space, random points clump together less), but the way the book writes it is going to be unfamiliar if you haven’t been exposed to much academic math writing before.
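To make the point about 2.1 concrete, here is a minimal sketch of my own (not taken from the book). It assumes points drawn uniformly from a unit cube, and `distance_ratio` is just a name made up for this illustration. It compares the closest pair of points to the farthest pair: in low dimensions some points land almost on top of each other, while in high dimensions every pair ends up at roughly the same distance.

```python
import itertools
import math
import random


def distance_ratio(dim, n_points=50, seed=0):
    """Return (smallest pairwise distance) / (largest pairwise distance)
    for n_points drawn uniformly at random from the unit cube [0, 1]^dim."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    points = [[rng.random() for _ in range(dim)] for _ in range(n_points)]
    distances = [math.dist(p, q) for p, q in itertools.combinations(points, 2)]
    return min(distances) / max(distances)


if __name__ == "__main__":
    for dim in (2, 10, 100, 1000):
        # A ratio near 0 means some points sit very close together (clumps);
        # a ratio near 1 means all pairs are about equally far apart.
        print(dim, round(distance_ratio(dim), 3))
```

The ratio should climb toward 1 as the dimension grows, which is the "less clumping" effect in numbers.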
Not sure what’s a good place to get that other than “go to university, minor in math”. Khan Academy?
I think I was in your shoes last year. I *thought* I wanted to learn “data analysis”, took an online course, got in way over my head, and also realized that I probably didn’t really know what “data analysis” meant.
It sounds like, at the minimum, an intro to statistics course might be useful. I don’t think there’s much math in one; it’s more about ways of thinking, like what “probability” actually means, and that was really helpful for me as a foundation for learning other related stuff.
Yup, definitely the shoes I’m in. Glad to find out beforehand that I might not want to take them for a walk ;)
Though I actually would like to learn the math (and the math needed for it before that), not just the thought process. Do you have any suggestions? Or even just know the prerequisites?
I would say that logic is actually more important than math, though my knowledge of “data analysis” is very limited. Again, basic statistical knowledge and math are useful: things like what standard deviation, correlation, and regression are and how to calculate them.
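As a rough illustration of those three calculations, here is a small sketch using plain Python and the textbook formulas. The numbers are made up for the example, and `summarize` is a hypothetical helper name, not something from a course or library.

```python
import math


def summarize(xs, ys):
    """Return sample standard deviation of xs, Pearson correlation of
    xs and ys, and the least-squares line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squared deviations and cross-deviations from the means.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    std_x = math.sqrt(sxx / (n - 1))        # sample standard deviation
    correlation = sxy / math.sqrt(sxx * syy)  # Pearson's r, between -1 and 1
    slope = sxy / sxx                        # simple linear regression
    intercept = mean_y - slope * mean_x
    return std_x, correlation, slope, intercept


if __name__ == "__main__":
    hours_studied = [1, 2, 3, 4, 5]  # made-up data
    quiz_score = [2, 4, 5, 4, 5]     # made-up data
    std_x, r, slope, intercept = summarize(hours_studied, quiz_score)
    print(f"std={std_x:.3f} r={r:.3f} line: y = {slope:.2f}x + {intercept:.2f}")
```

With this toy data the correlation comes out around 0.77 and the fitted line is about y = 0.6x + 2.2. Working a case like this by hand first, then checking it in code, is a decent way to make the formulas stick.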
I’ve taken this class, and while it’s specific to Google Sheets, looking at the syllabus might give you some clues about what to study: https://courses.benlcollins.com/p/data-analysis-with-google-sheets
Also, non-math-related skills like how to clean and organize data are very important, though I never even thought about them until I started learning about data analysis. After all, garbage in, garbage out.
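For a sense of what "cleaning" can look like in practice, here is a toy sketch. The records and the `clean_rows` helper are invented for illustration, and the rules it applies (trim and normalize names, drop rows with missing or non-numeric values, keep only the first row per name) are assumptions; real data needs rules chosen for that data.

```python
import math


def clean_rows(rows):
    """Return (name, value) pairs with tidy names and numeric values."""
    cleaned = []
    seen = set()
    for name, raw_value in rows:
        name = name.strip().title()  # " alice " and "ALICE" both become "Alice"
        try:
            value = float(raw_value)
        except (TypeError, ValueError):
            continue  # missing or non-numeric value: skip the row
        if math.isnan(value):
            continue  # the text "nan" parses as a float but is not usable
        if name in seen:
            continue  # assumed rule: keep only the first row per name
        seen.add(name)
        cleaned.append((name, value))
    return cleaned


if __name__ == "__main__":
    raw = [(" alice ", "3.5"), ("Bob", ""), ("ALICE", "3.5"),
           ("carol", "n/a"), ("dave", " 7 ")]
    print(clean_rows(raw))
```

Every one of those rules is a judgment call (maybe the second "Alice" is a different person), which is exactly why this step takes more thought than it first seems.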
I began going through some basics on Khan Academy, and plan to then learn statistics and probability there.
I think I’ll hold off on learning data analysis at least until after that.
can you elaborate? :)
I kind’a sort’a thought learning data analysis would give me “magical powers” to glean insight from data... like I could just throw a bunch of data on a spreadsheet, run some formulas and functions, and voila... enlightenment. But there’s a LOT that goes into deciding things like what kind of data to use, what to exclude, *how* to process the data, how to *interpret* the data *and* the results, etc. The formulas and statistics are just a small part of the toolbox used in data analysis.
There’s a lot of planning, pre-planning, figuring out what you want to find out and how to get there from what you have...you have to use a lot of logic, critical thinking skills, things like that before you even start doing the math and statistics, and certainly *after* you do the math. Does that make sense?
Yup, thanks :)
So basically you’re saying that pretty much all rationality skills (logic, knowledge of biases and heuristics, etc.) are needed, or at least beneficial, for not making mistakes while handling the data, right?
Yep. I think the best lessons I’ve learned revolve around actually *trying* to second guess myself. I’d crunch some numbers, feeling confident that I did everything right, only to realize that my assumptions or logic or something *other* than the mechanics of number crunching was off or wrong.