Firstly, I do have a QS background and study bioinformatics.
Machine learning usually requires a lot of data. As a QS person you usually have data from only one person, so your data is limited.
If you throw a complex machine learning algorithm against the data, you are likely to overfit.
If you analyse your own data in QS, you know things about the data that aren’t in the numbers. You might know that your mood dropped on a particular day because you ended a relationship, but you don’t have a quantified variable that tracks the event of a relationship ending.
Hooray, people with credentials! Thank you for sharing your knowledge.
This was my most convincing reason to bother implementing the statistical guts myself in the first place. It was pretty easy to put together a little naive Bayes classifier that calculates the maximum likelihood estimate for all your other variables/predicates given the value of one variable/predicate intended to be minimized or maximized, and I’m pretty sure it works mostly correctly. I’m also pretty sure the additional return from using virtually any of the existing more sophisticated ML algorithms, even ones I’ve yet to hear of, won’t be nearly as high as the initial return from being able to answer the question “if this is the case, what other stuff was most likely the case?”. I’m starting to suspect that the next most useful modeling-related task may be to generate lots and lots of different compound predicates from all your raw variables in some way that doesn’t reek of unhelpful combinatorial explosion, then calculate the maximum likelihood value, and the probability of that value, for all of them using my existing dumb little classifier. That isn’t something I can recall seeing any work on. If it reminds you of something, I will desperately consume whatever resources or names of algorithms come to mind.
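To make the “if this is the case, what other stuff was most likely the case?” question concrete, here is a minimal sketch. It is not the classifier described above (that code isn’t shown in this thread). It only assumes one dict per day with categorical values, and it uses plain unsmoothed frequency counts as the maximum likelihood estimate. The function name, the variable names, and the sample days are all made up for illustration.

```python
from collections import Counter, defaultdict


def most_likely_given(records, target, target_value):
    """For every non-target variable, return (most frequent value, relative
    frequency) among the records where records[target] == target_value.

    The relative frequency is the unsmoothed maximum likelihood estimate of
    P(variable = value | target = target_value) for categorical data.
    """
    counts = defaultdict(Counter)
    n_matching = 0
    for row in records:
        if row.get(target) != target_value:
            continue
        n_matching += 1
        for name, value in row.items():
            if name != target:
                counts[name][value] += 1

    result = {}
    for name, counter in counts.items():
        value, count = counter.most_common(1)[0]
        result[name] = (value, count / n_matching)
    return result


# Hypothetical hand-entered days; the variables are illustrative only.
days = [
    {"mood": "good", "exercise": True, "caffeine": "low"},
    {"mood": "good", "exercise": True, "caffeine": "high"},
    {"mood": "good", "exercise": False, "caffeine": "low"},
    {"mood": "bad", "exercise": False, "caffeine": "high"},
    {"mood": "bad", "exercise": False, "caffeine": "high"},
]

summary = most_likely_given(days, "mood", "good")
for name, (value, prob) in summary.items():
    print(f"{name}: most likely {value!r} (estimated probability {prob:.2f})")
```

With this few days the estimates are very noisy, which is the limited-data problem raised earlier in the thread. Some form of smoothing would probably be the first thing to add.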
In the near term, very low-frequency events that have a very clear impact on other variables might be usefully grouped under the raw variable “significant and obvious other things that aren’t worthy of their own variable in my opinion”, and many of the more useful terminally-valued predicates could imaginably require that variable to take a certain value/range of values. Maybe “rejection by people currently in my social circle” with a holistic, multi-valued rating could be its own variable, if that happens unfortunately often for some people. I don’t expect to automate measuring something like this any time soon, but it is undoubtedly important to know about in figuring out optimal conditions for “normal days”, which makes me think the manual data entry part will be unavoidable if you want this to be particularly useful.
In the more distant future, I see no reason why such a thing couldn’t be learned into a variable that actually does an okay job of carving the important parts of your personal reality at the joints. If you have a software system that knows relationships are important to people, knows which of your relationships are important to you, knows who you were talking to, and knows the valence, arousal, duration, frequency, etc. of your interaction with that person over time, then, yes, something like “ended a relationship today” probably could be inferred. It doesn’t sound trivial but it sounds absolutely plausible, given sufficient effort.
It’s possible to track 1000 different variables with your model. If you do so, however, you will get a lot of false positives.
I think about QS data as giving you more than your five senses. In the end you still partly rely on your own ability to pattern-match. Graphs of data just give you additional input for understanding what’s going on that you can’t see or hear.
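A rough simulation of why this happens, as a sketch under stated assumptions: 60 days of data, 1000 tracked variables that are pure Gaussian noise, and a “mood” series that is also noise, so every apparent correlation is a false positive by construction. The cutoff of 0.254 is approximately the two-sided 5% significance threshold for a Pearson correlation with 60 samples. All names and numbers here are illustrative.

```python
import math
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def count_spurious(n_days=60, n_vars=1000, cutoff=0.254, seed=0):
    """Count noise variables whose |correlation| with a noise target exceeds
    the cutoff. Nothing here is truly related, so every hit is spurious."""
    rng = random.Random(seed)
    mood = [rng.gauss(0, 1) for _ in range(n_days)]
    hits = 0
    for _ in range(n_vars):
        noise = [rng.gauss(0, 1) for _ in range(n_days)]
        if abs(pearson(noise, mood)) > cutoff:
            hits += 1
    return hits


spurious = count_spurious()
print(f"{spurious} of 1000 unrelated variables look 'significant'")
```

At a 5% threshold you would expect on the order of 50 of the 1000 variables to clear the bar by chance alone, even though none of them has anything to do with the target.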
I plan on addressing false positives with a combination of sanity-checking/care-checking (“no, drinking tea probably doesn’t force me to sleep for exactly 6.5 hours the following night” or “so what if reading non-fiction makes me ravenous for spaghetti?”), and suggesting highest-information-content experimentation when neither of those applies (hopefully one would collect more data to test a hypothesis rather than immediately accept the program’s output in most cases). In this specific case, the raw conversation and bodily state data would probably not be nodes in the larger model; only the inferred “thing that really matters”, social life, would. Having constant feedback from the “expert”, who can choose which raw or derived variables to include in the model and which correlations don’t actually matter, seems to change the nature of the false positive problem.
Or, I mean, just use Facebook and other social media activity to identify the formation, strengthening, and slow or abrupt end of friendships and relationships. Many of us do basically already live in that world.