I wonder what the connected red dots and the thin grey line are...
The thin line (supposed to be purple, not gray) is an exponentially weighted moving average. It’s what’s recommended in The Hacker’s Diet [http://dreev.es/hackdiet] as a way to keep from freaking out about the day-to-day fluctuations in your weight. As long as all your datapoints are below that line, you’re inexorably trending downward.
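(In code, that smoothing is about as simple as it sounds. Here’s a minimal sketch, assuming the standard Hacker’s Diet smoothing factor of 0.1; the exact parameters we use may differ.)

```python
def hacker_diet_trend(weights, smoothing=0.1):
    """Exponentially weighted moving average of daily weights.

    Each day the trend line moves a fixed fraction (0.1 by default)
    of the way from where it was toward that day's measurement.
    """
    trend = weights[0]          # start the trend at the first datapoint
    out = [trend]
    for w in weights[1:]:
        trend += smoothing * (w - trend)   # nudge the trend toward today's reading
        out.append(trend)
    return out

# e.g. hacker_diet_trend([80.0, 80.6, 79.9, 80.2]) gives a gently smoothed series
```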
The “rose-colored dots” are an attempt at a more normal-person-friendly version of that. It’s a transformation of your data to be as monotonic as possible such that the transformed datapoints (the rose-colored ones) are still within something like a standard deviation of the actual measured datapoint.
There’s also the blue-green aura around your datapoints which is a very thick polynomial regression on your data.
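(Roughly what a polynomial regression trend means, as a sketch; the degree here is an arbitrary choice for illustration, not necessarily what we actually use.)

```python
import numpy as np

def aura_curve(days, weights, degree=3):
    """Fit one polynomial through all the datapoints and evaluate it
    on each day; the 'aura' is this curve drawn very thickly.
    Degree 3 is just an illustrative choice."""
    coeffs = np.polyfit(days, weights, degree)
    return np.polyval(coeffs, days)
```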
All of these things are an attempt to show you your true trend.
(Also, they only apply to goals like weight loss where the measurements are noisy.)
Thank you very much...
One thing I don’t like about John Walker’s algorithm is that it gives too much weight to the very first data point, and to data before a ‘break’, so that if you only report your weight n times every N days (as I do, because I don’t have a scale in the flat where I’m staying so I only weigh myself when I go back home on weekends—I know that makes the whole thing a lot harder), the trend line will change about N/n times as slowly as it should. I prefer this algorithm:
(which is equivalent to Walker’s algorithm if you report your weight every day and have done so for a while).
Nice, thanks! Is that by chance equivalent to what this page is suggesting: http://stackoverflow.com/questions/1023860
It is equivalent to the answer by yairchu of Jun 21 ’09 at 15:53, as far as I can tell.
I just had another idea (loosely inspired by the Glicko rating system): suppose that a person on Day 0 has an unknown “true weight” W_0, but because of measurement errors, an unknown amount of body water, etc., the scale reads w_0, which is normally distributed with mean W_0 and variance σ^2; suppose also that if we knew W_0 we would assign W_1 (the “true weight” on Day 1) a probability distribution with mean W_0 and variance c^2(t_1 − t_0). Now, if our probability distribution for W_n is a Gaussian with mean u_n and variance σ_n^2, then on seeing the measured weight w_n we would update it to mean (u_n/σ_n^2 + w_n/σ^2)/(1/σ_n^2 + 1/σ^2) and variance 1/(1/σ_n^2 + 1/σ^2). Hence:
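(In Python it could look something like this, as a minimal sketch: sigma_sq = 1 and c_sq = 1/90 are just one choice of scale, matching the 1/90 ratio discussed further down, and starting from the first measurement amounts to a flat prior on W_0.)

```python
def smoothweight(times, weights, sigma_sq=1.0, c_sq=1.0 / 90):
    """Bayesian trend estimate under the model described above.

    times    -- days on which the weight was measured (may have gaps)
    weights  -- the corresponding scale readings
    sigma_sq -- measurement variance (scale noise, body water, ...)
    c_sq     -- variance by which the 'true weight' drifts per day

    Only the ratio c_sq / sigma_sq matters for the returned means.
    """
    # After the first measurement (flat prior on W_0), our belief about the
    # true weight is centered on w_0 with the measurement variance.
    u = weights[0]
    sigman_sq = sigma_sq
    trend = [u]
    for i in range(1, len(times)):
        # Prediction: the true weight may have drifted since the last measurement.
        sigman_sq += c_sq * (times[i] - times[i - 1])
        # Update: precision-weighted average of the prior mean and the new reading.
        u = (u / sigman_sq + weights[i] / sigma_sq) / (1 / sigman_sq + 1 / sigma_sq)
        sigman_sq = 1 / (1 / sigman_sq + 1 / sigma_sq)
        trend.append(u)
    return trend
```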
(Multiplying sigma_sq and c_sq by the same constant doesn’t affect the values of smoothweight, as far as I can tell.) In the limit of data on a large number of consecutive days, sigman_sq approaches 0.1 and the algorithm becomes equivalent to the other ones. I’ve tried this with my own gapped data and the trend line changes faster than with the Hacker’s Diet algorithm but not as fast as with my old algorithm. But I now prefer this one because its rationale resembles a derivation from first principles more than someone pulling stuff out of their ass.

(Is there a way of getting real subscripts and superscripts?)
Also, σ^2 and c^2 could in principle be found empirically: in this model, the difference of measured weights t days apart is normally distributed with variance (2σ^2 + tc^2). I found a file with a couple of years’ worth of my own almost-daily weight data from a few years ago and computed the average of (w_n − w_(n − t))^2 for various values of t, and for not-too-large values of t that’s actually approximately linear in t (except it is slightly lower at multiples of 7 days, which I take to be an effect of weekly cycles—I tend to eat more on weekends). But the ratio between the c^2 and the σ^2 I found was nowhere near 1/90; it was actually about 1/5, which suggests that the Hacker’s Diet smoothed average responds to changes in weight much more slowly than it should, even if the weight is reported daily.
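(A minimal sketch of that empirical check, assuming the data really is daily with no gaps; with gaps one would group the squared differences by actual day separation instead of array offset. The max_lag = 14 cutoff is an arbitrary stand-in for “not too large”.)

```python
import numpy as np

def estimate_variances(weights, max_lag=14):
    """Estimate sigma^2 and c^2 from (gapless) daily weight data.

    Under the model, E[(w_n - w_{n-t})^2] = 2*sigma^2 + t*c^2, so the mean
    squared difference at lag t should be linear in t: the slope is c^2
    and the intercept is 2*sigma^2.
    """
    w = np.asarray(weights, dtype=float)
    lags = np.arange(1, max_lag + 1)
    msd = np.array([np.mean((w[t:] - w[:-t]) ** 2) for t in lags])
    slope, intercept = np.polyfit(lags, msd, 1)   # least-squares line fit
    c_sq = slope
    sigma_sq = intercept / 2
    return sigma_sq, c_sq
```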
(Will anyone bother to find out the formula for the ideal Bayesian estimate of c^2 and σ^2 in this model, assuming uninformative priors?)