Quantopian contest, but for food intake and weight

Twelve years ago, I lost 100 lbs. in a fairly boring manner by eating 1200 calories a day. Brutally unpleasant and catabolic, but mostly successful. It’s creeping back. There is no low-hanging fruit left to cut, such as sugar water, fast food, or eating out. The core problem is that my satiety point is ~1000 calories above my RMR, and my body is not fooled by “high satiety” foods.

[Graph: weight from 1999 to 2024. An initial loss from 220 to 175 lbs. was quickly regained, and weight held at 270 for a decade. A rapid loss of 100 lbs. followed, half of which crept back over the next decade.]

Since then, I’ve recorded every bite eaten every day, whether 10 calories of broth on a weekend fast or a 4000 calorie binge of my favorite deep dish “lucent” pizza, using a food scale for everything, not once thoughtlessly pouring oil into a recipe or blindly applying peanut butter to a sandwich. This cultivated superhuman calorie estimation abilities. “You say this is 280 kcal? It tastes like 350. Let me see the recipe… I see, the food label for ghee is off by half.” My goal in recording calories is to predict true fat loss by subtracting intake from RMR so daily scale variations aren’t discouraging. This is why I believe my records to be so much more accurate than others’—my incentives reward precision over the default of “undercount to stay under budget.” I weigh frequently, have impedance body composition data, periodic RMR data from an indirect calorimeter (which I now own), sleep-wake times, all gym sessions, and daily step counts.
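As a rough sketch of that bookkeeping (not my actual spreadsheet), the predicted fat loss is just the running sum of daily deficits converted to pounds. The 3500 kcal per pound figure is the common rule of thumb and is an assumption here, as are the function and variable names.

```python
# Assumption: ~3500 kcal per pound of body fat (a rule of thumb, not a measurement).
KCAL_PER_LB_FAT = 3500.0


def predicted_fat_loss_lbs(rmr_kcal, intake_kcal):
    """Return the cumulative predicted fat loss (lbs) after each day.

    rmr_kcal and intake_kcal are equal-length lists of daily values.
    A positive result means predicted loss; negative means predicted gain.
    """
    cumulative = 0.0
    trend = []
    for rmr, eaten in zip(rmr_kcal, intake_kcal):
        cumulative += (rmr - eaten) / KCAL_PER_LB_FAT
        trend.append(cumulative)
    return trend


# A week at a 500 kcal daily deficit predicts about one pound of loss.
week = predicted_fat_loss_lbs([2500.0] * 7, [2000.0] * 7)
print(round(week[-1], 3))
```

Comparing this predicted line to the scale is what makes a noisy daily reading tolerable: the scale can jump two pounds overnight while the prediction moves a fraction of one.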

Until last year, I believed CICO was cause rather than effect. For whatever reason, I thought, I’m hungrier or have less willpower, and so I must suffer to achieve a healthy weight. My beliefs have since turned toward a meandering set point controlled by a yet unknown homeostat. Whether contamination, omega ratios, a gravitostat, or some combination, anecdotal analysis suggests the set point is a hidden variable, confounding simple analysis. Weight loss when above it is trivial, and below it, fiendishly difficult. This is why I believe a control system is the most likely candidate for a successful model. Anything simpler would be a low-hanging fruit of simple correlation long ago ferreted out by nutrition science. It’s unlikely I’d be this size a century ago—almost no one was—so what is it I’m getting too much or not enough of?

Ideally, the massive longitudinal dataset contains dozens or hundreds of overlapping micro experiments (months of keto, weekend fasts, low protein, high fiber, potatoes, waves of monotonous meals) that, taken together, exceed the statistical significance of weeks spent in a lab testing individual hypotheses. Very likely, the answer is already in the data: a few weeks, spread across years, where I gave up too early without realizing that the noise in daily weigh-ins masked over-unity loss. If the answer to obesity requires a complex overlap or sequence of conditions, it may be hidden within the data and first discovered through data mining rather than invented from whole cloth by a brilliant hypothesizer. You’d be hard pressed to find more or better data to mine.
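One simple way to see past daily weigh-in noise is an exponentially weighted moving average. This is a minimal sketch, not a claim about how the repository data is processed; the smoothing factor `alpha=0.1` and the function name are assumptions chosen for illustration.

```python
def smoothed_trend(weights, alpha=0.1):
    """Exponentially weighted moving average of daily weights.

    Each day the trend moves a fraction `alpha` of the way toward the
    new reading, so single-day spikes (water, salt) barely register.
    """
    trend = []
    current = None
    for w in weights:
        if current is None:
            current = float(w)  # seed the trend with the first reading
        else:
            current += alpha * (w - current)
        trend.append(current)
    return trend


# Hypothetical readings: a one-day 3 lb spike moves the trend only 0.3 lbs.
print(smoothed_trend([200.0, 200.0, 203.0, 200.0]))
```

A smaller `alpha` hides more noise but also reacts more slowly, which matters when an experiment only lasted a few weeks.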

I am 41, mostly retired, no prescriptions or real health problems, and no stressors or even major ups/downs. I have the budget and time to prepare food meeting any specifications, but cannot (out of squeamishness/disgust) consume animals (35 years) or eggs (~5 years). Dairy I’m still OK with, so don’t send any factory farm videos and ruin those. Not a big supplement fan unless it’s basic with a clear upside like magnesium or substituting for a missing (animal) nutrient like taurine.

Recalling the fun of Quantopian’s contest to write the best stock market prediction algorithm, I’d like to offer the same. Can your model, backtested on my 13 years of intake and weight data, predict what my weight would be given specific foods or macros? It won’t be a simple fit based on RMR. I’ve already built that; it drifts significantly over time, or weight loss would be a trivial exercise in using protein leverage to minimize total intake. It looks more like my set point moves up or down only when some conditions are met, and I’d like to find the control system at work. As an example, made of pure whimsy:

RMR_rolling = 2500
RMR_today = RMR_rolling
if (trailing_average(omega-6:3, 30 days) > 10:1) {
	if (yesterday_protein > 50g)
		RMR_today = RMR_rolling - 500
} else
	RMR_today += min(max(intake - RMR_rolling, 0), 500)

if (steps_today * weight_current > 1,000,000 ft-lbs)
	set_point -= 0.1 lbs

If it were that simple, maybe I could build or outsource it, but I’m looking for a statistician with the interest and creativity to devise control systems I can’t imagine.
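For anyone who wants to backtest a rule of this shape, the whimsical example above could be written as runnable Python along these lines. The `Day` record and its field names are hypothetical and do not reflect the repository's actual columns; the thresholds are copied from the example, and the "ft-lbs" check is kept as the plain steps × weight product it was written as.

```python
from dataclasses import dataclass


@dataclass
class Day:
    """One day of inputs. Field names are illustrative assumptions."""
    omega_6_to_3_30d: float     # trailing 30-day omega-6:omega-3 ratio
    yesterday_protein_g: float  # grams of protein eaten the previous day
    intake_kcal: float          # calories eaten today
    steps: int                  # step count today
    weight_lbs: float           # current body weight


def whimsical_step(day, set_point_lbs, rmr_rolling=2500.0):
    """Apply the toy rule for one day; return (rmr_today, new_set_point)."""
    rmr_today = rmr_rolling
    if day.omega_6_to_3_30d > 10.0:
        # High omega-6 regime: protein above 50 g drops RMR by 500 kcal.
        if day.yesterday_protein_g > 50.0:
            rmr_today = rmr_rolling - 500.0
    else:
        # Otherwise RMR rises with overeating, capped at +500 kcal.
        rmr_today += min(max(day.intake_kcal - rmr_rolling, 0.0), 500.0)

    # Enough steps at enough body weight nudges the set point down.
    if day.steps * day.weight_lbs > 1_000_000:
        set_point_lbs -= 0.1
    return rmr_today, set_point_lbs


today = Day(omega_6_to_3_30d=12.0, yesterday_protein_g=60.0,
            intake_kcal=2600.0, steps=5000, weight_lbs=220.0)
print(whimsical_step(today, set_point_lbs=220.0))
```

Keeping the set point as explicit state passed from one day to the next is what makes this a control system rather than a regression: the output of each day feeds the next.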

As for the bounty, a control system description and code to test it that doesn’t statistically overfit is worth at least $5,000. Diagnosing a subset of foods to exclude or increase, an action to take and its precise value (a 3 mile walk shifts the set point down 4 ounces), or a sustainable, non-miserable insight (only eat potatoes? tried thrice, data is in there) is worth $10,000. If I can put it into practice successfully for 6 months without suffering (highly subjective), and it moves my set point down >30 lbs., that’s worth a minimum of $20,000. If in backtesting it can predict today’s intake from current weight, set point, recent macros, etc., that’s phenomenal.
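One plausible way to check the "doesn't statistically overfit" condition is a chronological holdout: fit on the early years, then score on later years the model never saw. The sketch below is only an illustration of that idea. The row format (a dict with a `"weight"` key), the `fit` and `predict` callables, and the trivial mean-weight model are all hypothetical stand-ins.

```python
import math


def rmse(predicted, actual):
    """Root mean squared error between two equal-length sequences."""
    total = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    return math.sqrt(total / len(actual))


def holdout_report(fit, predict, rows, train_fraction=0.5):
    """Fit on the earliest rows, then score on both halves.

    Rows must be in date order. The split is chronological rather than
    random so the model never sees the future it is asked to predict.
    """
    cut = int(len(rows) * train_fraction)
    train, test = rows[:cut], rows[cut:]
    params = fit(train)
    return {
        "train_rmse": rmse([predict(params, r) for r in train],
                           [r["weight"] for r in train]),
        "test_rmse": rmse([predict(params, r) for r in test],
                          [r["weight"] for r in test]),
    }


# Trivial stand-in model: always predict the mean training weight.
def fit_mean(train):
    return sum(r["weight"] for r in train) / len(train)


def predict_mean(params, row):
    return params


# Made-up rows purely to exercise the function.
rows = [{"weight": 100.0}] * 5 + [{"weight": 110.0}] * 5
print(holdout_report(fit_mean, predict_mean, rows))
```

A test error far above the training error is the usual sign of overfitting. A model with many tunable thresholds, like the whimsical one above, should be held to this kind of check before any conclusion is drawn from it.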

My fantasy? A model that asks: were you wearing a weighted vest on your summer 2020 walks, because the gravitostat underpredicted loss. Or, were you on vacation in a high altitude area this month? These bounties are minimums for mediocre, technically correct conclusions. For genuinely life-changing insight that allows me to live ad libitum, I’d go into six figures. I could just get a ’tide drug, but I’d first like to try for a victory on my factory settings that permanently pushes my set point down.

I am technically competent and will help in any way I can. If this kind of data analysis is your wheelhouse and such games are beneath you, I’ll simply meet your hourly rate. Same if you wish to work on a subset of the problem, like massaging the data into a database or extracting PUFA ratios.

Repository of intake, weight, composition, steps, RMR, sleep, workouts