Alright, I’m terrible at abstract thinking, so I went through the post and came up with a concrete example. Does this seem about right?
Suppose we have multiple distributions P_1, …, P_k over the same random variables X_1, …, X_n. (Speaking somewhat more precisely: the distributions are over the same set, and an element of that set is represented by values (x_1, …, x_n).)
We are a quantitative trading firm. Our investment strategy is such that we care about the prices of the stocks in the S&P 500 at market close today (X_1, …, X_n).
We have a bunch of models of the stock market (P_1, …, P_k), where we can feed in a set of possible prices of the stocks in the S&P 500 at market close, and the model spits out the probability of seeing that exact combination of prices (where a single combination of prices is (x_1, …, x_n)).
We take a mixture of the distributions: P[X] := ∑_j α_j P_j[X], where ∑_j α_j = 1 and the α_j are nonnegative.
We believe that some of our models are better than others, so our trading strategy is to take a weighted average of the predictions of each model, where the weight assigned to the j-th model P_j is α_j, and obviously the weights have to sum to 1 for this to be an “average”.
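As a sanity check on my own understanding, here is a minimal sketch of that mixture in Python. The two stand-in models, the weights, and the price tuple are all made up for illustration; this is not anything we would actually run.

```python
# Toy sketch of the mixture P[X] := sum_j alpha_j * P_j[X].

def mixture_probability(models, weights, x):
    """Weighted average of each model's probability for the same outcome x."""
    assert all(w >= 0 for w in weights), "weights must be nonnegative"
    assert abs(sum(weights) - 1.0) < 1e-9, "weights must sum to 1"
    return sum(w * p(x) for w, p in zip(weights, models))

# Two stand-in "models": each maps a tuple of closing prices to a probability.
model_a = lambda x: 0.20 if x == (189.0, 415.0) else 0.01
model_b = lambda x: 0.10 if x == (189.0, 415.0) else 0.02

p = mixture_probability([model_a, model_b], [0.7, 0.3], (189.0, 415.0))
print(p)  # 0.7 * 0.20 + 0.3 * 0.10 ≈ 0.17
```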
Mathematically: the natural latent over P[X] is defined by (x, λ) ↦ P[Λ = λ | X = x], and naturality means that the distribution (x, λ) ↦ P[Λ = λ | X = x] P[X = x] satisfies the naturality conditions (mediation and redundancy).
We believe that there is some underlying factor, which we will call “market factors” (Λ), with two properties. First, if you control for “market factors”, then learning the price of, say, AAPL no longer tells you (approximately) anything about the price of MSFT. Second, if you order the stocks in the S&P 500 alphabetically, take the odd-indexed stocks in that list (i.e. A, AAPL, ABNB, …) and call them the S&P250odd, and call the even-indexed ones (i.e. AAL, ABBV, ABT, …) the S&P250even, then you will come to (approximately) the same estimate of “market factors” by looking at either the S&P250odd or the S&P250even. Further, this means that if you estimate “market factors” by looking at the S&P250odd, then your estimate of the price of AAL will be approximately unchanged if you learn the price of ABT.
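To convince myself that the redundancy condition can actually hold, here is a toy simulation under an assumed one-factor model. The factor structure, noise scale, and number of stocks are all invented, not fit to real S&P 500 data.

```python
import random
import statistics

random.seed(0)

# Assumed toy market: each closing price is a shared "market factor" plus
# small idiosyncratic noise. Nothing here comes from real market data.
n = 500
market_factor = random.gauss(0.0, 1.0)
prices = [market_factor + random.gauss(0.0, 0.1) for _ in range(n)]

# Estimate the factor twice: once from the odd-indexed stocks ("S&P250odd"),
# once from the even-indexed stocks ("S&P250even").
est_odd = statistics.mean(prices[0::2])
est_even = statistics.mean(prices[1::2])

# Redundancy: either half of the stocks recovers (approximately) the same
# estimate of the latent.
print(abs(est_odd - est_even))  # small relative to the noise scale
```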
Then our theorem says: if an approximate natural latent exists over P[X], and that latent is robustly natural under changing the mixture weights α, then the same latent is approximately natural over Pj[X] for all j.
Anyway, if we find that the above holds for the weighted sum we use in practice, and we also find that it robustly [1] holds when we change the weights, that actually means that all of our market price models take “market factors” into account.
Alternatively stated, it means that if one of the models was written by an intern who procrastinated until the end of his internship and then on the last morning wrote def predict_price(ticker): return numpy.random.lognormal(), then our weighted sum is not robust to changes in the weights.
Is this a reasonable interpretation? If so, I’m pretty interested to see where you go with this.
[1] Terms and conditions apply. This information is not intended as, and shall not be understood or construed as, financial advice.
Nailed it, well done.
One point of confusion I still have: relative to whose predictive capabilities does a natural latent screen off information?
Let’s say one of the models in the ensemble, “YTDA”, knows the beginning-of-year price of each stock and uses “average year-to-date market appreciation” as its latent. Learning the average year-to-date appreciation of the S&P250odd will then tell it approximately everything about that latent, and learning the year-to-date appreciation of ABT will give it almost no information it knows how to use about the year-to-date appreciation of AMGN.
So relative to the predictive capabilities of the YTDA model, I think it is true that “average year-to-date market appreciation” is a natural latent.
However, another model in the ensemble, “YTDAPS”, might use “per-sector average year-to-date market appreciation” as its latent. Since both the S&P250even and the S&P250odd contain plenty of stocks in each sector, it is again the case that once you know YTDAPS’s latent (estimated from the S&P250odd), learning the price of ABT will not tell the YTDAPS model anything about the price of AMGN.
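A toy version of the YTDAPS latent, with an invented sector assignment and made-up appreciation numbers, just to illustrate that a per-sector average can also satisfy the redundancy property:

```python
import random
from collections import defaultdict

random.seed(1)

# Assume 500 made-up stocks spread over 10 sectors; each stock's year-to-date
# appreciation is its sector's mean plus idiosyncratic noise. All invented.
n, n_sectors = 500, 10
sectors = [random.randrange(n_sectors) for _ in range(n)]
sector_mean = {s: random.gauss(0.05, 0.10) for s in range(n_sectors)}
ytd = [sector_mean[sectors[i]] + random.gauss(0.0, 0.02) for i in range(n)]

def per_sector_avg(indices):
    """The YTDAPS-style latent: average appreciation within each sector."""
    totals, counts = defaultdict(float), defaultdict(int)
    for i in indices:
        totals[sectors[i]] += ytd[i]
        counts[sectors[i]] += 1
    return {s: totals[s] / counts[s] for s in totals}

odd = per_sector_avg(range(0, n, 2))   # "S&P250odd"
even = per_sector_avg(range(1, n, 2))  # "S&P250even"
gap = max(abs(odd[s] - even[s]) for s in range(n_sectors))
print(gap)  # small: both halves recover nearly the same per-sector latent
```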
But then if both of these are latents, does that mean that your theorem proves that any weighted sum of natural latents is also itself a natural latent?