If there’s literally a linear projection of the residual stream into two dimensions which directly produces that fractal, with no further processing/transformation in between “linear projection” and “fractal”, then I would change my mind about the fractal structure being mostly an artifact of the visualization method.
There is literally a linear projection (well, we allow a constant offset actually, so affine) of the residual stream into two dimensions which directly produces that fractal. There’s no distributions in the middle or anything. I suspect the offset is not necessary but I haven’t checked ::adding to to-do list::
edit: the offset isn’t necessary. There is literally a linear projection of the residual stream into 2D which directly produces the fractal.
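For anyone who wants to run the same linear-vs-affine check on their own activations, here is a minimal sketch of one way to do it: fit a least-squares map into 2D with and without a constant column and compare the errors. Everything in it is an assumption for illustration, not the code behind the result above: `acts` is synthetic stand-in data rather than real residual-stream activations, `targets` is synthetic 2D coordinates, and `fit_projection` is a hypothetical helper.

```python
import numpy as np

def fit_projection(acts, targets, use_offset):
    """Least-squares map from activations (n, d) to 2D targets (n, 2)."""
    # Appending a column of ones makes the fit affine; omitting it keeps it linear.
    X = np.hstack([acts, np.ones((len(acts), 1))]) if use_offset else acts
    W, *_ = np.linalg.lstsq(X, targets, rcond=None)
    mse = float(np.mean((X @ W - targets) ** 2))
    return W, mse

rng = np.random.default_rng(0)
acts = rng.normal(size=(500, 16))   # hypothetical stand-in for residual-stream activations
true_W = rng.normal(size=(16, 2))   # hypothetical ground-truth projection
targets = acts @ true_W             # synthetic 2D coordinates, exactly linear in acts

W_affine, mse_affine = fit_projection(acts, targets, use_offset=True)
W_linear, mse_linear = fit_projection(acts, targets, use_offset=False)
print(mse_affine, mse_linear)
```

If the purely linear fit reaches about the same error as the affine one, the offset is not doing any work. In this toy setup that holds by construction, since the targets were generated without an offset.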
But the “fractal-ness” is mostly an artifact of the MSP as a representation-method IIUC; the stochastic process itself is not especially “naturally fractal”.
(As I said I don’t know the details of the MSP very well; my intuition here is instead coming from some background knowledge of where fractals which look like those often come from, specifically chaos games.)
I’m not sure I’m following, but the MSP is naturally fractal (in this case), at least in my mind. The MSP is a stochastic process, but it’s a very particular one—it’s the stochastic process of how an optimal observer’s beliefs (about which state an HMM is in) change upon seeing emissions from that HMM. The set of optimal beliefs themselves are fractal in nature (for this particular case).
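To make "how an optimal observer's beliefs change upon seeing emissions" concrete, here is a rough sketch of the Bayesian update for a small HMM. The HMM is a made-up 3-state, 3-symbol example with random transition probabilities, not the process from the post, and the names `T` and `update_belief` plus the uniform prior are assumptions of this sketch. With a random dense HMM like this the visited beliefs need not look fractal; the point is only the mechanics of the update.

```python
import numpy as np

def update_belief(belief, T_x):
    """One Bayesian update: condition the state distribution on one emission."""
    unnorm = belief @ T_x            # weight of (observed emission, next state j)
    return unnorm / unnorm.sum()     # renormalize to a probability vector

rng = np.random.default_rng(0)
n_states, n_symbols = 3, 3
# T[x, i, j] = P(emit x and move to state j | current state i); hypothetical values
T = rng.random((n_symbols, n_states, n_states))
T /= T.sum(axis=(0, 2), keepdims=True)      # normalize per current state i

state = 0
belief = np.full(n_states, 1 / n_states)    # assumed uniform prior over states
beliefs = []
for _ in range(2000):
    flat = T[:, state, :].ravel()           # probabilities over (x, j) pairs
    idx = rng.choice(flat.size, p=flat)     # sample emission and next state jointly
    x, state = divmod(idx, n_states)
    belief = update_belief(belief, T[x])    # the observer only sees x, not the state
    beliefs.append(belief)
beliefs = np.array(beliefs)                 # points on the 2-simplex
```

Each row of `beliefs` is a point in the probability simplex over hidden states, and the set of points the observer visits is the object being discussed here.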
Chaos games look very cool, thanks for that pointer!
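For readers who haven't seen one, a chaos game repeatedly applies a randomly chosen map from a small fixed set and records where the point lands. Below is a minimal sketch of the standard textbook example, which traces out a Sierpinski triangle; the starting point and sample count are arbitrary choices. The resemblance to belief updating is loose but suggestive: there, too, each emission applies one of a small set of fixed maps to the current belief.

```python
import numpy as np

rng = np.random.default_rng(0)
# Corners of an equilateral triangle
vertices = np.array([[0.0, 0.0], [1.0, 0.0], [0.5, np.sqrt(3) / 2]])

point = np.array([0.25, 0.25])   # arbitrary start inside the triangle
points = np.empty((5000, 2))
for i in range(len(points)):
    v = vertices[rng.integers(3)]   # pick a corner uniformly at random
    point = (point + v) / 2         # jump halfway toward it
    points[i] = point
```

Scatter-plotting `points` (for example with matplotlib, if available) should show the familiar triangle-with-holes pattern.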
Responding in reverse order:
There is literally a linear projection (well, we allow a constant offset actually, so affine) of the residual stream into two dimensions which directly produces that fractal. There’s no distributions in the middle or anything. I suspect the offset is not necessary but I haven’t checked ::adding to to-do list::
edit: the offset isn’t necessary. There is literally a linear projection of the residual stream into 2D which directly produces the fractal.
I’m not sure I’m following, but the MSP is naturally fractal (in this case), at least in my mind. The MSP is a stochastic process, but it’s a very particular one—it’s the stochastic process of how an optimal observer’s beliefs (about which state an HMM is in) change upon seeing emissions from that HMM. The set of optimal beliefs themselves are fractal in nature (for this particular case).
Chaos games look very cool, thanks for that pointer!