Technically, the probability assigned to a hypothesis over time should be a martingale (i.e. have expected change zero); this is just a restatement of the conservation of expected evidence/law of total expectation.
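To make "expected change zero" concrete, here's a tiny worked example (the specific coin biases and prior are my own toy choices): averaging the Bayesian posterior over the possible observations recovers the prior exactly.

```python
# Conservation of expected evidence: the prior equals the posterior
# averaged over possible observations.
# Toy setup: hypothesis H says a coin lands heads with probability 0.8;
# not-H says 0.2. Prior P(H) = 0.5.

prior = 0.5
p_heads_given_h, p_heads_given_not_h = 0.8, 0.2

# Marginal probability of observing heads.
p_heads = prior * p_heads_given_h + (1 - prior) * p_heads_given_not_h

post_heads = prior * p_heads_given_h / p_heads              # Bayes on "heads"
post_tails = prior * (1 - p_heads_given_h) / (1 - p_heads)  # Bayes on "tails"

expected_posterior = p_heads * post_heads + (1 - p_heads) * post_tails
print(expected_posterior)  # 0.5, matching the prior
```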
The random walk model that Thomas proposes is a simple model that illustrates a more general fact. For a martingale (S_n)_{n ∈ Z^+}, the variance of S_t is equal to the sum of the variances of the individual timestep changes X_i := S_i − S_{i−1} (setting S_0 := 0): Var(S_t) = ∑_{i=1}^{t} Var(X_i). Under this frame, insofar as small updates contribute a large share of the variance of each update X_i, they must also contribute a large share of the variance of the credence S_t (which in turn means you need a lot of them in expectation[1]).
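As a sanity check on the variance decomposition, here's a small Monte Carlo sketch; the particular mixture of small and large update sizes is my own toy choice, not something from the post.

```python
import random

# Monte Carlo check of Var(S_t) = sum_i Var(X_i) for a martingale.
# Toy update distribution: each step is a small +-0.01 update with
# probability 0.9, or a large +-0.1 update with probability 0.1;
# both are symmetric, so E[X_i] = 0 and S_t is a martingale.

random.seed(0)

def step():
    size = 0.01 if random.random() < 0.9 else 0.1
    return size if random.random() < 0.5 else -size

t, n_paths = 100, 50_000
per_step_var = 0.9 * 0.01**2 + 0.1 * 0.1**2  # Var(X_i) = 1.09e-3

finals = [sum(step() for _ in range(t)) for _ in range(n_paths)]
mean = sum(finals) / n_paths
var_s_t = sum((x - mean) ** 2 for x in finals) / n_paths

# The empirical variance of S_t should be close to t * Var(X_i).
print(var_s_t, t * per_step_var)
```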
Note that this does not require any strong assumption beyond the distribution of likely updates being such that small updates contribute substantially to the variance. If the structure of the problem you’re trying to address allows for enough small updates (relative to large ones) at each timestep, then it must allow for “enough” of these small updates over the whole sequence, in expectation.
While the specific +1/-1 random walk he picks is probably not what most realistic credences over time actually look like, playing around with it still helps give a sense of what exactly “conservation of expected evidence” might look/feel like. (In fact, in Swimmer’s medical dath ilan glowfics, the people of dath ilan do use a binary random walk to illustrate how calibrated beliefs typically evolve over time.)
Now, as to whether it’s reasonable to model beliefs as Brownian motion (in the standard mathematical sense, not the colloquial sense): if you suppose that there are many, many tiny independent additive updates to your credence in a hypothesis, your credence over time “should” look like Brownian motion at a large enough scale (again in the standard mathematical sense), for similar reasons to why the sum of a bunch of independent random variables converges to a Gaussian. This doesn’t imply that your beliefs in practice should always look like Brownian motion, any more than the CLT implies that real-world observables are always Gaussian. But again, the qualitative claim Thomas makes carries through.
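The "many tiny updates look Gaussian at a fixed time" intuition can be sketched numerically; the step sizes and counts below are my own toy choices. With n tiny ±1/√n steps, S_n has variance 1, so the fraction of paths with |S_n| ≤ 1 should be near Φ(1) − Φ(−1) ≈ 0.683 (a bit higher at modest n because of the walk's discreteness).

```python
import random

# Many tiny independent additive updates: after n steps of size
# +-1/sqrt(n), the endpoint S_n has variance 1 and is approximately
# Gaussian, so P(|S_n| <= 1) should be close to 0.683.

random.seed(0)
n_steps, n_paths = 1_000, 5_000
eps = n_steps ** -0.5

def final_value():
    return sum(eps if random.random() < 0.5 else -eps
               for _ in range(n_steps))

finals = [final_value() for _ in range(n_paths)]
frac_within_1sd = sum(abs(x) <= 1 for x in finals) / n_paths
print(frac_within_1sd)  # near 0.68-0.70
```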
I also keep the following analogy in my head: Bernoulli : Gaussian ≈ simple random walk : Brownian motion, which I found somewhat helpful. Things irl are rarely independent/time-invariant Bernoulli or Gaussian processes, but those are mathematically convenient to work with, and are often ‘good enough’ for deriving qualitative insights.
Note that you need to apply something like the optional stopping theorem to go from the case of S_T for fixed T to the case of S_τ, where τ is the (random) time you reach credence 0 or 1 and the updates stop.
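A quick simulation of the stopped case (the starting credence and step size here are my own toy choices): optional stopping gives E[S_τ] = S_0, and since S_τ ∈ {0, 1}, the probability of being absorbed at credence 1 equals the starting credence.

```python
import random

# A credence doing a fair +-0.01 random walk, stopped on hitting 0 or 1.
# By optional stopping, E[S_tau] = S_0, so with S_tau in {0, 1} the
# fraction of runs absorbed at 1 should match the starting credence.

random.seed(0)

def run(start=0.7, step=0.01):
    p = start
    while 0 < p < 1:
        p += step if random.random() < 0.5 else -step
        p = round(p, 2)  # keep the walk on the 0.01 grid (no float drift)
    return p

n = 2_000
frac_hit_one = sum(run() == 1 for _ in range(n)) / n
print(frac_hit_one)  # near the starting credence of 0.7
```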