You should update by +-1% on AI doom surprisingly frequently
This is just a fact about how stochastic processes work. If your p(doom) is Brownian motion in 1% steps starting at 50% and stopping once it reaches 0 or 1, then there will be about 50^2=2500 steps of size 1%. This is a lot! If we get all the evidence for whether humanity survives or not uniformly over the next 10 years, then you should make a 1% update 4-5 times per week. In practice there won’t be as many due to heavy-tailedness in the distribution concentrating the updates in fewer events, and the fact you don’t start at 50%. But I do believe that evidence is coming in every week such that ideal market prices should move by 1% on maybe half of weeks, and it is not crazy for your probabilities to shift by 1% during many weeks if you think about it often enough. [Edit: I’m not claiming that you should try to make more 1% updates, just that if you’re calibrated and think about AI enough, your forecast graph will tend to have lots of >=1% week-to-week changes.]
The general version of this statement is something like: if your beliefs satisfy the law of total expectation, the variance of the whole process should equal the sum of the variances of all the increments involved in the process.[1] In the case of the random walk where at each step your beliefs go up or down by 1%, starting from 50% until you hit 100% or 0%, the variance of each increment is 0.01^2 = 0.0001 and the variance of the entire process is 0.5^2 = 0.25, hence you need 0.25/0.0001 = 2500 steps in expectation. If your beliefs have probability p of going up or down by 1% at each step, and 1-p of staying the same, the variance of each increment is reduced by a factor of p, and so you need 2500/p steps.
(Indeed, something like this is the standard way to derive the expected number of steps before a random walk hits an absorbing barrier.)
Similarly, you get that if you start at 20% or 80%, you need 1600 steps in expectation, and if you start at 1% or 99%, you’ll need 99 steps in expectation.
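(For anyone who wants to check those numbers, here is a minimal Monte Carlo sketch; the exact answer from the variance argument is p0*(1-p0)/0.01^2, and the helper name below is just mine.)

```python
import random

def simulated_steps(p0: float, step: float = 0.01, trials: int = 2_000) -> float:
    """Monte Carlo estimate of how many +-`step` moves a fair random walk
    starting at credence `p0` makes before being absorbed at 0 or 1."""
    start = round(p0 / step)    # work in integer units of `step`
    top = round(1.0 / step)     # 100 for 1% steps
    total = 0
    for _ in range(trials):
        pos, n = start, 0
        while 0 < pos < top:
            pos += 1 if random.random() < 0.5 else -1
            n += 1
        total += n
    return total / trials

for p0 in (0.5, 0.2, 0.01):
    exact = p0 * (1 - p0) / 0.01 ** 2   # the answer the variance argument gives
    print(f"start {p0:.2f}: simulated ~{simulated_steps(p0):.0f} steps, exact {exact:.0f}")
# Prints roughly 2500, 1600, and 99 steps respectively.
```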
One problem with your reasoning above is that, as the 1%/99% case shows, needing 99 steps in expectation does not mean you will take 99 steps with high probability: in this case there's a 50% chance you need only one update before you're certain (!), and the expectation is driven by a tail of very long sequences. In general, the expected value of a variable need not look like its typical value.
I also think you're underrating how much the math changes when your beliefs do not come in the form of uniform updates. In the most extreme case, suppose your current 50% doom number comes from imagining that doom is uniformly distributed over the next 10 years, and zero after: then the median update size per week is only about 0.5/520 ~= 0.096%/week, and the expected number of weeks with a >1% update is 0.5 (it only happens when you observe doom). Even if we buy a time-invariant random walk model of belief updating, as the expected size of your updates gets larger, you also expect there to be quadratically fewer of them: e.g. if your updates came in increments of size 0.1 instead of 0.01, you'd expect only 25 such updates!
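Concretely, here is the week-by-week Bayesian update along the no-doom path under those same stylized assumptions (50% total doom probability, uniform over 520 weeks, zero afterwards); the median comes out slightly under the 0.5/520 back-of-envelope, at the same order of magnitude:

```python
WEEKS = 520      # ten years
P_DOOM = 0.5     # total doom probability, spread uniformly over the 520 weeks

def p_doom_after(k: int) -> float:
    """P(doom | survived the first k weeks) under the uniform-doom model."""
    remaining = P_DOOM * (WEEKS - k) / WEEKS    # doom mass still in the future
    survived = 1.0 - P_DOOM * k / WEEKS         # prior probability of surviving k weeks
    return remaining / survived

updates = [p_doom_after(k - 1) - p_doom_after(k) for k in range(1, WEEKS + 1)]
updates.sort()

print(f"median weekly update:   {updates[WEEKS // 2]:.3%}")   # ~0.09%/week
print(f"largest no-doom update: {updates[-1]:.3%}")           # ~0.19%, in the final week
```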
Applying stochastic-process-style reasoning to beliefs is empirically very tricky, and results can vary a lot based on seemingly reasonable assumptions. E.g. I remember Taleb making a bunch of mathematically sophisticated arguments[2] that began with “Let your beliefs take the form of a Wiener process[3]” and ended with an absurd conclusion, such as that 538's forecasts are obviously wrong because their updates aren't Gaussian distributed or aren't around 50% until immediately before the election date. And famously, reasoning of this kind has often been an absolutely terrible idea in financial markets. So I'm pretty skeptical of claims of this kind in general.
There are some regularity conditions here, but calibrated beliefs about things whose truth or falsity you eventually learn should satisfy these by default.
Often in an attempt to Euler people who do forecasting work but aren’t super mathematical, like Philip Tetlock.
This is what happens when you take the limit of the discrete-time random walk as you allow for updates on ever smaller time increments. You get Gaussian-distributed increments per unit time, $W_{t+u} - W_t \sim N(0, u)$, and since the tail of your updates is very thin, you continue to get qualitatively similar results to your discrete-time random walk model above.
And yes, it is ironic that Taleb, who correctly points out the folly of normality assumptions repeatedly, often defaults to making normality assumptions in his own work.
I talked about this with Lawrence, and we both agree on the following:
There are mathematical models under which you should update >=1% in most weeks, and models under which you don’t.
Brownian motion gives you 1% updates in most weeks. In many variants, like stationary processes with skew, stationary processes with moderately heavy tails, or Brownian motion interspersed with big 10%-update events that constitute <50% of your variance, you still have many weeks with 1% updates (see the simulation sketched after this list). Lawrence's model, where you get no evidence until either AI takeover happens or 10 years pass, does not give you 1% updates in most weeks, but that model almost never describes the situation of a sufficiently smart agent.
Superforecasters empirically make lots of little updates, and rounding off their probabilities to larger, infrequent updates makes their forecasts on near-term questions worse.
Thomas thinks that AI is the kind of thing where you can make lots of reasonable small updates frequently. Lawrence is unsure if this is the state that most people should be in, but it seems plausibly true for some people who learn a lot of new things about AI in the average week (especially if you’re very good at forecasting).
In practice, humans often update in larger discrete chunks. Part of this is because they only occasionally sit down and consciously process new information to generate a new number, and part of this is because humans have emotional fluctuations which we don't include in our reported p(doom).
Making 1% updates in most weeks is not always just irrational emotional fluctuations; it is consistent with how a rational agent would behave under reasonable assumptions. However, we do not recommend that people consciously try to make 1% updates every week, because fixating on individual news articles is not the right way to think about forecasting questions, and it is empirically better to just think about the problem directly rather than obsessing about how many updates you’re making.
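As a concrete illustration of the claim above that Brownian motion interspersed with big 10%-update events (carrying under half of the variance) still gives many 1% weeks, here is a toy sketch of weekly increment sizes with made-up parameters (it models the sizes of moves, not a full credence path bounded in [0, 1]):

```python
import random

random.seed(0)

N_WEEKS = 520        # ten years of weekly updates
SMALL_SD = 0.016     # typical "ordinary news" week (Gaussian part)
JUMP_PROB = 0.02     # rare big events, ~10 expected over the decade...
JUMP_SD = 0.10       # ...each moving the credence by around 10%

moves = []
for _ in range(N_WEEKS):
    m = random.gauss(0.0, SMALL_SD)
    if random.random() < JUMP_PROB:
        m += random.gauss(0.0, JUMP_SD)
    moves.append(m)

var_small = SMALL_SD ** 2
var_jump = JUMP_PROB * JUMP_SD ** 2
jump_share = var_jump / (var_small + var_jump)   # analytically ~44%, i.e. <50%
total_var = N_WEEKS * (var_small + var_jump)     # ~0.24, near the 0.25 of a question starting at 50%
frac_big_weeks = sum(abs(m) >= 0.01 for m in moves) / N_WEEKS

print(f"share of variance from big jump events: {jump_share:.0%}")
print(f"implied total variance over the decade: {total_var:.2f}")
print(f"fraction of weeks with a >=1% move: {frac_big_weeks:.0%}")  # typically a bit over half
```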
Thank you a lot for this. I think this or @Thomas Kwa's comment would make an excellent original-sequences-style post; it doesn't need to be long, but just going through an example and talking about the assumptions would be really valuable for applied rationality.
After all, it's about how much one should expect one's beliefs to vary, which is pretty important.
But… Why would p(doom) move like Brownian motion until stopping at 0 or 1?
I don't disagree with your conclusions: there's a lot of evidence coming in, and if you're spending full time or even part time thinking about alignment, you'll make a lot of important updates on that evidence. But assuming a random walk seems wrong.
Is there a reason that a complex, structured unfolding of reality would look like a random walk?
Because[1] for a Bayesian reasoner, there is conservation of expected evidence.
Although I've seen it mentioned that, technically, the changes in a Bayesian's beliefs should follow a martingale, and Brownian motion is a martingale.
I’m not super technically strong on this particular part of the math. Intuitively it could be that in a bounded reasoner which can only evaluate programs in P, any pattern in its beliefs that can be described by an algorithm in P is detected and the predicted future belief from that pattern is incorporated into current beliefs. On the other hand, any pattern described by an algorithm in EXPTIME∖P can’t be in the class of hypotheses of the agent, including hypotheses about its own beliefs, so EXPTIME patterns persist.
Technically, the probability assigned to a hypothesis over time should be a martingale (i.e. have expected change zero); this is just a restatement of the conservation of expected evidence/law of total expectation.
The random walk model that Thomas proposes is a simple model that illustrates a more general fact. For a martingale $(S_n)_{n \in \mathbb{Z}^+}$, the variance of $S_t$ is equal to the sum of the variances of the individual timestep changes $X_i := S_i - S_{i-1}$ (setting $S_0 := 0$): $\operatorname{Var}(S_t) = \sum_{i=1}^{t} \operatorname{Var}(X_i)$. Under this frame, insofar as small updates contribute a large share of the variance of each increment $X_i$, the small updates must also contribute a large share of the total variance of your credence (which in turn means you need to have a lot of them in expectation[1]).
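For completeness, here is the short derivation behind that decomposition; it is a standard fact about martingale increments, not anything specific to belief updating:

```latex
% The cross terms vanish because martingale increments are uncorrelated.
\begin{align*}
\operatorname{Var}(S_t)
  &= \operatorname{Var}\!\Big(\sum_{i=1}^{t} X_i\Big)
   = \sum_{i=1}^{t} \operatorname{Var}(X_i) + 2\sum_{i<j} \operatorname{Cov}(X_i, X_j), \\
\operatorname{Cov}(X_i, X_j)
  &= \mathbb{E}[X_i X_j]
   = \mathbb{E}\!\big[X_i \, \mathbb{E}[X_j \mid \mathcal{F}_i]\big]
   = 0 \quad \text{for } i < j .
\end{align*}
```

Here $\mathbb{E}[X_j \mid \mathcal{F}_i] = \mathbb{E}\big[\mathbb{E}[X_j \mid \mathcal{F}_{j-1}] \mid \mathcal{F}_i\big] = 0$ by the tower property and the martingale condition, and each $\mathbb{E}[X_i] = 0$, so the covariances reduce to $\mathbb{E}[X_i X_j]$ and vanish.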
Note that this does not require any strong assumption besides that the distribution of likely updates is such that the small updates contribute substantially to the variance. If the structure of the problem you're trying to address allows for enough small updates (relative to large ones) at each timestep, then it must allow for "enough" of these small updates in the sequence, in expectation.
While the specific +1%/-1% random walk he picks is probably not what most realistic credences over time actually look like, playing around with it still helps give a sense of what exactly "conservation of expected evidence" might look/feel like. (In fact, in Swimmer's medical dath ilan glowfics, people do use a binary random walk to illustrate how calibrated beliefs typically evolve over time.)
Now, in terms of whether it's reasonable to model beliefs as Brownian motion (in the standard mathematical sense, not in the colloquial sense): if you suppose that there are many, many tiny independent additive updates to your credence in a hypothesis, your credence over time "should" look like Brownian motion at a large enough scale (again in the standard mathematical sense), for similar reasons as to why the sum of a bunch of independent random variables converges to a Gaussian. This doesn't imply that your belief in practice should always look like Brownian motion, any more than the CLT implies that real-world observables are always Gaussian. But again, the qualitative claim Thomas makes carries through regardless.
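As a quick numerical illustration of that aggregation argument (a toy sketch with made-up shock sizes, not a model of real evidence): build each week's update out of many tiny independent mean-zero shocks drawn from a deliberately skewed distribution, and the weekly increments come out far closer to Gaussian than the shocks themselves.

```python
import random
import statistics

random.seed(0)

N_WEEKS = 2000
SHOCKS_PER_WEEK = 200    # many tiny independent pieces of evidence per week

def tiny_shock() -> float:
    # Deliberately skewed, non-Gaussian, mean-zero shock:
    # a rare small upward revision vs. frequent even smaller downward ones.
    return 0.0009 if random.random() < 0.1 else -0.0001

def skewness(xs) -> float:
    m = statistics.fmean(xs)
    s = statistics.pstdev(xs)
    return statistics.fmean([((x - m) / s) ** 3 for x in xs])

single = [tiny_shock() for _ in range(N_WEEKS * SHOCKS_PER_WEEK)]
weekly = [sum(tiny_shock() for _ in range(SHOCKS_PER_WEEK)) for _ in range(N_WEEKS)]

print(f"skewness of a single shock: {skewness(single):.2f}")   # ~2.7 (strongly skewed)
print(f"skewness of weekly updates: {skewness(weekly):.2f}")   # much closer to 0 (roughly Gaussian)
```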
I also make the following analogy in my head: Bernoulli:Gaussian ~= Simple Random Walk:Brownian Motion, which I found somewhat helpful. Things irl are rarely independent or time-invariant Bernoulli or Gaussian processes, but those are mathematically convenient to work with, and are often 'good enough' for deriving qualitative insights.
Note that you need to apply something like the optional stopping theorem to go from the case of $S_T$ for fixed $T$ to the case of $S_\tau$, where $\tau$ is the time you reach 0 or 1 credence and the updates stop.
I get conservation of expected evidence. But the distribution of belief changes is completely unconstrained.
Going from the class martingale to the subclass Brownian motion is arbitrary, and the choice of 1% update steps is another unjustified arbitrary choice.
I think asking about the likely possible evidence paths would improve our predictions.
You spelled it "conversation of expected evidence". I was hoping there was another term by that name :)
To be honest, I would’ve preferred if Thomas’s post started from empirical evidence (e.g. it sure seems like superforecasters and markets change a lot week on week) and then explained it in terms of the random walk/Brownian motion setup. I think the specific math details (a lot of which don’t affect the qualitative result of “you do lots and lots of little updates, if there exists lots of evidence that might update you a little”) are a distraction from the qualitative takeaway.
A fancier way of putting it is: the math of "your beliefs should satisfy conservation of expected evidence" is a description of how the beliefs of an efficient and calibrated agent should look, and examples like his suggest it's quite reasonable for such agents to do a lot of updating. But the example is not by itself necessarily a prescription for what your belief updating should feel like from the inside (as a human who is far from efficient or perfectly calibrated). I find the empirical questions of "does the math seem to apply in practice?" and "therefore, should you try to update more often?" (e.g., what do the best forecasters seem to do?) to be larger and more interesting than the "a priori, is this a 100% correct model?" question.
Oops, you’re correct about the typo and also about how this doesn’t restrict belief change to Brownian motion. Fixing the typo.
Thank you a lot! Strong upvoted.
I was wondering a while ago whether Bayesianism says anything about how much my probabilities are “allowed” to oscillate around—I was noticing that my probability of doom was often moving by 5% in the span of 1-3 weeks, though I guess this was mainly due to logical uncertainty and not empirical uncertainty.
Since there are ten 5% steps between 50% and 0 or 1, then over ~10 years I should expect to make these kinds of updates ~100 times (10^2), or 10 times a year, or a little less than once a month, right? So I'm currently updating "too much".
Interesting...
Wouldn’t I expect the evidence to come out in a few big chunks, e.g. OpenAI releasing a new product?
To some degree yes, but I expect lots of information to be spread out across time. For example: OpenAI releases GPT5 benchmark results. Then a couple weeks later they deploy it on ChatGPT and we can see how subjectively impressive it is out of the box, and whether it is obviously pursuing misaligned goals. Over the next few weeks people develop post-training enhancements like scaffolding, and we get a better sense of its true capabilities. Over the next few months, debate researchers study whether GPT4-judged GPT5 debates reliably produce truth, and control researchers study whether GPT4 can detect whether GPT5 is scheming. A year later an open-weights model of similar capability is released and the interp researchers check how understandable it is and whether SAEs still train.
I think this leans a lot on "get evidence uniformly over the next 10 years" and "Brownian motion in 1% steps". By conservation of expected evidence, I can't predict the mean direction of future evidence, but I can have a distribution over evidence paths whose updates are expected to add up to 0.
For long-term aggregate predictions of event-or-not (those which will be resolved at least a few years away, with many causal paths possible), the most likely updates are a steady reduction as the resolution date gets closer, AND random fairly large positive updates as we learn of things which make the event more likely.
I think all the assumptions that go into this model are quite questionable, but it’s still an interesting thought.
It definitely should not move by anything like a Brownian motion process. At the very least it should be bursty and updates should be expected to be very non-uniform in magnitude.
In practice, you should not consciously update very often since almost all updates will be of insignificant magnitude on near-irrelevant information. I expect that much of the credence weight turns on unknown unknowns, which can’t really be updated on at all until something turns them into (at least) known unknowns.
But sure, if you were a superintelligence with practically unbounded rationality then you might in principle update very frequently.
The Brownian motion assumption is rather strong but not required for the conclusion. Consider the stock market, which famously has heavy-tailed, bursty returns. It happens all the time for the S&P 500 to move 1% in a week, but a 10% move in a week only happens a couple of times per decade. I would guess (and we can check) that most weeks have >0.6x of the average per-week variance of the market, which causes the median weekly absolute return to be well over half of what it would be if the market were Brownian motion with the same long-term variance.
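For anyone who wants to actually run that check: assuming you have weekly S&P 500 closing prices in a CSV (the file name and column below are placeholders, not a standard dataset), the two quantities take a few lines of pandas/numpy:

```python
import numpy as np
import pandas as pd

# Placeholder input: a CSV of weekly S&P 500 closing prices with a "close" column.
prices = pd.read_csv("sp500_weekly.csv")["close"]
returns = prices.pct_change().dropna().to_numpy()

avg_weekly_var = np.mean(returns ** 2)
frac_substantial = np.mean(returns ** 2 > 0.6 * avg_weekly_var)

# Under Brownian motion (Gaussian weekly returns) with the same variance,
# the median absolute weekly return would be about 0.6745 * sigma.
median_abs = np.median(np.abs(returns))
gaussian_median_abs = 0.6745 * np.std(returns)

print(f"weeks carrying >0.6x the average weekly variance: {frac_substantial:.0%}")
print(f"median |weekly return| vs Gaussian-equivalent: {median_abs / gaussian_median_abs:.2f}")
```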
Also, Lawrence tells me that in Tetlock’s studies, superforecasters tend to make updates of 1-2% every week, which actually improves their accuracy.
Probabilities on summary events like this are mostly pretty pointless. You’re throwing together a bunch of different questions, about which you have very different knowledge states (including how much and how often you should update about them).