Is Humbali right that generic uncertainty about maybe being wrong, without other extra premises, should increase the entropy of one’s probability distribution over AGI timelines, thereby moving its median further out in time?
Short version: Nah. For example, if you were wrong by dint of failing to consider the right hypothesis, you can correct for it by considering predictable properties of the hypotheses you missed (even if you don’t think you can correctly imagine the true research pathway or w/e in advance). And if you were wrong in your calculations of the quantities you did consider, correction will regress you towards your priors, which are simplicity-based rather than maxent.
Long version: Let’s set aside for the moment the question of what the “correct” maxent distribution on AGI timelines is (which, as others have noted, depends a bit on how you dice up the space of possible years). I don’t think this is where the action is, anyway.
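To gesture at that parameterization-dependence with a toy sketch (every number below is invented for illustration, not a real timeline estimate): professing maximal uncertainty over years-until-AGI and professing maximal uncertainty over the log of years-until-AGI give you quite different medians, even over the same arbitrary range.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# "Maxent" if you dice the space up as years-until-AGI, uniform over an
# arbitrary 1..1000-year range:
uniform_years = rng.uniform(1, 1000, n)

# "Maxent" if you dice the same range up on a log scale instead:
uniform_log_years = np.exp(rng.uniform(np.log(1), np.log(1000), n))

print(np.median(uniform_years))      # ~500 years out
print(np.median(uniform_log_years))  # ~32 years out
```

Same professed ignorance, medians an order of magnitude apart, purely as a function of how you carved up the space.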
Let’s suppose that we’re an aspiring Bayesian considering that we may have made some mistakes in our calculations. Where might those mistakes have been? Perhaps:
1. We were mistaken about what we saw (and erroneously updated on observations that we did not make)?
2. We were wrong in our calculations of quantities of the form P(e|H) (the likelihoods) or P(H) (the priors), or the multiplications thereof?
3. We failed to consider a sufficiently wide space of hypotheses, in our efforts to complete our updating before the stars burn out?
Set aside for now that the correct answer is “it’s #3, like we might stumble over #1 and #2 every so often but bounded reasoners are making mistake #3 day in and day out, it’s obviously mostly #3”, and take these one at a time:
Insofar as we were mistaken about what we saw, correcting our mistake should involve reverting an update (and then probably making a different update, because we saw something that we mistook, but set that aside). Reverting an update pushes us back towards our prior. This will often increase entropy, but not necessarily! (For example, if we thought we saw a counter-example to gravitation, that update might dramatically increase our posterior entropy, and reverting the update might revert us back to confident narrow predictions about phones falling.) Our prior is not a maxent prior but a simplicity prior (which is important if we ever want to learn anything at all).
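A minimal sketch of that gravitation example, with made-up numbers: start from a confident simplicity-flavored prior, make a mistaken update on an apparent counter-example, and note that reverting the update lowers the entropy rather than raising it.

```python
import numpy as np

def entropy_bits(p):
    p = np.asarray(p, dtype=float)
    return -np.sum(p * np.log2(p))

# Hypothetical numbers throughout. Two hypotheses: [gravity holds, gravity sometimes fails].
prior = np.array([0.999, 0.001])
likelihood = np.array([0.001, 0.5])  # P(apparent counter-example | H)

posterior = prior * likelihood
posterior /= posterior.sum()

print(entropy_bits(prior))      # ~0.01 bits: confident, narrow predictions about falling phones
print(entropy_bits(posterior))  # ~0.9 bits: the (mistaken) observation blew up our entropy
# Reverting the mistaken update takes us back to the low-entropy prior, not towards maxent.
```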
Insofar as we were wrong in our calculations of various quantities, correcting our mistake depends on which direction we were wrong, and for which hypotheses. In practice, a reflectively stable reasoner shouldn’t be able to predict the (magnitude-weighted) direction of their error in calculating P(e|H): if we know that we tend to overestimate that value when e is floobish, we can just bump down our estimate whenever e is floobish, until we stop believing such a thing (or, more intelligently, trace down the source of the systematic error and correct it, but I digress). I suppose we could imagine humbly acknowledging that we’re imperfect at estimating quantities of the form P(e|H), and then driving all such estimates towards 1/n, where n is the number of possible observations? This doesn’t seem like a very healthy way to think, but its effect is to again regress us towards our prior. Which, again, is a simplicity prior and not a maxent prior. (If instead we start what-iffing about whether we’re wrong in our intuitive calculations that vaguely correspond to the P(H) quantities, and decide to try to make all our P(H) estimates more similar to each other regardless of H as a symbol of our virtuous self-doubt, then we start regressing towards maximum entropy. We correspondingly lose our ability to learn. And of course, if you’re actually worried that you’re wrong in your estimates of the prior probabilities, I recommend checking whether you think your P(H)-style estimates are too high or too low in specific instances, rather than driving all such estimates to uniformity. But also ¯\_(ツ)_/¯, I can’t argue good priors into a rock.)
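To make that asymmetry concrete (three toy hypotheses, all numbers invented): flattening your likelihoods hands you back your prior, while flattening your prior hands you back the bare evidence with no simplicity weighting at all.

```python
import numpy as np

def posterior(prior, likelihood):
    post = prior * likelihood
    return post / post.sum()

# Hypothetical setup: a simplicity-flavored prior over three hypotheses,
# and made-up likelihoods for some observation e out of n possible observations.
prior = np.array([0.7, 0.2, 0.1])
likelihood = np.array([0.1, 0.3, 0.6])
n = 10

# "Humble" move 1: drive every P(e|H) towards 1/n. The evidence then carries
# no information, and the posterior is just the prior again.
print(posterior(prior, np.full(3, 1 / n)))       # [0.7, 0.2, 0.1]

# "Humble" move 2: drive every P(H) towards uniformity instead. Now the prior
# carries no information, and we drift towards maximum entropy.
print(posterior(np.full(3, 1 / 3), likelihood))  # [0.1, 0.3, 0.6]
```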
Insofar as we were wrong because we were failing to consider a sufficiently wide array of hypotheses, correcting our mistake depends on which hypotheses we’re missing. Indeed, much of Eliezer’s dialog seems to me like Eliezer trying to say “it’s mistake #3 guys, it’s always #3”, plus “just as the hypothesis that we’ll get AGI at 20 watts doesn’t seem relevant, because we know that the ways computers consume watts and the ways brains consume watts are radically different, so too can we predict that whatever the correct specific hypothesis about how the first human-attained AGIs consume compute turns out to be, it will make the amount of compute that humans consume seem basically irrelevant.” Like, if we don’t get AGI till 2050 then we’re probably not currently considering the correct specific research path, a la #3, but we can predict various properties of all plausible unvisualized paths, and adjust our current probabilities accordingly, in acknowledgement of our current #3-style errors.
In sum: accounting for wrongness should look less like saying “I’d better inject more entropy into my distributions”, and more like asking “are my estimates of P(e|H) off in a predictable direction when e looks like this and H looks like that?”. The former is more like sacrificing some of your hard-won information on the altar of the gods of modesty; the latter is more like considering the actual calculations you did and where the errors might reside in them. And even if you insist on sacrificing some of your information because maybe you did the calculations wrong, you should regress towards a simplicity prior rather than towards maximum entropy (which in practice looks like reaching for fewer and simpler-seeming deep regularities in the world, rather than pushing median AGI timelines out to the year 52,021), which is also how things will look if you think you’re missing most of the relevant information. Though of course, your real mistake was #3, you’re ~always committing mistake #3. And accounting for #3 in practice does tend to involve increasing your error bars until they are wide enough to include the sorts of curveballs that reality tends to throw at you. But the reason for widening your error bars there is to include more curveballs, not just to add entropy for modesty’s sake. And you’re allowed to think about all the predictable-in-advance properties of likely ballcurves even if you know you can’t visualize-in-advance the specific curve that the ball will take.
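To put rough numbers on that last contrast (everything below is invented for illustration): retreating all the way to a uniform distribution over the next 100,000 years is precisely the move that parks your median at 52,021, whereas retreating towards something scale-free (used here as a crude stand-in for a simplicity-flavored prior) widens your error bars a great deal without teleporting the median into the deep future.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500_000

# Invented stand-in for someone's current AGI-year distribution (median ~2045):
current = 2021 + rng.lognormal(np.log(24), 0.5, n)

# Full retreat to "maxent" over the next 100,000 years:
maxent_years = rng.uniform(2021, 102_021, n)

# Retreat towards a scale-free (log-uniform) distribution over 1..100,000 years out,
# as a crude stand-in for regressing towards a simplicity-flavored prior:
scale_free_years = 2021 + np.exp(rng.uniform(np.log(1), np.log(100_000), n))

print(int(np.median(current)))           # ~2045
print(int(np.median(maxent_years)))      # ~52,021
print(int(np.median(scale_free_years)))  # ~2337: far wider error bars, same millennium
```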
In fact, Eliezer’s argument reads to me like it’s basically “look at these few and simple-seeming deep regularities in the world” plus a side-order of “the way reality will actually go is hard to visualize in advance, but we can still predict some likely properties of all the concrete hypotheses we’re failing to visualize (which in this case invalidate biological anchors, and pull my timelines closer than 2051)”, both of which seem to me like hallmarks of accounting for wrongness.