Daniel Kokotajlo comments on When reporting AI timelines, be clear who you’re deferring to

Daniel Kokotajlo 11 Oct 2022 0:12 UTC
5 points
1
I don’t think that’s fair. BioAnchors is the best model publicly available by many legible metrics; it does postdict the past reasonably well and while I agree that your model postdicts the past better, you didn’t really spend much effort arguing for this in your post, so you shouldn’t be satisfied with the fact that I agree with your conclusion. Also, I think you are using the word “Bayesian” in an unusual way here, I normally hear the phrase “bayesian model” used to mean something else, something which Bio Anchors definitely counts as.

That said, for those watching along, I do think the compute requirements Ajeya estimates are OOMs too high and have argued as much myself, and I do encourage people to think about Jacob’s argument here & go read his post. I’m especially excited to see him elaborate on the (imo plausible) claim that if you apply the Bio Anchors framework to predicting e.g. human-level vision or audio or whatever, it’d be very surprised by recent progress in those areas, and therefore that it overestimates compute requirements. I’d be curious to hear Ajeya’s response to that argument.
- jacob_cannell 11 Oct 2022 6:34 UTC
  6 points
  1
  
  BioAnchors is the best model publicly available by many legible metrics; it does postdict the past reasonably well and while I agree that your model postdicts the past better, you didn’t really spend much effort arguing for this in your post, so you shouldn’t be satisfied with the fact that I agree with your conclusion.
  
  I did spend effort arguing for a model that postdicts the past; that’s much of the point of my post. So perhaps you mean I didn’t spend effort comparing said postdiction ability to that of the BioAnchors model. I sketched a computable technique over the relevant dataset in enough detail that the reader can hopefully simulate it and predict the general outcome. I could go further and actually evaluate said model on a larger dataset, but that’s somewhat time-consuming, and its utility is mostly constrained by how one evaluates the intelligence or salient equivalent capabilities of various systems.
  
  Also, I think you are using the word “Bayesian” in an unusual way here, I normally hear the phrase “bayesian model” used to mean something else, something which Bio Anchors definitely counts as.
  
  A Bayesian model has a specific form: it computes a posterior as the product of a likelihood (which measures evidence fit) and a prior, where the prior must at minimum weight against bit complexity, à la Occam/Solomonoff. The likelihood component measures the postdiction fit over the relevant evidence, p(E|H), and is required for any Bayesian model.
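
  As a minimal sketch of that form (not the model from either post): the snippet below scores hypotheses by an Occam-style prior, p(H) ∝ 2^(-bits), multiplied by a likelihood p(E|H). The Gaussian noise model, the function names, and the `(bits, predictions)` layout are all assumptions made for illustration.

```python
import math

def occam_log_prior(bits):
    """Log of an unnormalised prior that penalises description length: 2^(-bits)."""
    return -bits * math.log(2)

def gaussian_log_likelihood(predicted, observed, sigma=1.0):
    """log p(E|H): how well a hypothesis' postdictions fit the observed evidence.
    The Gaussian noise model is an assumption of this sketch."""
    norm = math.log(sigma * math.sqrt(2 * math.pi))
    return sum(-0.5 * ((o - p) / sigma) ** 2 - norm
               for p, o in zip(predicted, observed))

def posterior(hypotheses, observed, sigma=1.0):
    """hypotheses maps name -> (bits, predictions).
    Returns the normalised posterior p(H|E) ∝ p(E|H) * p(H)."""
    log_scores = {
        name: occam_log_prior(bits) + gaussian_log_likelihood(pred, observed, sigma)
        for name, (bits, pred) in hypotheses.items()
    }
    top = max(log_scores.values())  # subtract the max for numerical stability
    weights = {name: math.exp(s - top) for name, s in log_scores.items()}
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}
```

  With equal fit, the hypothesis with fewer bits wins; with equal bits, the one that postdicts the evidence better wins. A model that never computes the likelihood term has no way to register the second case.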
  
  So when I say the BioAnchors model isn’t attempting to be Bayesian, I believe that is just straightforwardly true in the sense that (from what I recall, at least) it doesn’t even really attempt to postdict the relevant historical evidence. Now, sure, you could argue it’s doing that implicitly, but Bayesian models are usually very explicit about their likelihood (postdiction) fit.
  
  I’m especially excited to see him elaborate on the (imo plausible) claim that if you apply the Bio Anchors framework to predicting e.g. human-level vision or audio or whatever, it’d be very surprised by recent progress in those areas, and therefore that it overestimates compute requirements.
  
  Yeah so I encourage readers to try this on their own … I may get to it myself and write it up, but I would need to actually study BioAnchors more closely. I didn’t even look at it when writing my post, this comparison only (admittedly naturally) came up later in comments.