The way that you would think about NN anchors in my model (caveat that this isn’t my whole model):
You have some distribution over 2020-FLOPS-equivalent that TAI needs.
Algorithmic progress means that 20XX-FLOPS convert to 2020-FLOPS-equivalent at some 1:N ratio.
The function from 20XX to the 1:N ratio is relatively predictable, e.g. a “smooth” exponential with respect to time.
Therefore, even though current algorithms will hit DMR (diminishing marginal returns), the next algorithm, which has less severe DMR, is also predictably going to be some constant ratio better at converting current-FLOPS to 2020-FLOPS-equivalent.
E.g., in (some smallish) parts of my view, you take observations like “AGI will use compute more efficiently than human brains,” ask questions like “but how much is the efficiency of compute->cognition increasing over time?”, draw that graph, and try to extrapolate. Of course, the main trouble is in trying to estimate the original distribution of 2020-FLOPS-equivalent needed for TAI, which might go astray in the same way that an estimate of the “1950-watt-equivalent needed for TAI” would go astray.
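To make the structure of this kind of model concrete, here is a minimal Monte Carlo sketch. The requirement distribution, the algorithmic-progress rate, and the compute-growth trend below are made-up placeholders, not Ajeya’s (or anyone’s) actual estimates; only the structure matters.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 100_000

# (1) Distribution over the 2020-FLOPS-equivalent that TAI needs
#     (placeholder lognormal in orders of magnitude, not Ajeya's numbers).
log10_flops_needed_2020 = rng.normal(loc=35, scale=3, size=N)

years = np.arange(2020, 2101)

# (2) Algorithmic progress: 20XX-FLOPS convert to 2020-FLOPS-equivalent at a
#     1:N ratio that grows smoothly -- here a placeholder halving of
#     requirements every ~2.5 years.
algo_halving_time = 2.5
log10_algo_factor = (years - 2020) * np.log10(2) / algo_halving_time

# (3) Compute available for the largest training run, growing with spending
#     and hardware price-performance (placeholder: ~0.5 OOM per year from a
#     2020 baseline of 1e23 FLOP).
log10_flops_available = 23 + 0.5 * (years - 2020)

# TAI "arrives" in the first year where available 20XX-FLOPS, converted to
# 2020-FLOPS-equivalent, meet the sampled requirement.
log10_effective = log10_flops_available + log10_algo_factor
arrival = np.full(N, np.inf)
for i, year in enumerate(years):
    hit = np.isinf(arrival) & (log10_effective[i] >= log10_flops_needed_2020)
    arrival[hit] = year

finite = arrival[np.isfinite(arrival)]
for q in (0.25, 0.50, 0.75):
    print(f"{int(q * 100)}th percentile TAI year: {np.quantile(finite, q):.0f}")
print(f"P(no TAI by 2100): {np.mean(np.isinf(arrival)):.2f}")
```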
I don’t understand this.
What is the meaning of “2020-FLOPS-equivalent that TAI needs”? Plausibly you can’t build TAI with 2020 algorithms without some truly astronomical amount of FLOPs.
What is the meaning of “20XX-FLOPS convert to 2020-FLOPS-equivalent”? If 2020 algorithms hit DMR, you can’t match a 20XX algorithm with a 2020 algorithm without some truly astronomical amount of FLOPs.
Maybe you’re talking about extrapolating the compute-performance curve, assuming that it stays stable across algorithmic paradigms (although, why would it?). But in that case, how do you quantify the performance required for TAI? Do we have a “real-life Elo” for modern algorithms that we can compare to human “real-life Elo”? Even if we did, that is not what Cotra is doing with her “neural anchor”.
I think 10^35 FLOPs would probably be enough. This post gives some intuition as to why, and also goes into more detail about what 2020-FLOPS-equivalent-that-TAI-needs means. If you want even more detail and rigor, see Ajeya’s report. If you think it’s very unlikely that 10^35 would be enough, I’d love to hear more about why—what are the blockers? Why would OmegaStar, SkunkWorks, etc., described in the post (and all the easily-accessible variants thereof), fail to be transformative? (Also, same questions for APS-AI or AI-PONR instead of TAI, since I don’t really care about TAI.)
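For a rough sense of scale, under the widely cited estimate that GPT-3’s training run took about 3×10^23 FLOP, 10^35 FLOP is roughly eleven to twelve orders of magnitude more compute than the largest 2020-era training runs:

```python
import math

# Rough scale check (assumption: GPT-3's training run was ~3.14e23 FLOP,
# the widely cited estimate). How many orders of magnitude below 1e35 is that?
gpt3_training_flop = 3.14e23
target_flop = 1e35
print(math.log10(target_flop / gpt3_training_flop))  # ~11.5, i.e. roughly "+12 OOMs"
```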
I didn’t ask how much; I asked what it even means. I think I understand the principles of Cotra’s report. What I don’t understand is why we should believe the “neural anchor” when (i) modern algorithms applied to a brain-sized ANN might not produce brain-level performance and (ii) the compute cost of future algorithms might behave completely differently. (I.e., I don’t understand how Carl’s and Mark’s arguments in this thread protect the neural anchor from Yudkowsky’s criticism.)
These are three separate things:
(a) What is the meaning of “2020-FLOPS-equivalent that TAI needs?”
(b) Can you build TAI with 2020 algorithms without some truly astronomical amount of FLOPs?
(c) Why should we believe the “neural anchor?”
(a) is answered roughly in my linked post, and with much more detail and rigor in Ajeya’s doc.
(b) depends on what you mean by truly astronomical; I think it would probably be doable with 10^35 FLOPs, and Ajeya thinks there’s a 50% chance it would be.
For (c), I actually don’t think we should put that much weight on the “neural anchor,” and I don’t think Ajeya’s framework requires that we do (although, it’s true, most of her anchors do center on this human-brain-sized ANN scenario, which indeed I think we shouldn’t put so much weight on). That said, I think it’s a reasonable anchor to use, even if it’s not where all of our weight should go. This post gives some of my intuitions about this. Of course, Ajeya’s report says a lot more.
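As a toy illustration of the point about weights (placeholder anchor names and numbers, not the report’s actual values): the framework mixes several anchor distributions, so shifting weight off the neural-network anchor still yields a usable overall distribution rather than standing or falling with that one scenario.

```python
import numpy as np

rng = np.random.default_rng(1)
N = 100_000

# Placeholder anchor hypotheses: (mean, std) of log10(2020-FLOPS-equivalent
# needed for TAI). These numbers are made up for illustration only.
anchors = {
    "lifetime":   (31, 2),
    "neural_net": (35, 3),   # the human-brain-sized ANN scenario
    "evolution":  (41, 2),
}

def mixture_samples(weights):
    """Sample from the weighted mixture over anchor hypotheses."""
    names = list(anchors)
    probs = np.array([weights[n] for n in names], dtype=float)
    probs /= probs.sum()
    choice = rng.choice(len(names), size=N, p=probs)
    mus = np.array([anchors[n][0] for n in names])[choice]
    sigmas = np.array([anchors[n][1] for n in names])[choice]
    return rng.normal(mus, sigmas)

for w_nn in (0.6, 0.2):  # heavy vs. light weight on the neural-net anchor
    samples = mixture_samples({
        "lifetime": (1 - w_nn) / 2,
        "neural_net": w_nn,
        "evolution": (1 - w_nn) / 2,
    })
    print(f"neural-net anchor weight {w_nn}: "
          f"median log10(FLOP) = {np.median(samples):.1f}")
```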