I guess it depends on “how fast is fast and how slow is slow”, and what you say is true on the margin, but here’s my plea that the type of thinking that says “we want some technical problem to eventually get solved, so we try to solve it” is a super-valuable type of thinking right now even if we were somehow 100% confident in slow takeoff. (This is mostly an abbreviated version of this section.)
Differential Technological Development (DTD) seems potentially valuable, but is only viable if we know what paths-to-AGI will be safe & beneficial really far in advance. (DTD could take the form of accelerating one strand of modern ML relative to another—e.g. model-based RL versus self-supervised language models etc.—or it could take the form of differentially developing ML-as-a-whole compared to, I dunno, something else.) Relatedly, suppose (for the sake of argument) that someone finds an ironclad proof that safe prosaic AGI is impossible, and the only path forward is a global ban on prosaic AGI research. It would be way better to find that proof right now than finding it in 5 years, and better in 5 years than 10, etc., and that’s true no matter how gradual takeoff is.
We don’t know how long safety research will take. If takeoff happens over N years, and safety research takes N+1 years, that’s bad even if N is large.
Maybe you’ll say that almost all of the person-years of safety research will happen during takeoff, and any effort right now is a drop in the ocean compared to that. But I really think wall-clock time is an important ingredient in research progress, not just person-years. (“Nine women can’t make a baby in a month.”)
We don’t just need to figure out the principles for avoiding AGI catastrophic accidents. We also need every actor with a supercomputer to understand and apply these principles. Some ideas take many decades to become widely (let alone universally) accepted—famous examples include evolution and plate tectonics. It takes wall-clock time for arguments to be refined. It takes wall-clock time for evidence to be marshaled. It takes wall-clock time for nice new pedagogical textbooks to be created. And of course, it takes wall-clock time for the stubborn holdouts to die and be replaced by the next generation. :-P
One example that an AI policy person mentioned in a recent Q&A is “bias in ML” already being pretty much a consensus issue in ML and AI policy. I guess this happened in 5ish years?
I certainly wouldn’t say that all correct ideas take decades to become widely accepted. For example, often somebody proves a math theorem, and within months there’s an essentially-universal consensus that the theorem is true and the proof is correct.
Still, “bias in ML” is an interesting example. I think that in general, “discovering bias and fighting it” is a thing that everyone feels very good about doing, especially in academia and tech, which tend to be politically left-wing. So the deck was stacked in favor of it becoming a popular cause to support and talk about. But that’s not what we need for AGI safety. The question is not “how soon will almost everyone be saying feel-good platitudes about AGI safety and bemoaning the lack of AGI safety?”; the question is “how soon will AGI safety be like bridge-building safety, where there are established, universally-agreed-upon, universally-followed, legally-mandated, idiot-proof best practices?”. I don’t think the “bias in ML” field is there yet. I’m not an expert, but my impression is that there is a lot of hand-wringing about bias in ML, and not a lot of established, universal, idiot-proof best practices. I think a lot of the discourse is bad or confused—e.g. people continue to cite the ProPublica report as a prime example of “bias in ML” despite the related impossibility theorem (see chapter 2 of Brian Christian’s book). I’m not even sure that all the currently-popular best practices are good ideas. For example, if there’s a facial recognition system that’s worse at recognizing black faces than white faces, my impression is that the recommended practice is to diversify the training data so that it gets better at black faces. But it seems to me that facial recognition systems are just awful, because they enable mass surveillance, and the last thing we should be doing is making them better; if they’re worse at identifying a subset of the population, then maybe those people are the lucky ones.
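To make the impossibility point concrete, here is a minimal numeric sketch (my own illustration, not from the original post and not from the ProPublica/COMPAS data) of one standard way the result is stated, along the lines of Chouldechova 2017: when two groups have different base rates, a classifier cannot hold both its positive predictive value (a calibration-style criterion) and its false-negative rate equal across the groups without its false-positive rates coming apart. The specific numbers below are invented purely for illustration.

```python
# Minimal illustration (invented numbers): with equal PPV and FNR across
# two groups that differ in base rate, the false-positive rates must differ.

def false_positive_rate(prevalence: float, ppv: float, fnr: float) -> float:
    """Derive the false-positive rate from a group's base rate (prevalence),
    positive predictive value (PPV), and false-negative rate (FNR), via the
    confusion-matrix identity FPR = p/(1-p) * (1-PPV)/PPV * (1-FNR)."""
    return (prevalence / (1 - prevalence)) * ((1 - ppv) / ppv) * (1 - fnr)

# Hold PPV and FNR fixed across two groups with different base rates:
ppv, fnr = 0.7, 0.2
for group, prevalence in [("group A", 0.3), ("group B", 0.5)]:
    fpr = false_positive_rate(prevalence, ppv, fnr)
    print(f"{group}: base rate {prevalence:.0%}, false-positive rate {fpr:.1%}")

# Prints roughly 14.7% for group A and 34.3% for group B: equalizing calibration
# and false negatives forces unequal false positives whenever base rates differ.
```

Nothing here depends on any particular classifier; it is just the algebraic relationship among the three quantities, which is why the competing fairness criteria genuinely cannot all be satisfied at once except in degenerate cases (a perfect classifier, or identical base rates).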
So by my standards, “bias in ML” is still a big mess, and therefore 5ish years hasn’t been enough.
I think the ML bias folks are stuck with too hard a problem, since they’ve basically taken on the question of whether all of justice can and should (or should not) be remedied through algorithms. As a result the technical folks have run into all the problems philosophy never solved, and so “policy” can only do the most obvious interventions (like limiting the use of inaccurate facial recognition), the ones that get near-total researcher consensus. (Not to mention that the subfield is left-coded and thus doesn’t win over the bipartisan natsec-tech crowd.) That said, 5 years was certainly enough to get their scholars heavily embedded throughout a presidential administration.