jefftk comments on Dragon Agnosticism

jefftk 2 Aug 2024 1:26 UTC
14 points
8
I don’t think refocusing my main efforts on the alignment problem would make humanity safer.
- jefftk 2 Aug 2024 11:19 UTC
  29 points
  8
  Parent
  For people disagree-voting [edit: at the time the parent was disagree-voted to −7], I’d be happy to see arguments that I should switch from trying to detect bioengineered pandemics to alignment research.
  - Raemon 2 Aug 2024 16:04 UTC
    4 points
    0
    Parent
    I don’t know if you should switch offhand, but I notice this post of yours is pretty old and a lot has changed.
    What’s your rough assessment of AI risk now?
    If you haven’t thought about it explicitly, I think it’s probably worth spending a week thinking about.
    
    Also, how many people work on your current project? If you left, would that tank the project or be pretty replaceable? (Or: if you left are there other people who might then be more likely to leave?).
    - jefftk 2 Aug 2024 18:58 UTC
      13 points
      3
      Parent
      
      What’s your rough assessment of AI risk now?
      
      I think it’s pretty important and I’m glad a bunch of people are working on it. I seriously considered switching into it in spring 2022 before deciding to go into biorisk
      
      Also, how many people work on your current project? If you left, would that tank the project or be pretty replaceable?
      
      We’re pretty small (~7), and I’ve recently started leading our near-term first team (four people counting me, trying to hire two more). I think I’m not very replaceable: my strengths are very different from others on the team in a highly complementary way, especially from a “let’s get a monitoring system up and running now” perspective.
      
      (I must admit to some snark in my short response to Mako above. I’m mildly grumpy about people going around as if alignment is literally the only thing that matters. But that’s also not really what he was saying, since he was pushing back against my worrying about dragons and not my day job.)
      - Raemon 2 Aug 2024 19:02 UTC
        5 points
        0
        Parent
        Nod. I had initially remembered the Superintelligence Risk Project being more recent than 2017, was there a 2022 writeup of your decisionmaking?
  - the gears to ascension 2 Aug 2024 21:16 UTC
    2 points
    0
    Parent
    I think you should think about how your work generalizes between the topics, and try to make it possible for alignment researchers to take as much as they can from it; this is because I expect software pandemics are going to become increasingly similar to wetware pandemics, and so significant conceptual parts of defenses for either will generalize somewhat. That said, I also think that the stronger form of the alignment problem is likely to be useful to you directly on your work anyway; if detecting pandemics in any way involves ML, you’re going to run into adversarial examples, and will quickly be facing the same collapsed set of problems (what objective do I train for? how well did it work, can an adversarial optimization process eg evolution or malicious bioengineers break this? what side effects will my system have if deployed?) as anyone who tries to deploy ML. If you’re instead not using ML, I just think your system won’t work very well and you’re being unambitious at your primary goal, because serious bioengineered dangers are likely to involve present-day ML bio tools by the time they’re a major issue.
    But I think you in particular are doing something sufficiently important that it’s quite plausible to me that you’re correct. This is very unusual and I wouldn’t say it to many people. (normally I’d just not bother directly saying they should switch to working on alignment because of not wanting to waste their time when I’m confident they won’t be worth my time to try to spin up, and I instead just make noise about the problem vaguely in people’s vicinity and let them decide to jump on it if desired.)