TsviBT didn’t recommend MIRI probably because he receives a paycheck from MIRI and does not want to appear self-serving. I, on the other hand, have never worked for MIRI and am unlikely ever to (being of the age when people usually retire), so I feel free to recommend MIRI without hesitation or reservation.
MIRI has abandoned hope of anyone’s solving alignment before humanity runs out of time: they continue to employ people with deep expertise in AI alignment, but those employees spend their time explaining why the alignment plans of others will not work.
Most technical alignment researchers are increasing P(doom), because they openly publish results that help both the capability research program and the alignment program. The alignment program is very unlikely to reach a successful conclusion before the capability program “succeeds”, so publishing those results only shortens the amount of time we have to luck into an effective response or resolution to the AI danger (which, again, if one does appear, might not even involve figuring out how to align an AI so that it stays aligned as it becomes an ASI).
There are two other (not-for-profit) organizations in the sector that, as far as I can tell, are probably doing more good than harm, but I don’t know enough about them for it to be a good idea for me to name them here.
I’m no longer employed by MIRI. I think Yudkowsky is by far the best source of technical alignment research insight, but MIRI’s research program was in retrospect probably pretty doomed even before I got there. I can see ways to improve it, but I’m not that confident, and I can somewhat directly see that I’m probably not capable of carrying out my suggested improvements. And AFAIK, as you say, they’re not currently doing very much alignment research. I’m also fine with appearing self-serving; if I were actively doing alignment research, I might recommend myself, though I don’t really think it’s appropriate to do so to a random person who can’t evaluate arguments about alignment research and doesn’t know who to trust. I guess if someone pays me enough I’ll do some alignment research. I recommend myself as one authority among others on strategy regarding strong human intelligence amplification.
I’m not saying that MIRI has some effective plan that more money would help with. I’m only saying that, unlike most of the actors accepting money to work in AI safety, at least they won’t use a donation in a way that makes the situation worse. Specifically, MIRI does not publish insights that help the AI project and is very careful in choosing whom they will teach technical AI skills and knowledge.
at least they won’t use a donation in a way that makes the situation worse
Seems false; they could have problematic effects on discourse if their messaging is poor or seems dumb in retrospect.
I disagree pretty heavily with MIRI, which makes this more likely from my perspective.
It seems likely that Yudkowsky has lots of bad effects on discourse right now, even by his own lights. I feel pretty good about official MIRI comms activities from my understanding, despite a number of disagreements.
Not sure what you’re asking. I think someone trying to work on the technical problem of AI alignment should read Yudkowsky. I think this because of… a whole bunch of the content of the ideas and arguments. I would need more context to elaborate, but it doesn’t seem like you’re asking about that.
I still don’t know what you mean.