Eli Tyre comments on Towards more cooperative AI safety strategies

Eli Tyre 21 Jul 2024 5:19 UTC
3 points
1
“my understanding was they had no plan to create a sovereign for most of their history (like after 2004)”
Yeah, I think that’s false.

The plan was “Figure out how to build a friendly AI, and then build one”. (As Eliezer stated in the video that I linked somewhere else in this comment thread).

But also, I got that impression from the Sequences? Like Eliezer talks about actually building an AGI, not just figuring out the theory of how to build one. You didn’t get that impression?
- Ruby 21 Jul 2024 5:33 UTC
  4 points
  2
  I don’t remember what exactly I thought in 2012 when I was reading the Sequences. I do recall that sometime later, after DL was in full swing, it seemed like MIRI wasn’t in any position to be building AGI before others (no compute, no engineering prowess), and someone (not necessarily at MIRI) confirmed that wasn’t the plan. Now and at the time, I don’t know how much that was principle vs ability.
  - Ruby 21 Jul 2024 5:42 UTC
    2 points
    0
    My feeling of the plan pre-pivotal-act era was “figure out the theory of how to build a safe AI at all, and try to get whoever is building to adopt that approach”, and that MIRI wasn’t taking any steps to be the ones building it. I also had the model that due to psychological unity of mankind, anyone building an AGI aligned [with them] was a good outcome compared to someone building an unaligned one. Like even if it was Xi Jinping, a sovereign aligned with him would be okay (and not obviously that dramatically different from anyone else?). I’m not sure how much of this was MIRI’s positions vs fragments that I combined in my own head that came from assorted places and were never policy.
    - Eli Tyre 21 Jul 2024 18:39 UTC
      3 points
      0
      Well, I can tell you that they definitely planned to build the Friendly AI, after figuring out how.
      
      See this other comment.
      - Ruby 21 Jul 2024 19:17 UTC
        2 points
        0
        Pretty solid evidence.