Rafael Harth comments on Some blindspots in rationality and effective altruism

Rafael Harth 19 Mar 2021 21:06 UTC
43 points
- Eliezer Yudkowsky’s portrayal of a single self-recursively improving AGI (later overturned by some applied ML researchers)
I’ve found myself doubting this claim, so I’ve read the post in question. As far as I can tell, it’s a reasonable summary of the fast takeoff position that many people still hold today. If all you meant to say was that there was disagreement, then fine—but saying ‘later overturned’ makes it sound like there is consensus, not that people still have the same disagreement they’ve had 13 years ago. (And your characterization in the paragraph I’ll quote below also gives that impression.)

In hindsight, judgements read as simplistic and naive in similar repeating ways (relying on one metric, study, or paradigm and failing to factor in mean reversion or model error there; fixating on the individual and ignoring societal interactions; assuming validity across contexts):
What links here?
- Remmelt's comment on Some blindspots in rationality and effective altruism by Remmelt (EA Forum; 21 Mar 2021 18:43 UTC; 9 points)
- Remmelt 19 Mar 2021 21:29 UTC
  3 points
  Parent
  Sorry, I get how the bullet point example gave that impression. I’m keeping the summary brief, so let me see what I can do.
  
  I think the culprit is ‘overturned’. That makes it sound like their counterarguments were a done deal or something. I’ll reword that to ‘rebutted and reframed in finer detail’.
  
  Note though that ‘some applied ML researchers’ hardly sounds like consensus. I did not mean to convey that, but I’m glad you picked it up.
  As far as I can tell, it’s a reasonable summary of the fast takeoff position that many people still hold today.
  Perhaps, your impression from your circle is different from mine in terms of what proportion of AIS researchers prioritise work on the fast takeoff scenario?
  - Rafael Harth 20 Mar 2021 17:39 UTC
    2 points
    Parent
    
    I think the culprit is ‘overturned’. That makes it sound like their counterarguments were a done deal or something. I’ll reword that to ‘rebutted and reframed in finer detail’.
    
    Yeah, I think overturned is the word I took issue with. How about ‘disputed’? That seems to be the term that remains agnostic about whether there is something wrong with the original argument or not.
    
    Perhaps, your impression from your circle is different from mine in terms of what proportion of AIS researchers prioritise work on the fast takeoff scenario?
    
    My impression is that gradual takeoff has gone from a minority to a majority position on LessWrong, primarily due to Paul Christiano, but not an overwhelming majority. (I don’t know how it differs among Alignment Researchers.)
    
    I believe the only data I’ve seen on this was in a thread where people were asked to make predictions about AI stuff, including takeoff speed and timelines, using the new interactive prediction feature. (I can’t find this post—maybe someone else remembers what it was called?) I believe that was roughly compatible with the sizeable minority summary, but I could be wrong.
    - Remmelt 20 Mar 2021 18:12 UTC
      1 point
      Parent
      How about ‘disputed’?
      Seems good. Let me adjust!
      My impression is that gradual takeoff has gone from a minority to a majority position on LessWrong, primarily due to Paul Christiano, but not an overwhelming majority
      This roughly corresponds with my impression actually.
      I know a group that has surveyed researchers that have permission to post on the AI Alignment Forum, but they haven’t posted an analysis of the survey’s answers yet.
- Remmelt 20 Mar 2021 9:47 UTC
  1 point
  Parent
  To disentangle what I had in mind when I wrote ‘later overturned by some applied ML researchers’:
  
  Some applied ML researchers in the AI x-safety research community like Paul Christiano, Andrew Critch, David Krueger, and Ben Garfinkel have made solid arguments towards the conclusion that Eliezer’s past portrayal of a single self-recursively improving AGI had serious flaws.
  
  In the post though, I was sloppy in writing about this particular example, in a way that served to support the broader claims I was making.