“Geniuses” with nice legible accomplishments in fields with tight feedback loops where it’s easy to determine which results are good or bad right away, and so validate that this person is a genius, are (a) people who might not be able to do equally great work away from tight feedback loops, (b) people who chose a field where their genius would be nicely legible even if that maybe wasn’t the place where humanity most needed a genius, and (c) people who probably don’t have the mysterious gears simply because those gears are rare.

You cannot just pay $5 million apiece to a bunch of legible geniuses from other fields and expect to get great alignment work out of them. They probably do not know where the real difficulties are, they probably do not understand what needs to be done, they cannot tell the difference between good and bad work, and the funders also can’t tell without me standing over their shoulders evaluating everything, which I do not have the physical stamina to do. I concede that real high-powered talents, especially if they’re still in their 20s, genuinely interested, and have done their reading, are people who, yeah, fine, have higher probabilities of making core contributions than a random bloke off the street. But I’d have more hope—not significant hope, but more hope—in separating the concerns of (a) credibly promising to pay big money retrospectively for good work to anyone who produces it, and (b) venturing prospective payments to somebody who is predicted to maybe produce good work later.
What fields would qualify as “lacking tight feedback loops”? Computer security? Why don’t, e.g., credentialed math geniuses leading their subfields qualify—because math academia is already pretty organized and inventing a new subfield of math (or whatever) is just not in the same reference class of feat as Newton inventing mathematical physics from scratch?
(c) probably still holds even if there exists a promising class of legible geniuses, though.
Most of the impressive computer security subdisciplines have very tight feedback loops and extreme legibility; that’s what makes them impressive. When I think of the hardest security jobs, I think of 0-day writers, red-teamers, etc., who might have whatever Eliezer describes as security mindset but are also described extremely well by him in #40. There are people who do a really good job of protecting large companies, but they’re rare, and their accomplishments are highly illegible except to a select group of guys at e.g. SpecterOps. I don’t think MIRI would be able to pick them out, which is of course not their fault.
I’d say something more like hedge fund management, but unfortunately those guys tend to be paid pretty well...
I think the intended field lacking tight feedback loops is AI alignment.
(I meant: What fields can we draw legible geniuses from, into alignment.)
I think people have floated the idea of recruiting ‘math geniuses’ specifically and EY is claiming that, even if they could be recruited and were recruited, we couldn’t (reasonably) “expect to get great alignment work out of them”.