I’m curious about why you decided it wasn’t worth your time.
Going from the post itself, the case for publishing it goes something like “the whole field of AI Alignment is failing to produce useful work because people aren’t engaging with what’s actually hard about the problem and are ignoring all the ways their proposals are doomed; perhaps yelling at them via this post might change some of that.”
Accepting the premises (which I’m inclined to), trying to get the entire field to correct course seems actually pretty valuable, maybe even worth a month of your time, now that I think about it.
First and foremost, I have been making extraordinarily rapid progress in the last few months, though most of that is not yet publicly visible.
Second, a large part of why people pour effort into not-very-useful work is that the not-very-useful work is tractable. Useless, but at least you can make progress on the useless thing! Few people really want to work on problems which are actually Hard, so people will inevitably find excuses to do easy things instead. As Eliezer himself complains, writing the list just kicks the can down the road; six months later people will have a new set of bad ideas with giant gaping holes in them. The real goal is to either:
produce people who will identify the holes in their own schemes, repeatedly, until they converge to work on things which are actually useful despite being Hard, or
get enough of a paradigm in place that people can make legible progress on actually-useful things without doing anything Hard.
I have recently started testing out methods for the former, but it’s the sort of thing which starts out with lots of tests on individuals or small groups to see what works. The latter, of course, is largely what my technical research is aimed at in the medium term.
(I also note that there will always be at least some need for people doing the Hard things, even once a paradigm is established.)
In the short term, if people want to identify the holes in their own schemes and converge to work on actually useful things, I think the “builder/breaker” methodology that Paul uses in the ELK doc is currently a good starting point.
Well, it’s the Law of Continued Failure, as Eliezer termed it himself, no? There have already been a lot of rants about the real problems of alignment and how basically no one focuses on them, most of them Eliezer-written as well. The sort of person who wasn’t convinced/course-corrected by previous scattered rants isn’t going to be course-corrected by a giant post compiling all the rants in one place. Someone to whom this post would be of use is someone who has already absorbed all the information contained in it from other sources; someone who could already write it up on their own.
The picture may not be quite as grim as that, but yeah, I can see how writing it would not be anyone’s top priority.