I think your overall point—More Dakka, make AGI less weird—is right. In my experience, though, I disagree with your disagreement:
I disagree with “the case for the risks hasn’t been that clearly laid out”. I think there’s a giant, almost overwhelming pile of intro resources at this point, any one of which is more than sufficient, written in all manner of style, for all manner of audience.[1]
(I do think it’s possible to create a much better intro resource than any that exist today, but ‘we can do much better’ is compatible with ‘it’s shocking that the existing material hasn’t already finished the job’.)
The short version is that while there is a lot written about alignment, I haven’t seen the core ideas organised into something clear enough to facilitate critically engaging with those ideas.
In my experience, there are two main issues:
1. Low discoverability of “good” introductory resources.
2. The existing (findable) resources are not very helpful if your goal is to get a clear understanding of the main argument in alignment—that the default outcome of building AGI without explicitly making sure it’s aligned is strongly negative.
For 1, I don’t mean that “it’s hard to find any introductory resources.” I mean that it’s hard to know which ones are worth engaging with. Because of the problems in 2, it’s very time-consuming to get more than a surface-level understanding. This is an issue for me personally, since the main purpose of this kind of exploration is to decide whether I want to invest more time and effort in the area.
For 2, there are many issues. The most common is that many resources are now quite old—are they still relevant? What is the state of the field now? Many are very long, present “lists of ideas” without organising them into a cohesive whole, cover only a single idea, or are too vague or informal to evaluate. The result is a general feeling of “Well, okay... but there were a lot of assumptions and handwaves in all that, and I’m not sure that none of them matter.”
(If anyone is interested, I can give feedback on specific articles in a reply—omitted here for length. I’ve read a majority of the links in the [1] footnote.)
Some things I think would help this situation:
- Maintain an up-to-date list of quality “intro to alignment” resources.
  - Note that this shouldn’t be a catch-all for all intro resources. Being opinionated about what you include is a good thing, as it helps newcomers judge relative importance.
- Create a new introductory resource that doesn’t include irrelevant distractions from the main argument.
  - What I’m talking about here are things like timelines, scenarios and likelihoods, policy arguments, questionable analogies (especially evolutionary), appeals to meta-reasoning, and the like. These don’t have any direct bearing on alignment itself; they add distracting noncentral details that mainly serve as extra points of disagreement.
  - I think Arbital does this best, but it suffers from being organised as a comprehensive list of ideas on separate pages rather than as a cohesive argument supported by specific, direct evidence. I’m also not sure how current it is.
- Have people who are highly engaged in alignment write “What I think is most important to know about alignment” (which could be specific evidence or general arguments).
Lastly, when you say
If you’re building a machine, you should have an at least somewhat lower burden of proof for more serious risks. It’s your responsibility to check your own work to some degree, and not impose lots of micromorts on everyone else through negligence.[2]
But I don’t think the latter point matters much, since the ‘AGI is dangerous’ argument easily meets higher burdens of proof as well.
Do you have some specific work in mind that meets this higher burden of proof?