Very nice! I’d say this seems like it’s aimed at a difficulty level of 5 to 7 on my table:
https://www.lesswrong.com/posts/EjgfreeibTXRx9Ham/ten-levels-of-ai-alignment-difficulty#Table
I.e., experimentation on dangerous systems and interpretability play some role, but the main thrust is automating alignment research and oversight, so maybe I’d unscientifically call it a 6.5. That's a tremendous step up from the current state of things (2.5) and would solve alignment in many possible worlds.