I really like this post because it’s readable and informative. For the second problem, pursuing proxy goals, I recommend also reading about a related problem called the XY problem.
On point 4: many popular alignment ideas are not models of current systems, but models of future AI systems. Accuracy is then lost not only in modeling the system but also in having to predict what that system will look like.