I really like this post because it’s readable and informative. For the second problem, pursuing proxy goals, I recommend also reading about a related problem called the XY problem.
On point 4: many popular alignment ideas are not models of current systems, but models of future AI systems. Accuracy is then lost not only in modeling the system but also in having to predict what that system will look like.