I said “fail to sufficiently check human AI safety work given substantial time”. This might be considerably easier than ensuring that such institutions exist immediately and can already evaluate things. I was just noting that a weaker version of “build institutions which are reasonably good at checking the quality of AI safety work done by humans” is required for a pause to produce good safety work.
Of course, good AI safety work (in the traditional sense of AI safety work) might not be the best route forward. We could also work on routes other than AI, e.g., emulated minds.