However, note that if you think we would fail to sufficiently check human AI safety work given substantial time, we would also fail to solve various issues given a substantial pause.
This does not seem automatic to me (at least in the hypothetical scenario where “pause” takes a couple of decades). The reasoning is that there is a difference between [automating the current form of an institution and speed-running 50 years of it in a month] and [an institution, as it develops over 50 years].
For example, my crux[1] is that current institutions do not subscribe to the security mindset with respect to AI. But perhaps hypothetical institutions in 50 years might.
[1] For being in favour of slowing things down, if that were possible in a reasonable way (which it might not be).
I said “fail to sufficiently check human AI safety work given substantial time”. This might be considerably easier than ensuring that such institutions exist immediately and can already evaluate things. I was just noting that there is a weaker version of “build institutions which are reasonably good at checking the quality of AI safety work done by humans” which is required for a pause to produce good safety work.
Of course, good AI safety work (in the traditional sense of AI safety work) might not be the best route forward. We could also (e.g.) work on routes other than AI, like emulated minds.