Dakara comments on If we solve alignment, do we die anyway?

Dakara 19 Nov 2024 19:26 UTC
1 point
Sure, I might as well ask my question about scalable oversight directly, since it seems to be a leading strategy for iterative alignment anyway. I do have one preliminary question (which probably isn’t worth including in that post, since it asks about people’s expectations rather than about a specific issue or threat model).

I take it that this strategy relies on evaluating research being easier than producing it? Do you expect this to be the case?