IIRC, in one of his video appearances this year, Eliezer said that all of the intent-alignment techniques he knows of (he mentioned RLHF) stop working once the AI’s capabilities improve enough. Other than that, I am not knowledgeable enough to answer you.