I agree that human feedback does not ensure safety; what I meant is that if it is necessary for the AI to function, it restricts how smart or powerful the AI can become.
Necessary-at-stage-1 is not the same as necessary-at-stage-2. A lot of people seem to use the word “safety” for what is really a single medium-strength obstacle covering one slice of the total risk pie.
Agreed. (Alternatively, maybe this could end up looking like obedient AI? Not sure.)