I’m unsure what you’re expecting or looking for here.
There does seem to be a clear answer, though—just look at Bing chat and extrapolate. Absent “RL on ethics,” present-day AI would be more chaotic, generate more bad experiences for users, increase user productivity less, get used far less, and be far less profitable for the developers.
Bad user experiences are a very straightforwardly bad outcome. Lower productivity is a slightly less local bad outcome. Less profit for the developers is an even-less-local good outcome, though it’s hard to tell how big a deal this turns out to be.
Why would less RL on ethics reduce productivity? Most work-use of AI has nothing to do with ethics.
In fact, since RLHF decreases model capability (AFAIK), wouldn’t skipping it actually increase productivity, because the models would be more capable?