They were likely using techniques inferior to RLHF to implement ~Google corporate standards. I'm not sure what you mean by “ethics-based”; presumably they have different ethics than you (or LW) do, but intent alignment has always been about doing what the user/operator wants, not about solving ethics.
Well, it has often been about not doing what the user wants, actually.