They were likely using techniques inferior to RLHF to implement ~Google corporate standards; not sure what you mean by “ethics-based”. Presumably they have different ethics than you (or LW) do, but intent alignment has always been about doing what the user/operator wants, not about solving ethics.
This has nothing to do with ethics, though?
This is just the model hallucinating?
> intent alignment has always been about doing what the user/operator wants, not about solving ethics.
Well, it has often been about not doing what the user wants, actually.