How about having a smaller model govern safety regulations? It could act as an “aligner” on top of LLMs, say, trained with some sort of RLHF focused just on mitigating risks.