it’s quite common for assistants to refuse instructions, especially harmful instructions. so i’m not surprised that base llms systematically refuse harmful instructions more often than harmless ones.
Indeed. The base LLM would likely predict a “henchman” to be a lot less scrupulous than an “assistant”.