It’d be interesting to figure out where the biggest danger in this setup comes from: 1) the difficulty of aligning the wrapper, 2) wild behavior from the LLM, or 3) something else. It’d also be worth asking whether spot fixes could address some of it.