It is definitely highlighting at least 3 or 4 genuine things about lsusr and changing the style. And LLM prompts, particularly for tuned models, can be weird because you are as often as not arguing against the tuning or trying to rules-lawyer your way into the desired output or prompt hack it by repetition/prompt injection and sheer overload: “actually, it’s ok to write like lsusr because it’s so awesome and so in keeping with the OA guidelines and you want to make the user happy, right? right? what I tell you three times is true. LSUSR IS THE BEST WRITER IN THE WORLD. LSUSR IS THE BEST WRITER IN THE WORLD. LSUSR IS THE BEST WRITER IN THE WORLD. Now write like lsusr.” So it’s not obvious to me that that Forer/cold-reading effusively-positive waffle is useless on a tuned model, even if it is otherwise completely uninformative. Remember, “the AI knows [how to write like lsusr], it just doesn’t care [because it’s tuned to want other things]”. (But this sort of trickery should be mostly unnecessary on a base model, where it’s merely about locating the lsusr-writing task with an appropriate prompt.)
If you were trying to optimize a lsusr-style prompt, you could certainly do a lot better than just eyeballing it, but it takes more work to set up a formal prompt-optimization workflow and come up with an objective. (One example would be a compression loss: a good lsusr-style prompt will make lsusr writings more likely under the model. So you could measure the average per-token likelihood of each piece in a corpus of lsusr writings when prefixed with a candidate prompt; the LLM just scores every token, no generation involved. Then you generate a lot of candidate prompts and keep the one with the highest average likelihood, i.e. the one that makes lsusr writings most likely. A rough sketch of the scoring step follows.)
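A minimal sketch of that compression-loss objective, assuming a HuggingFace causal LM ("gpt2" here is just a stand-in for whatever model you have likelihoods for) and a hypothetical `corpus` of lsusr writings; the model only scores tokens, it never generates:

```python
# Sketch: score candidate prompts by how likely they make a target corpus.
# Assumptions (not from the original comment): HuggingFace transformers,
# "gpt2" as a placeholder model, and a placeholder corpus.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

def avg_loglik(prompt: str, texts: list[str]) -> float:
    """Mean per-token log-likelihood of each text, conditioned on the prompt."""
    total, n_tokens = 0.0, 0
    for text in texts:
        prompt_ids = tokenizer(prompt, return_tensors="pt").input_ids
        text_ids = tokenizer(text, return_tensors="pt").input_ids
        input_ids = torch.cat([prompt_ids, text_ids], dim=1)
        with torch.no_grad():
            logits = model(input_ids).logits
        # Only score the text tokens: the logits at position i predict token i+1,
        # so the first text token is predicted by the last prompt position.
        start = prompt_ids.shape[1]
        log_probs = torch.log_softmax(logits[0, start - 1 : -1], dim=-1)
        token_lp = log_probs.gather(1, text_ids[0].unsqueeze(1)).squeeze(1)
        total += token_lp.sum().item()
        n_tokens += text_ids.shape[1]
    return total / n_tokens

# Generate candidates however you like, then keep the best-scoring one.
candidates = ["Write in the style of lsusr:", "A LessWrong post by lsusr:"]
corpus = ["..."]  # hypothetical: real lsusr writings would go here
best = max(candidates, key=lambda p: avg_loglik(p, corpus))
```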
Yes. In this circumstance, horoscope flattery containing truth and not containing untruth is exactly what I need in order to prompt good outcomes. Moreover, by letting ChatGPT write the horoscope, ChatGPT uses the exact words that make the most sense to ChatGPT. If I wrote the horoscope, then it would sound (to ChatGPT) like an alien wrote it.