ryan_greenblatt comments on Some negative steganography results

ryan_greenblatt 9 Dec 2023 23:18 UTC
5 points

I have to say, it does seem a bit implausible to me that even GPT-3.5-turbo is unable to do so when GPT-3.5 is so powerful and your steganography instance is such a simple one.

The “poor man’s RL” scheme I used here is quite weak (and perhaps poorly tuned). I suspect this is the primary issue, though the limitations of the OpenAI finetuning API that you described could also contribute.