Thank you for posting this. Why do you think this is a steganography evidence in LLMs? Those steg tokens would be unrelated to the question being asked and as such be out of usual distribution and easily noticeable by an eavesdropper. Yet, this is a good evidence for hidden reasoning inside CoT. I think this experiment was done in https://arxiv.org/abs/2404.15758, Pfau, Merrill, and Bowman, ‘Let’s Think Dot by Dot’.
Thank you for posting this. Why do you think this is a steganography evidence in LLMs? Those steg tokens would be unrelated to the question being asked and as such be out of usual distribution and easily noticeable by an eavesdropper. Yet, this is a good evidence for hidden reasoning inside CoT. I think this experiment was done in https://arxiv.org/abs/2404.15758, Pfau, Merrill, and Bowman, ‘Let’s Think Dot by Dot’.