Gemini 2.0 Flash Thinking is claimed to ‘transparently show its thought process’ (in contrast to o1, which only shows a summary): https://x.com/denny_zhou/status/1869815229078745152. This might be at least somewhat helpful for studying how faithful (vs. e.g. steganographic) the Chains of Thought are.
Huge if true! Faithful Chain of Thought may be a key factor in whether the promise of LLMs as ideal for alignment pays off, or not.
I am increasingly concerned that OpenAI isn’t showing us o1’s CoT because it’s using lots of jargon that’s heading toward a private language. I hope it’s merely that it didn’t want to show its unaligned “thoughts”, and to prevent competitors from training on its useful chains of thought.
IMO, most of the reason they are not releasing the CoT for o1 is exactly PR/competitive concerns, or this reason in a nutshell:
I hope it’s merely that it didn’t want to show its unaligned “thoughts”, and to prevent competitors from training on its useful chains of thought.
Other recent models that (at least purportedly) show the full CoT:
DeepSeek R1-Lite (note that you have to turn ‘DeepThink’ on)
Qwen QwQ-32B