I agree that this is weaker evidence than we would have with access to the original CoT, but I think deviations in the CoT are more likely than the summarizer fumbling that hard.
There are now two alleged instances of full chains of thought leaking (apply an appropriate amount of skepticism), both of which seem coherent enough.
I think it’s more likely that this is just a (non-model) bug in ChatGPT. In the examples you gave, there always seems to be one step that comes completely out of nowhere, and the rest of the chain of thought would make sense without it. This reminds me of the bug where ChatGPT would show other users’ conversations.