Another issue is that a lot of o1’s thoughts consist of vague filler like “reviewing the details” or “considering the implementation”, and it’s not clear how to even determine whether these steps are inferentially valid.
If you’re referring to the chain-of-thought summaries you see when you select the o1-preview model in ChatGPT, those are not the full chain of thought. Examples of the actual chain of thought can be found on the “Learning to Reason with LLMs” page, with a few more examples in the o1 system card. Note that we are going off of OpenAI’s word that these chain-of-thought examples are representative; if you try to figure out what actual reasoning o1 used to come to a conclusion, you will run into the good old “Your request was flagged as potentially violating our usage policy. Please try again with a different prompt.”
It’s also noteworthy that, judging from the summaries, people are reporting other blatant confabulations in the o1 chains, much more so than simply making up a plausible URL: https://www.reddit.com/r/PromptEngineering/comments/1fj6h13/hallucinations_in_o1preview_reasoning/ Stuff which makes no sense in context and just comes out of nowhere. (And since confabulation seems to be pretty minimal in summarization tasks these days, where the issues I find in summaries are usually omissions of important material rather than wildly spurious inventions, I expect those confabulations were not introduced by the summarizer but were indeed present in the original chain as summarized.)
If you distrust OA’s selection, it seems that o1 is occasionally leaking its chains of thought: https://www.reddit.com/r/OpenAI/comments/1fxa6d6/two_purported_instances_of_o1preview_and_o1mini/ So you can cross-reference those leaks against OA’s published examples to see if its choices seem censored somehow, and also just treat them as additional data.