We can also learn something about how o1 was trained from the capabilities it exhibits. Any proposed training procedure must be compatible with the following capabilities:
Error Correction: “[o1] learns to recognize and correct its mistakes.”
Factoring: “[o1] learns to break down tricky steps into simpler ones.”
Backtracking: “[o1] learns to try a different approach when the current one isn’t working.”
I would be cautious of drawing particularly strong conclusions from isolated sentences in an announcement post. The purpose of the post is marketing, not technical accuracy. It wouldn’t be unusual for engineers at a company to object to technical inaccuracies in marketing material and have their complaints ignored.
There probably aren’t going to be any blatant lies in the post, but something like “It’d sound cool if we said that the system learns to recognize and correct its mistakes, would there be a way of interpreting the results like that if you squinted the right way? You’re saying that in principle yes, but yes in a way that would also apply to every LLM since GPT-2? Good enough, let’s throw that in” seems very plausible.
I wouldn’t be that surprised to hear about something like this happening. But in this case, it seems pretty clear that o1 is correcting its own mistakes in a way that past GPTs essentially never did, if you look at the CoT examples in the announcement (e.g. the “Cipher” example).
The examples they provide in one of the announcement blog posts (under the “Chain of Thought” section) suggest this is more than just marketing hype (even if these examples are cherry-picked):
Here are some excerpts from two of the eight examples:
Cipher:
Hmm.
But actually in the problem it says the example:
...
Option 2: Try mapping as per an assigned code: perhaps columns of letters?
Alternatively, perhaps the cipher is more complex.
Alternatively, notice that “oyfjdnisdr” has 10 letters and “Think” has 5 letters.
...
Alternatively, perhaps subtract: 25 −15 = 10.
No.
Alternatively, perhaps combine the numbers in some way.
Alternatively, think about their positions in the alphabet.
Alternatively, perhaps the letters are encrypted via a code.
Alternatively, perhaps if we overlay the word ‘Think’ over the cipher pairs ‘oy’, ‘fj’, etc., the cipher is formed by substituting each plaintext letter with two letters.
Alternatively, perhaps consider the ‘original’ letters.
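For context on where this search ends up: the rule the model eventually lands on in the Cipher example is that each pair of ciphertext letters maps to the plaintext letter whose alphabet position is the average of the pair’s positions (e.g. ‘o’ = 15, ‘y’ = 25, average 20 = ‘t’). A minimal sketch of that decoding rule:

```python
def decode(ciphertext: str) -> str:
    """Decode by averaging the alphabet positions of each letter pair.

    Assumes an even-length, all-letter ciphertext where every pair of
    positions sums to an even number (as in the o1 'Cipher' example).
    """
    pos = lambda c: ord(c.lower()) - ord('a') + 1  # 'a' -> 1, ..., 'z' -> 26
    pairs = [ciphertext[i:i + 2] for i in range(0, len(ciphertext), 2)]
    return ''.join(chr((pos(a) + pos(b)) // 2 + ord('a') - 1) for a, b in pairs)

print(decode("oyfjdnisdr"))  # -> "think"
```

This also shows why the excerpt’s “subtract: 25 − 15 = 10” attempt fails: subtraction gives 10 (‘j’), while the average gives 20 (‘t’), the correct first letter.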
Science:
Wait, perhaps more accurate to find Kb for F^− and compare it to Ka for NH4+. ... But maybe not necessary. ... Wait, but in our case, the weak acid and weak base have the same concentration, because NH4F dissociates into equal amounts of NH4^+ and F^- ... Wait, the correct formula is:
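(For reference, and not as a reconstruction of the formula truncated in the excerpt: the comparison the model is reaching for only needs the standard conjugate-pair relation Ka · Kb = Kw ≈ 1.0 × 10^−14 at 25 °C, so Kb for F^− is Kw / Ka(HF) and Ka for NH4^+ is Kw / Kb(NH3), which can then be compared directly.)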
I don’t have strong takes on what exactly is happening in this particular case, but I agree that companies (and more generally, people in high-pressure positions) very frequently do the kind of thing you describe. I don’t see any indication that leading AI labs would be an exception.