What public knowledge has led you to this estimate?

- The interaction pattern used for o1 (where the AI thinks for a while before sending a response) is consistent with neuralese.
- RL adding substantial additional capabilities means there might be enough RL for this to work.
- o3 seems to be a substantial leap over o1.

To be clear, I think a large update against neuralese is that this seems like the sort of thing that would be pretty likely to leak, and I'm not aware of any public leaks. Probably this should yield more like a 10% estimate; I didn't think very carefully about the 25%.

Makes sense. Perhaps we'll know more when o3 is released. If the model doesn't offer a summary of its CoT, that makes neuralese more likely.

From https://x.com/__nmca__/status/1870170101091008860:

> o1 was the first large reasoning model — as we outlined in the original "Learning to Reason" blog, it's "just" an LLM trained with RL. o3 is powered by further scaling up RL beyond o1
@ryan_greenblatt Shouldn’t this be interpreted as a very big update vs. the neuralese-in-o3 hypothesis?
No
An LLM trained with a sufficient amount of RL might be able to learn to compress its thoughts into more efficient representations than English text, which seems consistent with the statement. I'm not sure whether this is possible in practice; I've asked here whether anyone knows of public examples.
Yes, I would count it if the CoT is total gibberish that is (steganographically) encoding reasoning.
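For concreteness, here is a toy sketch (purely illustrative, not anything from the thread or from how a real model would do it) of how a text channel can steganographically carry hidden bits: innocuous-looking output encodes a payload through the choice between synonym pairs, so the surface text reads as filler while still transmitting information.

```python
# Toy steganographic channel: each hidden bit selects one word
# from a synonym pair, so the word choices encode the payload.
SYNONYM_PAIRS = [
    ("perhaps", "maybe"),
    ("large", "big"),
    ("answer", "result"),
    ("therefore", "thus"),
]

def encode(bits):
    """Pick one word from each pair according to the hidden bit."""
    return " ".join(pair[b] for pair, b in zip(SYNONYM_PAIRS, bits))

def decode(text):
    """Recover the hidden bits from the word choices."""
    words = text.split()
    return [pair.index(w) for pair, w in zip(SYNONYM_PAIRS, words)]

hidden = [1, 0, 1, 1]
cot = encode(hidden)  # "maybe large result thus" — looks like filler
assert decode(cot) == hidden
```

A model's CoT encoding reasoning this way would look like ordinary (or gibberish) text to a reader while functioning as a non-English computational scratchpad, which is the scenario the thread is debating.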