Grok 3 told me 9.11 > 9.9. (common with other LLMs too), but again, turning on Thinking solves it.
This is unrelated to Grok 3, but I am not convinced that the above part of Andrej Karpathy’s tweet is a “gotcha”. Software version numbers use dots with a different meaning than decimal numbers, and under that reading 9.11 > 9.9 would be correct (minor version 11 comes after minor version 9). I don’t think there is a clear correct choice of which of these contexts an LLM should assume if it only gets these few tokens.
E.g. if I ask Claude, the bare “is 9.11>9.9” question gives me a no, whereas “I am trying to install a Python package. Could you tell me whether `9.11>9.9`?” gives me a yes.
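
To make the two readings concrete, here is a minimal Python sketch. The helper `as_version` is a hypothetical name of mine, not part of any package manager, and it assumes purely numeric dotted versions (no suffixes like `rc1`); real tools follow more detailed rules.

```python
from decimal import Decimal


def as_version(text):
    """Split a dotted version string into a tuple of integers.

    Hypothetical helper for illustration only: assumes every
    component is a plain non-negative integer.
    """
    return tuple(int(part) for part in text.split("."))


# Reading 1: decimal numbers. 9.11 is 9 + 11/100, 9.9 is 9 + 90/100.
decimal_result = Decimal("9.11") > Decimal("9.9")

# Reading 2: version numbers. Components compare left to right as integers,
# so this is (9, 11) > (9, 9).
version_result = as_version("9.11") > as_version("9.9")

print("as decimals:", decimal_result)  # False
print("as versions:", version_result)  # True
```

The same few tokens give opposite answers depending on which comparison you pick, which is why the surrounding context matters.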