I just tried multiplying 13-digit numbers with o3-mini (high). My approach was to ask it to explain a basic multiplication algorithm to me, and then carry it out. On the first try it was lazy and didn’t actually follow the algorithm (it just told me “it would take a long time to actually carry out all the shifts and multiplications...”), and it got the result wrong.
Then I told it to follow the algorithm even if it is time-consuming. It did, and the result was correct.
So I’m not sure about the take that
“The fact that something that has ingested the entirety of human literature can’t figure out how to generalize multiplication past 13 digits is actually a sign of the fact that it has no understanding of what a multiplication algorithm is.”
The model got lazy and did some part of the calculation “in its head” (i.e. it didn’t actually follow the algorithm but guesstimated the result, like we would if we were asked to do a task like that without pencil and paper), and got the result slightly wrong. But when you ask it to actually follow the multiplication algorithm it just explained to me, it can absolutely do it.
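For reference, here is a minimal sketch of the kind of schoolbook algorithm I mean. This is my own illustration, not what the model actually wrote; the function name `long_multiply` and the digit-string representation are assumptions made for the example.

```python
def long_multiply(a: str, b: str) -> str:
    """Multiply two non-negative decimal integers given as digit strings,
    using digit-by-digit schoolbook multiplication with explicit carries."""
    # Store digits least-significant first so index == power of ten.
    x = [int(ch) for ch in reversed(a)]
    y = [int(ch) for ch in reversed(b)]

    # The product of an m-digit and an n-digit number has at most m + n digits.
    result = [0] * (len(x) + len(y))

    for i, dx in enumerate(x):
        carry = 0
        for j, dy in enumerate(y):
            # Add the partial product into the column shifted by i + j.
            total = result[i + j] + dx * dy + carry
            result[i + j] = total % 10
            carry = total // 10
        # Whatever carry is left goes into the next column up.
        result[i + len(y)] += carry

    # Drop leading zeros (which sit at the end of the reversed list).
    while len(result) > 1 and result[-1] == 0:
        result.pop()

    return "".join(str(d) for d in reversed(result))


if __name__ == "__main__":
    a, b = "1234567890123", "9876543210987"
    # Cross-check against Python's built-in arbitrary-precision integers.
    print(long_multiply(a, b) == str(int(a) * int(b)))
```

Nothing here gets harder at 13 digits than at 3. There are just more steps, and skipping any of them is what gives a slightly wrong answer.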
I’d be interested in the CoT that led to the incorrect conclusion. If the model actually believed that its lazy estimation leads to the correct result, that shows that it’s overestimating its own capabilities. One could call this a fundamental misunderstanding of multiplication (I know that my result is inexact when I’m estimating it in my head, because I understand stuff about multiplication), or one could call it a failure of introspection.
The other possibility is that it didn’t care about producing an entirely correct result and simply got lazy.
I would assume this is because wasting time (which is to the detriment of your opponent, and which he cannot control) in the first example is not instrumental to achieving your goal. It is merely a side effect. “Thou shalt not profit from wasting time”.
If playing optimally involves making decisions that make the game go longer (such as waiting to draw additional countermagic or whatever), so be it.
That said, I’m surprised Zvi said “match wp” here—I assume this is an oversight on his part. He should just have written “game wp”.