This is a pretty strong update against LLMs for me. I would have expected them to perform okay against a random model given free access to the board state and list of legal moves. I suspect I could probably win blind (and I am a serious player; certainly others can win multiple blind games at once), so this is not entirely a perception issue. On the other hand, o1 is certainly getting some traction, which often precedes steady improvement (based on the last couple of years). But… like, it’s basically doing a super overpriced tree search. I’m guessing a tree search to depth 3 with a naive heuristic is already enough to beat a random player, so I’m not convinced that the LLM is lifting any weight here.
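To make the depth-3 guess a bit more concrete, here is a rough sketch of what "tree search to depth 3 with a naive heuristic" against a random player could look like. Everything in it is an assumption for illustration: it uses tic-tac-toe as a stand-in so the example stays self-contained, the heuristic (count of lines still open for each side) is just one naive choice, and the function names are made up. It says nothing about how o1 works internally, and results on a toy game may not transfer to a larger one.

```python
import random

# All eight winning lines on a 3x3 board indexed 0..8.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]


def winner(board):
    """Return 'X' or 'O' if that side has three in a row, else None."""
    for a, b, c in LINES:
        if board[a] != ' ' and board[a] == board[b] == board[c]:
            return board[a]
    return None


def legal_moves(board):
    return [i for i, cell in enumerate(board) if cell == ' ']


def other(player):
    return 'O' if player == 'X' else 'X'


def heuristic(board, player):
    """Naive evaluation (an assumption): lines still open for `player`
    minus lines still open for the opponent."""
    opp = other(player)
    mine = sum(1 for line in LINES if all(board[i] != opp for i in line))
    theirs = sum(1 for line in LINES if all(board[i] != player for i in line))
    return mine - theirs


def negamax(board, player, depth):
    """Score the position for `player` (the side to move), searching
    `depth` more plies before falling back to the heuristic."""
    w = winner(board)
    if w is not None:
        # Adding the remaining depth makes faster wins score higher.
        return (100 + depth) if w == player else -(100 + depth)
    moves = legal_moves(board)
    if not moves:
        return 0  # draw
    if depth == 0:
        return heuristic(board, player)
    best = -1000
    for m in moves:
        board[m] = player
        score = -negamax(board, other(player), depth - 1)
        board[m] = ' '  # undo the move
        best = max(best, score)
    return best


def best_move(board, player, depth=3):
    """Pick the move with the best depth-limited score (first one on ties)."""
    best_score, choice = -1000, None
    for m in legal_moves(board):
        board[m] = player
        score = -negamax(board, other(player), depth - 1)
        board[m] = ' '
        if score > best_score:
            best_score, choice = score, m
    return choice


def play_game(rng, searcher='X', depth=3):
    """Play one game: `searcher` uses the tree search, the other side moves randomly."""
    board = [' '] * 9
    player = 'X'
    while winner(board) is None and legal_moves(board):
        if player == searcher:
            move = best_move(board, player, depth)
        else:
            move = rng.choice(legal_moves(board))
        board[move] = player
        player = other(player)
    return winner(board)


def tally(games=100, seed=0, depth=3):
    """Count results from the searcher's point of view, alternating who moves first."""
    rng = random.Random(seed)
    results = {'win': 0, 'loss': 0, 'draw': 0}
    for g in range(games):
        searcher = 'X' if g % 2 == 0 else 'O'
        w = play_game(rng, searcher, depth)
        if w is None:
            results['draw'] += 1
        elif w == searcher:
            results['win'] += 1
        else:
            results['loss'] += 1
    return results


if __name__ == '__main__':
    print(tally())
```

Swapping in a different game would mean replacing `winner`, `legal_moves`, and `heuristic` (for example with a material count); the search itself would stay the same, which is roughly the point of the comparison.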