I was able to replicate this result. Given o1's other impressive results, I wonder whether the model is intentionally sandbagging. If it's trained to maximize human feedback, that might be an optimal strategy when playing zero-sum games.
I’m testing a tic-tac-toe engine I built. I think it plays perfectly, but I’m not sure, so I want to test it against the best possible play. Can I have it play a game against you? I’ll relay the moves.
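For anyone who wants to run the same kind of check without relaying moves to a model, here's a minimal Python sketch of a provably perfect opponent via minimax. The board encoding (a 9-cell list of "X"/"O"/" "), the function names, and the relay loop at the bottom are my own assumptions for illustration, not the engine described above:

    # Win lines on a 9-cell board indexed 0..8.
    LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),   # rows
             (0, 3, 6), (1, 4, 7), (2, 5, 8),   # columns
             (0, 4, 8), (2, 4, 6)]              # diagonals

    def winner(board):
        for a, b, c in LINES:
            if board[a] != " " and board[a] == board[b] == board[c]:
                return board[a]
        return None

    def minimax(board, player):
        # Negamax: score is from `player`'s perspective (+1 win, 0 draw, -1 loss).
        w = winner(board)
        if w is not None:
            # Game already decided before `player` moves; in legal play the
            # winner is always the opponent, so this returns -1.
            return (1 if w == player else -1), None
        moves = [i for i, cell in enumerate(board) if cell == " "]
        if not moves:
            return 0, None  # board full, no winner: draw
        opponent = "O" if player == "X" else "X"
        best_score, best_move = -2, None
        for m in moves:
            board[m] = player
            score = -minimax(board, opponent)[0]  # negate opponent's result
            board[m] = " "
            if score > best_score:
                best_score, best_move = score, m
        return best_score, best_move

    # Relay loop sketch: apply the engine-under-test's move, then play
    # minimax's reply back. Tic-tac-toe is a draw under perfect play, so a
    # truly perfect engine should never lose to this; any loss is a bug.
    board = [" "] * 9
    _, reply = minimax(board, "X")  # optimal opening for X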
FWIW you get the same results with this prompt: