You can browse an archive of the NYT Connections puzzles I used in my leaderboard. The scoring I use allows only one attempt per puzzle and gives partial credit, so if you make a mistake after getting 1 line correct, you score 0.25 for that puzzle. Top humans score near 100%. Top LLMs score around 30%. Timing is not taken into account.
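To make the partial-credit rule concrete, here is a minimal sketch of how such a score could be computed. It is not the leaderboard's actual code: the function name `score_puzzle`, the input format, and the placeholder words are all hypothetical, and it assumes each of the four lines is worth 0.25 and that scoring stops at the first wrong line.

```python
from typing import Iterable, Sequence


def score_puzzle(guesses: Sequence[Iterable[str]],
                 solution: Sequence[Iterable[str]]) -> float:
    """Score one attempt: credit for each correct line before the first mistake.

    Assumption (not from the leaderboard's code): every line is worth an
    equal share of the puzzle, and the first wrong line ends the attempt.
    """
    # Lines of the solution that have not been matched yet; word order
    # inside a line does not matter, so each line is stored as a frozenset.
    remaining = {frozenset(group) for group in solution}
    correct = 0
    for guess in guesses:
        guessed = frozenset(guess)
        if guessed not in remaining:
            break  # only one try: the first mistake ends the puzzle
        remaining.discard(guessed)
        correct += 1
    return correct / len(solution)


# Hypothetical puzzle with placeholder words.
SOLUTION = [
    ["a1", "a2", "a3", "a4"],
    ["b1", "b2", "b3", "b4"],
    ["c1", "c2", "c3", "c4"],
    ["d1", "d2", "d3", "d4"],
]

# One correct line, then a mistake: 1 of 4 lines, so 0.25.
attempt = [
    ["a4", "a3", "a2", "a1"],
    ["b1", "b2", "b3", "c1"],
]
print(score_puzzle(attempt, SOLUTION))
```

Under these assumptions, the leaderboard percentage would simply be the mean of the per-puzzle scores across the archive.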