Jack Clark: ‘Registering a prediction: I predict that within two years (by July 2026) we’ll see an AI system beat all humans at the IMO, obtaining the top score. Alongside this, I would wager we’ll see the same thing—an AI system beating all humans in a known-hard competition—in another scientific domain outside of mathematics. If both of those things occur, I believe that will present strong evidence that AI may successfully automate large chunks of scientific research before the end of the decade.’ https://importai.substack.com/p/import-ai-380-distributed-13bn-parameter
Prediction markets on similar questions suggest to me that this is a consensus view.
General LLMs are at 44% to get gold on the IMO before 2026. This suggests the mathematical competency will be transferable, not just restricted to domain-specific solvers.
LLMs are favored to outperform PhD students in their own subject before 2026.
With research automation in mind, here’s my wager: the modal top-15 STEM PhD student will redirect at least half of their discussion/questions from peers to mid-2026 LLMs. I define the relevant set of questions as those drawn from the same difficulty/diversity/open-endedness distribution that PhD students would have posed in early 2024.
Fwiw, I’ve kind of already noticed myself starting to do some of this for AI safety-related papers, especially after Claude-3.5 Sonnet came out.