How has this forecast changed in the last 5 years? Has the widespread and rapid advance of non-transformative somewhat-general-purpose LLMs changed any of your component predictions?
I don’t actually disagree, but MUCH of the cause of this is an excessively high bar (as you point out, but it still makes the title misleading). It’s hard to put much stake in “perform nearly all valuable tasks at human cost or less” when “cost” is so hard to define at scale in an AGI era. Money changes meaning when a large subset of human action is no longer necessary. And even if we could somehow agree on a unit of measure, comparing the cost of creating a robot with the cost of creating and raising a human is probably impossible.
> How has this forecast changed in the last 5 years? Has the widespread and rapid advance of non-transformative somewhat-general-purpose LLMs changed any of your component predictions?
We didn’t have this framework 5 years ago, but the tremendous success of LLMs can only be a big positive update, I think. That said, some negative updates for me from the past 15 years have been how slowly Siri, Wolfram Alpha, and Alexa improved. I genuinely expected faster progress from their data flywheels after launch, but somehow it didn’t happen. Self-driving seems middle-of-the-road compared to how I thought it would go 5 years ago.
> I don’t actually disagree, but MUCH of the cause of this is an excessively high bar (as you point out, but it still makes the title misleading).
Agreed. I think the “<1%” headline feels like an aggressive claim, but the definition we use from the contest is a very high bar. For lower bars, we’d forecast much higher probabilities. We expect great things from AI and AGI, and we are not reflexively bearish on progress.