I think this post is epistemically weak (which does not mean I disagree with you):
Your post pushes back on the claim that “It’s time for EA leadership to pull the short-timelines fire alarm”, arguing that doing so wouldn’t be wise. Problems in the discourse: (1) “pulling the short-timelines fire alarm” isn’t well-defined in the first place; (2) there is a huge inferential gap between “AGI won’t come before 2030” and “EA shouldn’t pull the short-timelines fire alarm” (which could mean something like, e.g., that EA should start planning a Manhattan project for aligning AGI in the next few years); and (3) your statement “we are concerned about a view of the type expounded in the post causing EA leadership to try something hasty and ill-considered”, which only slightly addresses that inferential gap, is a weak rhetorical move: it reads the other post in an extreme, uncharitable way the author didn’t intend, and it doesn’t seriously weigh the pros and cons of taking more initiative. (Though of course it’s not really clear what “taking more initiative” means, and critiquing the other post (which IMO was epistemically very bad) would be totally right.)
You’re not giving a reason why you think timelines aren’t that short, only saying you believe it enough to bet on it. IMO, simply saying “the ‘>30% within 3-7 years’ claim is, compared to the current estimates of many smart people, an extraordinary claim that requires an extraordinary burden of proof, which isn’t provided” would have been better.
Even if not stated explicitly, and even if you don’t endorse it, your post implicitly promotes the statement “EA leadership doesn’t need to shorten its timelines”. I’m not at all confident about this, but it seems to me like EA leadership acts as if we have pretty long timelines, significantly longer than your bets would imply. (Given how the post is written, you should at least have pointed out explicitly that it doesn’t imply EA leadership’s timelines are short enough.)
AGI timelines are so difficult to predict that prediction markets might be heavily outperformed by a few people with very deep models of the alignment problem, like Eliezer Yudkowsky or Paul Christiano. So even if we took many such bets in the form of a prediction market, that wouldn’t be strong evidence that the resulting estimate is good; at best it would be an extremely uncertain one (a toy sketch of why follows at the end of this comment). (I’m not at all saying taking bets is bad, though the doom factor does make taking bets difficult.)
There’s nothing wrong with posting such a post saying you’re willing to bet, as long as you don’t sell it as an argument for why timelines aren’t that short, or for even more downstream claims like what EA leadership should do. What bothers me isn’t that this post got posted, but that it and the post it is counterbalancing received so many upvotes. LessWrong should be a place where good epistemics are very important, not where people cheer for their side by upvoting everything that supports their own opinion.
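To make the prediction-market point above concrete, here is a toy Monte Carlo sketch (all numbers are invented purely for illustration): if most market participants share the same shallow model, their errors are correlated and do not average out, while a few forecasters assumed, by construction, to have better models beat the aggregate.

```python
# Toy Monte Carlo sketch; every number here is invented purely for illustration.
import numpy as np

rng = np.random.default_rng(0)
n_trials = 10_000
true_p = 0.3  # hypothetical "true" probability of the event in question

def brier(forecasts, outcomes):
    # Mean squared error between probabilistic forecasts and 0/1 outcomes (lower is better).
    return float(np.mean((forecasts - outcomes) ** 2))

outcomes = (rng.random(n_trials) < true_p).astype(float)

# A large crowd whose members mostly share the same shallow model: their errors
# have a common component (shared_bias), so averaging does not wash them out.
shared_bias = rng.normal(0.0, 0.20, size=(n_trials, 1))
crowd = np.clip(true_p + shared_bias + rng.normal(0.0, 0.25, size=(n_trials, 200)), 0.01, 0.99)
market_estimate = crowd.mean(axis=1)

# A few forecasters assumed, by construction, to have much smaller error.
experts = np.clip(true_p + rng.normal(0.0, 0.05, size=(n_trials, 3)), 0.01, 0.99)
expert_estimate = experts.mean(axis=1)

print("market Brier score: ", round(brier(market_estimate, outcomes), 3))
print("experts Brier score:", round(brier(expert_estimate, outcomes), 3))
```

Of course this only restates the assumption that the few really do have better models; the empirical question is whether that holds for AGI timelines.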
I share your opinion that the post is misleading. Adding to the list:
Bets don’t pay out until you win them, and that includes epistemic credit, but we need to realize we are in a short-timelines world before those timelines run out. If the authors were to lose this bet, we wouldn’t learn that until it was dangerously late.
There are arguments for updating on betting markets ahead of resolution, but a fair few people have accepted the bet, so that signal does not transparently help the authors’ case.
1:1 odds in 2026 on human-expert MMLU performance, $1B models, >90% MATH, >80% APPS top-1, IMO Gold Medal, or human-like robot dexterity is a short timeline. The only criterion that doesn’t seem to me to support short timelines at 1:1 odds is Tesla FSD, and even there some people might disagree.
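A toy way to see both what 1:1 odds already encode and why the doom factor mentioned above makes such bets unattractive to exactly the people they are meant to draw out (the value_of_payout parameter below is an invented illustration, not part of the actual bet):

```python
def break_even_probability(stake, payout, value_of_payout=1.0):
    """Minimum probability at which taking a bet is worthwhile in expectation.
    You risk `stake` to win `payout`; `value_of_payout` is a toy discount for how
    much the winnings are worth to you in the world where you win (the "doom
    factor"; an invented parameter, not part of the actual bet).
    Accept iff p * payout * value_of_payout >= (1 - p) * stake.
    """
    return stake / (stake + payout * value_of_payout)

print(break_even_probability(1, 1))        # 1:1 odds -> 0.5 for an ordinary bettor
print(break_even_probability(1, 1, 0.25))  # same odds, winnings valued at 25% -> 0.8
```

So someone who is quite confident in short timelines can still rationally decline even odds, which is another reason acceptance or non-acceptance of the bet is weak evidence on its own.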
1:1 odds in 2026 on human-expert MMLU performance, $1B models, >90% MATH, >80% APPS top-1, IMO Gold Medal, or human-like robot dexterity is a short timeline.
I disagree. I think each of these benchmarks will be surpassed well before we are at AGI-levels of capability. That said, I agree that the post was insufficient in justifying why we think this bet is a reasonable reply to the OP. I hope to write a longer, more personal post in the near future that expands on some of my reasoning.
The bet itself was merely a public statement to the effect of “if people are saying these radical things, why don’t they put their money where their mouths are?” I don’t think such statements need to have long arguments attached to them. But I can totally see why people were left confused.
I appreciate that you changed the title, and I think this makes the post a lot more agreeable. It is totally reasonable to make bets without having to justify them, as long as the act of making a bet is not mistaken for stronger evidence than the sustained market movement it corresponds to.
I think each of these benchmarks will be surpassed well before we are at AGI-levels of capability.
Solving any of these tasks in a non-gamed manner just 14 years after AlexNet might not mean we are already at AGI (I can envision a future in which these milestones are reached prior to it), but it is significant evidence that AGI is not too many years out. I can still just about imagine today that neural networks might hit some wall that ultimately limits their understanding, but that wall has to come before neural networks show that they are almost fully general reasoners given the right backpropagation signal; it is, after all, backpropagation that is capable of learning almost arbitrary tasks with almost no task-specialization. An alarm needs to precede alignment catastrophe by long enough that you have time to do something about it; it isn’t much use if it is only there to tell you how you are going to die.
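To make the “almost no task-specialization” point concrete, here is a minimal, purely illustrative sketch (assuming PyTorch; the tasks and hyperparameters are arbitrary): the identical architecture and training loop are reused on two unrelated toy problems, and only the data changes.

```python
# Minimal illustration (assumes PyTorch; tasks and hyperparameters are arbitrary):
# the *same* architecture and the *same* training loop fit two unrelated toy tasks.
# Nothing task-specific changes except the data fed in.
import torch
import torch.nn as nn

def fit(xs, ys, steps=2000):
    model = nn.Sequential(nn.Linear(1, 64), nn.ReLU(), nn.Linear(64, 1))
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    loss_fn = nn.MSELoss()
    for _ in range(steps):
        opt.zero_grad()
        loss = loss_fn(model(xs), ys)
        loss.backward()  # backpropagation does all the adapting to the task
        opt.step()
    return loss.item()

xs = torch.linspace(-3, 3, 256).unsqueeze(1)
print(fit(xs, torch.sin(xs)))      # task A: approximate sin(x)
print(fit(xs, torch.abs(xs) - 1))  # task B: approximate |x| - 1
```

The toy scale says nothing about AGI, of course; the point is only that the adaptation lives entirely in the gradient signal rather than in task-specific machinery.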
Bootstrapping is often painted as a model looking at its own code, thinking really hard, and writing better code that it knows to be better, but this is an extremely strong version of bootstrapping, and you don’t need to come anywhere close to those capabilities before you should start worrying about concrete dangers. I wrote a post that gave an example of a minimum viable FOOM, but that is not the only angle from which to get there, nor the earliest level of capability at which I think things will start breaking. It is worth remembering that evolution optimized its way to humanity from proto-humans that could not have been given IMO Gold Medal questions and been expected to solve them. Evolution isn’t intelligent at all, so it certainly is not the case that you need human-level intelligence before you can optimize on intelligence.
You may PM me for a small optional point I don’t want to make in public.
-
Edit: A more aggressive way of phrasing this is as a challenge: why can’t this work? Concretely, what specific capability do you think is missing for machine intelligence to start doing transformative things like AI research? If a machine is grounded in language and other sensory media, and is also capable of taking a novel mathematical question and inventing not just the answer but the method by which it is solved, why can’t it apply that ability to other tasks it is able to talk about? Many models have shown that reasoning abilities transfer, and that agents trained on general domains do similarly well across them.
There’s nothing wrong with posting such a post saying you’re willing to bet, as long as you don’t sell it as an argument for why timelines aren’t that short, or for even more downstream claims like what EA leadership should do.
I don’t agree that we sold our post as an argument for why timelines aren’t short. Thus, I don’t think this objection applies.
That said, I do agree that the initial post deserves a much longer and more nuanced response. While I don’t think it’s fair to demand that every response be nuanced and long, I do agree that our post could have been a bit better at responding to the object-level claims. For what it’s worth, I do hope to write a far more nuanced and substantive take on these issues in the relatively near term.
I don’t agree that we sold our post as an argument for why timelines are short. Thus, I don’t think this objection applies.
You probably mean “why timelines aren’t short”. I didn’t think you explicitly took it to be an argument against short timelines, but because the post got so many upvotes, I’m worried that many people implicitly perceive it as such, and the way the post is written contributes to that. But it’s great that you changed the title; that already makes it a lot better!
That said, I do agree that the initial post deserves a much longer and more nuanced response.
I don’t really think the initial post deserves a nuanced response. (My response would have been “the ‘>30% within 3-7 years’ claim is, compared to the current estimates of many smart people, an extraordinary claim that requires an extraordinary burden of proof, which isn’t provided”.) But I do think that the community (and especially EA leadership) should probably carefully reevaluate timelines, considering the arguments for short timelines and how good they are, so it’s great if you are planning a careful analysis of timeline arguments!