Yes, the IMO challenge falling in 2024 is surprising to me at something like the 1% level, or maybe even more extreme (though that could also go down if I thought about it a lot or if commenters brought up relevant considerations, e.g. I’d look at IMO problems and gold medal cutoffs and think about what tasks ought to be easy or hard; I’m also happy to make more concrete per-question predictions). I do think that there could be huge amounts of progress from picking the low-hanging fruit and scaling up spending by a few orders of magnitude, but I still don’t expect it to get you that far.
I don’t think this is an easy prediction to extract from a trendline, in significant part because you can’t extrapolate trendlines this early that far out. So this is stress-testing different parts of my model, which is fine by me.
At the meta-level, this is the kind of thing I’m looking for, though I’d prefer to have some kind of quantitative measure of how not-surprised you are. If you are only saying 2%, then we probably want to talk about things less far in your tails than the IMO challenge.
Okay, then we’ve got at least one Eliezerverse item, because I’ve said below that I think I’m at least 16% for IMO theorem-proving by end of 2025. The drastic difference here causes me to feel nervous, and my second-order estimate has probably shifted some in your direction just from hearing you put 1% on 2024, but that’s irrelevant because it’s first-order estimates we should be comparing here.
So we’ve got huge GDP increases for before-End-days signs of Paulverse and quick IMO proving for before-End-days signs of Eliezerverse? Pretty bare portfolio but it’s at least a start in both directions. If we say 5% instead of 1%, how much further would you extend the time limit out beyond 2024?
I also don’t know at all what part of your model forbids theorem-proving from falling in a shocking headline, followed by another headline a year later—it doesn’t sound like it’s from looking at a graph—and I think that explaining the reasons behind our predictions in advance, not just making quantitative predictions in advance, will help others a lot here.
EDIT: Though the formal IMO challenge has a barnacle about the AI being open-sourced, which is a separate sociological prediction I’m not taking on.
I think an IMO gold medal could come well before massive economic impact; I’m just surprised if it happens in the next 3 years. After a bit more thinking (but not actually looking at IMO problems or the state of theorem proving) I probably want to bump that up a bit, maybe to 2%; it’s hard to reason about the tails.
I’d say <4% on end of 2025.
I think this is the flipside of me having an intuition where I say things like “AlphaGo and GPT-3 aren’t that surprising”—I have a sense for what things are and aren’t surprising, and not many things happen that are so surprising.
If I’m at 4% and you’re at 12% and we have 8 such bets, then I gain a factor of ~2 if they all come out my way, and you gain a factor of ~1.5 if one of them comes out your way.
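(For concreteness, here is a minimal sketch of that arithmetic, assuming 8 independent bets scored by naive likelihood ratios; that is my reading of the setup, not necessarily the exact scoring rule intended, and it comes out close to the factors above.)

```python
# Rough check of the factors quoted above, under the stated assumptions:
# 8 independent events, one forecaster at 4% each, the other at 12% each.
p_low, p_high, n_bets = 0.04, 0.12, 8

# If none of the events happen, each resolution favors the 4% forecaster
# by (1 - 0.04) / (1 - 0.12) per bet, compounding across all 8 bets.
all_miss = ((1 - p_low) / (1 - p_high)) ** n_bets
print(f"All 8 resolve 'no':  factor {all_miss:.2f} for the 4% forecaster")  # ~2.0

# If exactly one event happens, that bet favors the 12% forecaster by
# 0.12 / 0.04 = 3, while the other 7 still favor the 4% forecaster.
one_hit = (p_high / p_low) * ((1 - p_high) / (1 - p_low)) ** (n_bets - 1)
print(f"One resolves 'yes': factor {one_hit:.2f} for the 12% forecaster")   # ~1.6
```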
I might think more about this and get a more coherent probability distribution, but unless I say something else by end of 2021, you can consider 4% by end of 2025 as my prediction.
Maybe another way of phrasing this—how much warning do you expect to get, how far out does your Nope Vision extend? Do you expect to be able to say “We’re now in the ‘for all I know the IMO challenge could be won in 4 years’ regime” more than 4 years before it happens, in general? Would it be fair to ask you again at the end of 2022 and every year thereafter if we’ve entered the ‘for all I know, within 4 years’ regime?
Added: This question fits into a larger concern I have about AI soberskeptics in general (not you, the soberskeptics would not consider you one of their own) where they saunter around saying “X will not occur in the next 5 / 10 / 20 years” and they’re often right for the next couple of years, because for any particular X there’s only one year in which it shows up, and most years are not that year; but they’re also saying exactly the same thing up until 2 years before X shows up, if there’s any early warning on X at all. It seems to me that 2 years is about as far as Nope Vision extends in real life, for any case that isn’t completely slam-dunk; when I called upon those gathered AI luminaries to say the least impressive thing that definitely couldn’t be done in 2 years, and they all fell silent, and then a single one of them named Winograd schemas, they were right that Winograd schemas at the stated level didn’t fall within 2 years, but very barely so (they fell the year after). So part of what I’m flailingly asking here is whether you think you have reliable and sensitive Nope Vision that extends out beyond 2 years, in general, such that you can go on saying “Not for 4 years” up until we are actually within 6 years of the thing, and then, you think, your Nope Vision will actually flash an alert and you will change your tune before you are actually within 4 years of the thing. Or maybe you think you’ve got Nope Vision extending out 6 years? 10 years? Or maybe theorem-proving is just a special case, and usually your Nope Vision would be limited to 2 years or 3 years?
This is all an extremely Yudkowskian frame on things, of course, so feel free to reframe.
I think I’ll get less confident as our accomplishments get closer to the IMO grand challenge. Or maybe I’ll get much more confident if we scale up from $1M → $1B and pick the low-hanging fruit without getting fairly close, since at that point further progress gets a lot easier to predict.
There’s not really a constant time horizon for my pessimism; it depends on how long and robust a trend you are extrapolating from. 4 years feels like a relatively short horizon, because theorem-proving has not had much investment, so compute can be scaled up several orders of magnitude, there is likely lots of low-hanging fruit to pick, and we just don’t have much to extrapolate from (compared to more mature technologies, or to how I expect AI to look shortly before the end of days); for similar reasons there aren’t really any benchmarks to extrapolate from.
(Also note that it matters a lot whether you know what problems labs will try to take a stab at. For the purpose of all of these forecasts, I am trying insofar as possible to set aside all knowledge about what labs are planning to do, though that’s obviously not incentive-compatible and there’s no particular reason you should trust me to do that.)