Maybe another way of phrasing this—how much warning do you expect to get, how far out does your Nope Vision extend? Do you expect to be able to say “We’re now in the ‘for all I know the IMO challenge could be won in 4 years’ regime” more than 4 years before it happens, in general? Would it be fair to ask you again at the end of 2022 and every year thereafter if we’ve entered the ‘for all I know, within 4 years’ regime?
Added: This question fits into a larger concern I have about AI soberskeptics in general (not you, the soberskeptics would not consider you one of their own) where they saunter around saying “X will not occur in the next 5 / 10 / 20 years” and they’re often right for the next couple of years, because there’s only one year where X shows up for any particular definition of that, and most years are not that year; but also they’re saying exactly the same thing up until 2 years before X shows up, if there’s any early warning on X at all. It seems to me that 2 years is about as far as Nope Vision extends in real life, for any case that isn’t completely slam-dunk; when I called upon those gathered AI luminaries to say the least impressive thing that definitely couldn’t be done in 2 years, and they all fell silent, and then a single one of them named Winograd schemas, they were right that Winograd schemas at the stated level didn’t fall within 2 years, but very barely so (they fell the year after). So part of what I’m flailingly asking here is whether you think you have reliable and sensitive Nope Vision that extends out beyond 2 years, in general, such that you can go on saying “Not for 4 years” up until we are actually within 6 years of the thing, and then, you think, your Nope Vision will actually flash an alert and you will change your tune, before you are actually within 4 years of the thing. Or maybe you think you’ve got Nope Vision extending out 6 years? 10 years? Or maybe theorem-proving is just a special case and usually your Nope Vision would be limited to 2 years or 3 years?
This is all an extremely Yudkowskian frame on things, of course, so feel free to reframe.
I think I’ll get less confident as our accomplishments get closer to the IMO grand challenge. Or maybe I’ll get much more confident if we scale up from $1M → $1B and pick the low-hanging fruit without getting fairly close, since at that point further progress gets a lot easier to predict.
There’s not really a constant time horizon for my pessimism; it depends on how long and robust a trend you are extrapolating from. 4 years feels like a relatively short horizon, because theorem-proving has not had much investment, so compute can be scaled up several orders of magnitude and there is likely lots of low-hanging fruit to pick. We also just don’t have much to extrapolate from (compared to more mature technologies, or how I expect AI will be shortly before the end of days), and for similar reasons there aren’t really any benchmarks to extrapolate.
(Also note that it matters a lot whether you know what problems labs will try to take a stab at. For the purpose of all of these forecasts, I am trying insofar as possible to set aside all knowledge about what labs are planning to do, though that’s obviously not incentive-compatible and there’s no particular reason you should trust me to do that.)