Reading Dan Wang’s belated letter, where he describes Shanghai and the abrupt collapse of Zero Covid, reminds me of one aspect that is interesting for us: base rates, Outside View reasoning, rocks which say ‘everything is fine’, and the difference between probabilities & decisions:
for a long time, it was obvious that Zero Covid was not working, especially once increasingly infectious variants meant that there was community spread & all the measures had failed to make r<<1, instead leaving it hovering at r=1, and they were burning through patience, time, and money while denying all possibility of failure and having no exit strategy; but it was equally obvious that the people saying so were wrong, and Zero Covid was working, because after all, very few people were getting sick or dying of Covid. Every day, the pro-CCP people would say, ‘who are you going to believe, your lying eyes or prophets whose forecasts have never yet come true?’, and they would have a good point, as catastrophe failed to manifest for another day. And this would go on for days upon months upon years: anyone who said Zero Covid had to collapse relatively soon (keeping in mind ‘there is a great deal of ruin in a nation’, and that things like bankruptcies or titanic frauds like Wirecard can always go on a lot longer than you think) would experience being wrong anywhere up to a good 700 times in a row, with zero visible ‘progress’ to show for it. That is, ‘progress’ looked like COVID cases going above zero, climbing log graphs, and community spread: but this is not what failure ‘looks like’ to regular people—to them, that just looks like more Zero Covid success.
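To make the r=1 point concrete, here is a minimal toy sketch (Python, with made-up numbers purely for illustration) of why measures that only hold the effective multiplier near 1 never drive cases back to zero, while anything even slightly above 1 compounds into exactly those climbing log graphs:

```python
# Illustrative only: toy geometric-growth model of daily cases under
# different reproduction-style multipliers (all numbers made up).
def project_cases(daily_multiplier, initial_cases=100, days=180):
    """Project daily case counts, assuming each day's cases are the
    previous day's multiplied by a constant factor."""
    cases = [initial_cases]
    for _ in range(days - 1):
        cases.append(cases[-1] * daily_multiplier)
    return cases

suppressed = project_cases(0.90)  # well below 1: cases decay toward zero
hovering   = project_cases(1.00)  # ~1: cases never go away, they just persist
escaped    = project_cases(1.05)  # slightly above 1: exponential blow-up

for name, series in [("r<<1", suppressed), ("r=1", hovering), ("r>1", escaped)]:
    print(f"{name}: day 0 = {series[0]:.0f}, day 179 = {series[-1]:.0f}")
```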
But then December 2022 happened. And boom, almost literally overnight, Zero Covid was dropped and they let ’er rip in just about the most destructive and harmful fashion one could possibly have dropped Zero Covid. And then the pro-CCP forecasts lost all their forecasting points and the anti-ZC forecasts won bigly. But even the anti-ZC forecasts were still wrong in important ways—as cynical as I was, and as sure as I was that Zero Covid was going to fail at some point, I definitely didn’t take seriously ‘what if they just decide to stop Zero Covid overnight in the worst way possible, and there is no prolonged multi-month stepdown or long interval where they do all the sane, sensible preparations that they have had such ample leisure to engage in?’ I wouldn’t’ve set up a prediction market on that, or if I had, I would have framed it in a modest, sensible way which would’ve rendered it pointless.
Obviously, ‘foom’ or ‘hard takeoff’ is in a similar position with respect to Hanson- or Christiano-style positions. It will be ‘wrong’ (and you see a lot of this going around, especially on Twitter, where there are many vested interests in deprecating AI risk), until it is right. The signs that foom is correct are easy to write off. For example, Hanson’s views on deep learning have been badly wrong many times: I tried to explain to him about transfer learning starting to work back in 2015 or so (a phenomenon I regarded as extremely important and which has in fact become so dominant in DL we take it utterly for granted) and he denied it with the usual Hansonian rebuttals; or when he denied that DL could scale at all, he mostly just ignored the early scaling work I linked him to, like Hestness et al 2017. Or consider Transformers: a lynchpin of his position was that algorithms have to be heavily tailored to every domain and problem they are applied to, as they were in ML at that time—an AlphaGo in DRL had nothing to do with a tree-search chess engine, much less with things like Markov random fields in NLP; this was just a fact of nature, and Yudkowsky’s fanciful vaporing about ‘general algorithms’ or ‘general intelligence’ was so much wishful thinking. Then Transformers+scaling hit and here we are… Or how about his fight to the bitter end to pretend that ‘ems’ are ever going to be a thing and might beat DL to making AGI, despite near-zero progress (and the only visible progress being both due to DL and something that would eventually benefit DL far more)? He was right for a long time, until he was wrong. None of this has provoked any substantial rethinking of his views, that I can tell: his latest post just doubles down on everything as if it were still 2013. Or take the much more odious cases like Marcus, where they produce an example of a problem that DL will never be able to solve—explicitly saying that, not merely insinuating it—and then, when DL solves it in a year or two, simply never refer to it again and come up with a new problem DL will never be able to solve; one would like to ask why the new problem is any more relevant than the old problem was, and why they ever put forward the old problem if they are completely untroubled by its solution, but that would ascribe more intellectual honesty to their exercise in FUD than it ever possessed.
But on the other hand, I don’t know what sensible forecasts I could’ve made about scaling that would be meaningful evidence right now and convince people, as opposed to being subtly wrong or too vague or morally right but marked wrong. When I got really excited reading Hestness, or Hinton on dark knowledge in JFT-300M, should I have registered a forecast: “in 2022, I prophesy that we will be training DL models with billions of parameters on billions of images!”? That would have been realistic just by extrapolating the line on the famous ‘AI and compute’ graph, and it would have turned out to be very true, and yet—it’s not ‘cruxy’. No one cares if I did or did not make that prediction. Everyone who thought that was dumb back in 2017 because “DL doesn’t ‘just scale™’, we need many paradigm breakthroughs taking decades to reach these levels of performance” has gradually updated, has indeed forgotten their skepticism entirely or merely laughs about it gently, and has devised new fallback positions about why AGI is safely far away—pointing out the correct prediction to them won’t change their minds about anything. And this seems to be true of everything else that I could have predicted in 2017.
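To illustrate the sort of extrapolation I mean (this is not my actual 2017 reasoning; the starting size and doubling time below are rough, assumed numbers), ‘drawing the line’ on a log-scale trend is a one-liner:

```python
# Rough illustration (assumed numbers): log-linear trend extrapolation of the
# kind you could have read off the 'AI and compute' graph circa 2017.
def extrapolate(start_value, doubling_time_years, years_ahead):
    """Project a quantity growing exponentially with a fixed doubling time."""
    return start_value * 2 ** (years_ahead / doubling_time_years)

# Suppose (hypothetically) you pencil in ~100M parameters for a big 2017 model
# and assume the relevant quantity doubles roughly every 6 months:
params_2022 = extrapolate(start_value=1e8, doubling_time_years=0.5, years_ahead=5)
print(f"Projected 2022 model size: ~{params_2022:.1e} parameters")
# -> ~1.0e+11, i.e. on the order of a hundred billion parameters.
```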
We can’t solve this with the standard forecasting and prediction markets. They are too thinly traded, the base probabilities too low and too far into systematic-bias territory, the contracts too long-term and too likely to be misworded; many key events could easily never happen at all, and the whole exercise is too reliant on ‘catching a falling knife’ and too exposed to ‘the market can remain irrational longer than you can remain solvent’/‘there is a great deal of ruin in a nation’ problems. (Marcus would still be wrong even if no one had ever gotten around to retrying his GPT-2 examples on GPT-3, and Zero Covid would still be doomed even if it had managed to last a few % longer and was only going to collapse tomorrow rather than 3 months ago.)
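As a toy illustration of the ‘too long-term, base rates too low’ problem (all numbers made up): even a trader who is confident a long-dated, low-probability contract is mispriced has little incentive to correct it once you account for the opportunity cost of locking up capital until resolution:

```python
# Toy illustration (made-up numbers): why long-dated, low-probability contracts
# attract little correcting capital even when they are mispriced.
def annualized_return(buy_price, true_probability, years, payout=1.0):
    """Annualized expected return from buying a binary contract and holding to resolution."""
    expected_value = true_probability * payout
    return (expected_value / buy_price) ** (1 / years) - 1

# Contract priced at 12 cents; you believe the true probability is 15%; resolves in 5 years.
edge = annualized_return(buy_price=0.12, true_probability=0.15, years=5)
benchmark = 0.07  # assumed ordinary return available elsewhere (e.g. equities)
print(f"Annualized expected return from the 'mispriced' contract: {edge:.1%}")
print(f"Versus an assumed {benchmark:.0%} elsewhere, before fees, spreads, and wording risk.")
```

Even a 25% relative mispricing works out to under 5% a year here, which is why such contracts sit at uninformative prices.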
This is why I spend more time thinking about capabilities and how they work than about the many repeated demands for dates or participation in PMs. We have a decent handle on how to grade answers to questions as good or bad; but what we really need right now are not good answers, but good questions.
“I tried to explain to him about transfer learning starting to work back in 2015 or so (a phenomenon I regarded as extremely important and which has in fact become so dominant in DL we take it utterly for granted) and he denied it with the usual Hansonian rebuttals; or when he denied that DL could scale at all, he mostly just ignored the early scaling work I linked him to, like Hestness et al 2017. Or consider Transformers: a lynchpin of his position was that algorithms have to be heavily tailored to every domain and problem they are applied to, as they were in ML at that time—an AlphaGo in DRL had nothing to do with a tree-search chess engine, much less with things like Markov random fields in NLP; this was just a fact of nature, and Yudkowsky’s fanciful vaporing about ‘general algorithms’ or ‘general intelligence’ was so much wishful thinking. Then Transformers+scaling hit and here we are…”
I can’t find where you’ve had this exchange with him—can you find it?
If his embarrassing mistakes (and refusal to own up to them) are documented and demonstrable, why not just post them on his blog and Twitter?