If that were the case, I actually would fault Eliezer, at least a little. He’s frequently, though by no means always, stuck to qualitative and hard-to-pin-down punditry like we see here, rather than to unambiguous forecasting.
This allows him, or his defenders, to retroactively defend his predictions as somehow correct even when they seem wrong in hindsight.
Let’s imagine for a moment that Eliezer’s right that AI safety is a cosmically important issue, and yet that he’s quite mistaken about all the technical details of how AGI will arise and how to effectively make it safe. It would be important to know whether we can trust his judgment and leadership.
Unless we can check his track record, either by taking the most obvious interpretation of his qualitative judgments or by scoring an unambiguous forecast, it’s hard to evaluate his performance as an AI safety leader. Combine that with a culture of deference to perceived expertise and status and the problem gets worse.
So I prioritize the avoidance of special pleading in this case: I think Eliezer comes across as clearly wrong in substance in this specific post, and that it’s important not to reach for ways “he was actually right from a certain point of view” when evaluating his predictive accuracy.
Similarly, I wouldn’t judge as correct the early COVID-19 pronouncements that masks don’t work to stop the spread just because cloth masks are poor-to-ineffective and many people refuse to wear masks properly. There’s a way we can stretch the interpretation to make them seem sort of right, but we shouldn’t. When public health messaging delivers qualitative judgments of efficacy rather than cut-and-dried, unambiguous quantitative forecasts, we should expect it to be clearly right in substance.
None of that bears on how easy or hard it was to build GPT-4. It only bears on how we should evaluate Eliezer as a forecaster/pundit/AI safety leader.
I think several things here, considering the broader thread:
You’ve done a great job in communicating several reactions I also had:
There are signs of serious mispredictions and mistakes in some of the 2008 posts.
There are ways to read these posts as not that bad in hindsight, but we should be careful not to give too much benefit of the doubt.
Overall these observations constitute important evidence on EY’s alignment intuitions and ability to make qualitative AI predictions.
I did a bad job of marking my interpretations of what Eliezer wrote as interpretations, rather than as claims that he actually did dismiss ANNs. Hopefully my edits have fixed my mistakes.