IMO that’s a much more defensible position, and is what the discussion should have initially focused on. From my perspective, the way the debate largely went is:
Jotto: Eliezer claims to have a relatively successful forecasting track record, along with Dario and Demis; but this is clearly dissembling, because a forecasting track record needs to look like a long series of Metaculus predictions.
Other people: (repeat without qualification the claim that Eliezer is falsely claiming to have a “forecasting track record”; simultaneously claim that Eliezer has a subpar “forecasting track record”, based on evidence that wouldn’t meet Jotto’s stated bar)
Jotto: (signal-boosts the inconsistent claims other people are making, without noting that this is equivocating between two senses of “track record” and therefore selectively applying two different standards)
Rob B: (gripes and complains)
Whereas the way the debate should have gone is:
Jotto: I personally disagree with Eliezer that the AI Foom debate is easy to understand and cash out into rough predictions about how the field has progressed since 2009, or how it is likely to progress in the future. Also, I wish that all of Eliezer, Robin, Demis, Dario, and Paul had made way more Metaculus-style forecasts back in 2010, so it would be easier to compare their prediction performance. I find it frustrating that nobody did this, and think we should start doing this way more now. Also, I think this sharper comparison would probably have shown that Eliezer is significantly worse at thinking about this topic than Paul, and maybe than Robin, Demis, and Dario.
Rob B: I disagree with your last sentence, and I disagree quantitatively that stuff like the Foom debate is as hard-to-interpret as you suggest. But I otherwise agree with you, and think it would have been useful if the circa-2010 discussions had included more explicit probability distributions, scenario breakdowns, quantitative estimates, etc. (suitably flagged as unstable, spitballed ass-numbers). Even where these aren’t cruxy and don’t provide clear evidence about people’s quality of reasoning about AGI, it’s still just helpful to have a more precise sense of what people’s actual beliefs at the time were. “X is unlikely” is way less useful than knowing whether it’s more like 30%, or more like 5%, or more like 0.1%, etc.
I think the whole ‘X isn’t a real track record’ thing was confusing, and made your argument sound more forceful than it should have.
Plus maybe some disagreements about how possible it is in general to form good models of people and of topics like AGI in the absence of Metaculus-ish forecasts, and disagreement about exactly how informative it would be to have a hundred examples of narrow-AI benchmark predictions over the last ten years from all the influential EAs?
(I think it would be useful, but more like ‘1% to 10% of the overall evidence for weighing people’s reasoning and correctness about AGI’, not ‘90% to 100% of the evidence’.)
(An exception would be if, e.g., it turned out that ML progress is way more predictable than Eliezer or I believe. ML’s predictability is a genuine crux for us, so seeing someone else do amazing at this prediction task for a bunch of years, with foresight rather than hindsight, would genuinely update us a bunch. But we don’t expect to learn much from Eliezer or Rob trying to predict stuff, because while someone else may have secret insight that lets them predict the future of narrow-AI advances very precisely, we are pretty sure we don’t know how to do that.)
Part of what I object to is that you’re a Metaculus radical, whose Twitter bio says “Replace opinions with forecasts.”
This is a view almost no one in the field currently agrees with or tries to live up to.
Which is fine, on its own. I like radicals, and want to hear their views argued for and hashed out in conversation.
But then you selectively accuse Eliezer of lying about having a “track record”, without noting how many other people are also expressing non-forecast “opinions” (and updating on these), and while using language in ways that make it sound like Eliezer is doing something more unusual than he is, and making it sound like your critique is more independent of your nonstandard views on track records and “opinions” than it actually is.
That’s the part that bugs me. If you have an extreme proposal for changing EA’s norms, argue for that proposal. Don’t just selectively take potshots at views or people you dislike more, while going easy on everyone else.
I think Jotto has argued for the proposal in the past. Whether he did it in that particular comment is not very important, so long as he holds everyone to the same standards.
As for his standards: I think he sees Eliezer as an easy target because he’s high status in this community and has explicitly said that he thinks his track record is good (in fact, better than other people’s). On its own, therefore, it’s not surprising that Eliezer would get singled out.
I no longer see exchanges with you as a good use of energy, unless you’re able to describe some of the strawmanning of me you’ve done and come clean about that.
EDIT: Since this is being downvoted, here is a comment chain where Rob Bensinger interpreted me in ways that are bizarre, such as suggesting that I think Eliezer is saying he has “a crystal ball”, or that “if you record any prediction anywhere other than Metaculus (that doesn’t have similarly good tools for representing probability distributions), you’re a con artist”. Things that sound thematically similar to what I was saying, but were weird, persistent extremes that I don’t see as good-faith readings of me. It kept happening over Twitter, then again on LW. At no point have I felt he’s trying to understand what I actually think. So I don’t see the point of continuing with him.