First, I commend the effort you’re putting into responding to me, and I probably can’t reciprocate as much.
But here is a major point I suspect you are misunderstanding:
It seems like you’re interpreting EY as claiming ‘I have a crystal ball that gives me unique power to precisely time AGI’, whereas I interpret EY as saying that one particular Metaculus estimate is wrong.
That interpretation isn’t necessary for my argument, and at no point have I thought he’s saying he can “precisely time AGI”.
If he thought it was going to happen earlier than the community, it would be easy to show an example distribution of his, without high precision (or much effort). Literally just add a distribution into the box on the question page, click and drag the sliders to somewhere that seems reasonable to him, and submit it. He could then screenshot it. Even just copy-pasting the confidence interval figures would work.
Note that this doesn’t mean making the date range very narrow (confident); that’s unrelated. He can still be quite uncertain about specific times. Here’s an example of me somewhat disagreeing with the community. Of course the community has since updated to earlier, but he can still do these things, and should. It doesn’t even need to be screenshotted, really; just posting it in the Metaculus thread works.
And further, this point you make here:
Eliezer hasn’t said he thinks he can do better than Metaculus on arbitrary questions. He’s just said he thinks Metaculus is wrong on one specific question.
My argument doesn’t need him to necessarily be better at “arbitrary questions”. If Eliezer believes Metaculus is wrong on one specific question, he can trivially show a better answer. If he does this on a few questions and it gets properly scored, that’s a track record.
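For concreteness, here’s a minimal sketch of what “properly scored” could look like on a handful of binary questions, using the log scoring rule; the probabilities and outcomes are invented purely for illustration, not real Metaculus data:

```python
import math

# Hypothetical resolved binary questions: each tuple is
# (forecaster's probability, community's probability, outcome as 0/1).
# All numbers are invented for illustration.
resolved = [
    (0.80, 0.60, 1),
    (0.30, 0.45, 0),
    (0.70, 0.70, 1),
    (0.10, 0.25, 0),
]

def log_score(p, outcome):
    """Log score of a binary forecast; closer to 0 is better."""
    return math.log(p) if outcome == 1 else math.log(1 - p)

forecaster_total = sum(log_score(p, o) for p, _, o in resolved)
community_total = sum(log_score(c, o) for _, c, o in resolved)

print(f"Forecaster total log score: {forecaster_total:.3f}")
print(f"Community total log score:  {community_total:.3f}")
```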
You mentioned other things, such as how much it would transfer to broader, longer-term questions. That isn’t known and I can’t stay up late typing about this, but at the very minimum people can demonstrate they are calibrated, even if you believe there is zero knowledge transfer from narrower/shorter questions to broader/longer ones.
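And checking calibration by itself needs nothing fancy; here’s a toy sketch of the usual bucket-and-compare approach, again with made-up forecasts:

```python
from collections import defaultdict

# Invented (stated probability, outcome) pairs from resolved questions,
# purely to illustrate the mechanics of a calibration check.
forecasts = [(0.9, 1), (0.8, 1), (0.8, 0), (0.7, 1), (0.6, 1),
             (0.4, 0), (0.3, 0), (0.2, 1), (0.2, 0), (0.1, 0)]

# Bucket forecasts by stated probability and compare the average stated
# probability with the observed frequency in each bucket.
buckets = defaultdict(list)
for p, outcome in forecasts:
    buckets[round(p, 1)].append(outcome)

for p in sorted(buckets):
    outcomes = buckets[p]
    freq = sum(outcomes) / len(outcomes)
    print(f"said {p:.0%}: happened {freq:.0%} of the time (n={len(outcomes)})")
```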
Going to have to stop it there for today, but I would end this comment with a feeling: it feels like I’m mostly debating people who think they can predict when Tetlock’s findings don’t apply, and so reliably that it’s unnecessary to forecast properly or transparently, and it seems like they don’t understand.
Note that this doesn’t mean making the date range very narrow (confident); that’s unrelated.
Fair enough, but I was responding to a pair of tweets where you said:
Eliezer says that nobody knows much about AI timelines. But then keeps saying “I knew [development] would happen sooner than you guys thought”. Every time he does that, he’s conning people.
I know I’m using strong wording. But I’d say the same in any other domain.
He should create a public Metaculus profile. Place a bunch of forecasts.
If he beats the community by the landslide he claims, then I concede.
If he’s mediocre, then he was conning people.
‘It would be convenient if Eliezer would record his prediction on Metaculus, so we know with more precision how strong of an update to make when he publicly says “my median is well before 2050” and Metaculus later updates toward a nearer-term median’ is a totally fair request, but it doesn’t bear much resemblance to ‘if you record any prediction anywhere other than Metaculus (that doesn’t have similarly good tools for representing probability distributions), you’re a con artist’. Seems way too extreme.
Likewise, ‘prove that you’re better than Metaculus on a ton of forecasts or you’re a con artist’ seems like a wild response to ‘Metaculus was slower than me to update about a specific quantity in a single question’. So I’m trying to connect the dots, and I end up generating hypotheses like:
Maybe Jotto is annoyed that Eliezer is confident about hard takeoff, not just that he has a nearer timelines median than Metaculus. And maybe Jotto specifically thinks that there’s no way you can rationally be confident about hard takeoff unless you think you’re better than Metaculus at timing tons of random narrow AI things.
So then it follows that if you’re avoiding testing your mettle vs. Metaculus on a bunch of random narrow AI predictions, then you must not have any rational grounds for confidence in hard takeoff. And moreover this chain of reasoning is obvious, so Eliezer knows he has no grounds for confidence and is deliberately tricking us.
Or:
Maybe Jotto hears Eliezer criticize Paul for endorsing soft takeoff, and hears Eliezer criticize Metaculus for endorsing Ajeya-ish timelines, and Jotto concludes ‘ah, Eliezer must think he’s amazing at predicting AI-ish events in general; this should be easy to test, so since he’s avoiding publicly testing it, he must be trying to trick us’.
In principle you could have an Eliezer-model like that, where Eliezer has lots of nonstandard beliefs about random AI topics that make him way too confident about things like hard takeoff and yet his distributions tend to be wide; but that seems like a pretty weird combination of views to me, so I assumed that you’d also think Eliezer has relatively narrow distributions about everything.
it feels like I’m mostly debating people who think they can predict when Tetlock’s findings don’t apply, and so reliably that it’s unnecessary to forecast properly or transparently, and it seems like they don’t understand.
Have you read Inadequate Equilibria, or R:AZ? (Or my distinction between ‘rationality as prosthesis’ and ‘rationality as strength training’?)
I think there’s a good amount of overlap between MIRI- and CFAR-ish views of rationality and Tetlock-ish views, but I also don’t think of Tetlock’s tips as the be-all and end-all of learning things about the world, of doing science, etc., and I don’t see his findings as showing that we should give up on inside-view model-building, not-fully-explicit-and-quantified reasoning under uncertainty, or any of the suggestions in When (Not) To Use Probabilities.
(Nor do I think Tetlock would endorse the ‘no future-related knowledge generation except via Metaculus or prediction markets’ policy you seem to be proposing. Maybe if we surveyed them we’d find out that Tetlock thinks Metaculus is 25% cooler than Eliezer does, or something? It’s not obvious to me that it matters.)
Also, I think you said on Twitter that Eliezer’s a liar unless he generates some AI prediction that lets us easily falsify his views in the near future? Which seems to require that he have very narrow confidence intervals about very near-term events in AI.
So I continue to not understand what it is about the claims ‘the median on my AGI timeline is well before 2050’, ‘Metaculus updated away from 2050 after I publicly predicted it was well before 2050’, or ‘hard takeoff is true with very high probability’, that makes you think someone must have very narrow contra-mainstream distributions on near-term narrow-AI events or else they’re lying.
Some more misunderstanding:
‘if you record any prediction anywhere other than Metaculus (that doesn’t have similarly good tools for representing probability distributions), you’re a con artist’. Seems way too extreme.
No, I don’t mean that whether it’s recorded on Metaculus is what determines whether someone is a con artist. Metaculus is mentioned in that tweet because, if you’re going to bother doing it, you might as well go all the way and show a distribution.
But even if he just posted a confidence interval, on some site other than Metaculus, that would be a huge upgrade, because then anyone could add it to a spreadsheet of scorable forecasts and reconstruct it without too much effort.
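To illustrate how little effort that reconstruction takes, here’s a sketch that turns a bare confidence interval into a scorable distribution; the interval and the normal-fit assumption are mine for illustration, not anything he has actually posted:

```python
from statistics import NormalDist

# Suppose someone posts only "80% confidence interval: 2029 to 2055" for an
# AGI-arrival year (numbers made up for illustration).
low, high = 2029, 2055

# One simple way to make that scorable: fit a normal distribution whose
# 10th and 90th percentiles match the stated interval. (A normal is just a
# convenient assumption; any explicit distribution would do.)
z = NormalDist().inv_cdf(0.90)
mu = (low + high) / 2
sigma = (high - low) / (2 * z)
fitted = NormalDist(mu, sigma)

print(f"Implied median: {mu:.0f}, implied sigma: {sigma:.1f} years")
print(f"Implied P(before 2040): {fitted.cdf(2040):.2f}")
```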
‘if you record any prediction anywhere other than Metaculus (that doesn’t have similarly good tools for representing probability distributions), you’re a con artist’. Seems way too extreme.
No, that’s not what I’m saying. The main thing is that the forecasts be scorable. But if someone is going to do it at all, then doing it on Metaculus just makes more sense: the administrative work is already taken care of, and there’s no risk of cherry-picking or omission.
Also, from another reply you gave:
Also, I think you said on Twitter that Eliezer’s a liar unless he generates some AI prediction that lets us easily falsify his views in the near future? Which seems to require that he have very narrow confidence intervals about very near-term events in AI.
I never used the term “liar”. The thing he’s doing that I think is bad is more like what a pundit does, like the guy who calls recessions: a sort of epistemic conning. “Lying” is different, at least to me.
More importantly, no, he doesn’t necessarily need to have really narrow distributions, and I don’t know why you think this. His distribution would only be “narrower” if it were squashed up close against the “Now” side of the chart. And if that is what Eliezer thinks, if he himself is saying it’s earlier than a given date, then on a graph it just looks a bit narrower and shifted to the left, and it simply reflects what he believes.
There’s nothing about how we score forecasters that requires him to have “very narrow” confidence intervals about very near-term events in AI in order to measure alpha. To help me understand, can you describe why you think this? Why don’t you think alpha would start being measurable with confidence intervals that are merely slightly narrower than the community’s and centered closer to the actual outcome?
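As a toy illustration of that last question (all numbers invented): a distribution that is only slightly narrower than the community’s but better centered already earns a visibly better log score.

```python
from math import log
from statistics import NormalDist

# Toy example: suppose the question resolves in 2032.
outcome_year = 2032

# A community distribution, and a forecaster whose distribution is only
# slightly narrower and shifted somewhat toward the eventual outcome.
community = NormalDist(mu=2050, sigma=12)
forecaster = NormalDist(mu=2042, sigma=10)

# Continuous log score: the log density each forecast assigned to the
# realized value. Neither distribution is "very narrow", yet the
# better-centered one scores higher, so the edge is measurable.
print(f"Community log score:  {log(community.pdf(outcome_year)):.3f}")
print(f"Forecaster log score: {log(forecaster.pdf(outcome_year)):.3f}")  # higher is better
```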
EDIT a week later: I have decided that several of your misunderstandings should be considered strawmanning, and I’ve switched from upvoting some of your comments here to downvoting them.