Perhaps this explains my position better:
If I saw a Yudkowsky tweet saying “I have a great forecasting track record” or “I have a better forecasting track record than Metaculus” my immediate reaction would be “Lol no you don’t fuck off.” When I read the first few lines of your post, I expected to shortly see a pic of such a tweet as proof. In anticipation my “lol fuck you Yudkowsky” reaction already began to rise within me.
But then when I saw the stuff you actually quoted, it seemed… much more reasonable? In particular, him dumping on Metaculus for updating so hard on Gato seemed… correct? Metaculus really should have updated earlier; Gato just put together components that were already published in the last few years. So then I felt that if I had only skimmed the first part of your post and not read the actual post, I would have had an unfairly negative opinion of Yudkowsky, due to the language you used: “He has several times claimed to have a great forecasting track record.”
For what it’s worth, I agree that Yudkowsky is pretty rude and obnoxious & that he should probably get off Twitter if this is how he’s going to behave. Like, yes, he has alpha about this AI stuff; he gets to watch as the “market” gradually corrects and converges to his position. Yay. Good for him. But he’s basically just stroking his own ego by tweeting about it here; I don’t see any altruistic purpose served by it.
I am a forecaster on that question: the main doubt I had was if/when someone would try to do wordy things + game playing on a “single system”. It seemed plausible to me that this particular combination of capabilities would never become an exciting area of research, so the date at which an AI can first do these things would then be substantially after this combination of tasks would be achievable with focused effort. Gato was a substantial update because it does exactly these tasks, so I no longer see much possibility that the benchmark is achieved only after the capabilities are substantially overshot.
I also tend to defer somewhat to the community.
I was at 2034 when the community was at 2042, and I updated further to 2026 on the Gato news.
That’s good feedback. I can see why the wording I used gives the wrong impression—he didn’t literally say out loud that he has “a great forecasting track record”. It still seems to me heavily implied by several things he’s said, especially what he said to Paul.
I think the point you raise is valid enough. I have crossed out the word “claimed” in the essay, and replaced it with “implied”.
OK, thanks!