The field of AGI research plausibly commenced in 1956 with the Dartmouth conference. What happens if one uses Laplace’s rule? Then it’s a priori pretty implausible that it will happen soon, given that it hasn’t happened yet.
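As a rough sketch of what that prior gives, here is Laplace’s rule written out; the elapsed time (taking the forecast date to be around 2020) is my assumption for concreteness rather than something stated above.

```latex
% Laplace's rule of succession: after n years in which the event has not
% occurred, the probability that it occurs in the next year is
P(\text{event next year} \mid \text{none in } n \text{ years}) = \frac{1}{n+2}
% With n \approx 2020 - 1956 = 64 years since Dartmouth:
\frac{1}{64+2} \approx 1.5\% \text{ per year}
```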
How do information cascades work in this context? How many researchers would I expect to have read and recall a reward gaming list (1, 2, 3, 4)?
Here is A list of good heuristics that the case for AI x-risk fails. I’d expect that these, being pretty good heuristics, will keep steering AGI researchers away from considering x-risks.
Rohin probably doesn’t actually have enough information or enough forecasting firepower to put the probability that it has already happened at 0.1% and be calibrated. He probably does have the expertise, though. I did some experiments a while ago, and “I’d be very surprised if I were wrong” translated for me to about 95%. YMMV.
An argument would go: “The question looks pretty fuzzy to me, with many moving parts. Long tails are good in that case, and other forecasters who have found some small piece of evidence are over-updating.” Some quotes:
There is strong experimental evidence, however, that such self-insight is usually faulty. The expert perceives his or her own judgmental process, including the number of different kinds of information taken into account, as being considerably more complex than is in fact the case. Experts overestimate the importance of factors that have only a minor impact on their judgment and underestimate the extent to which their decisions are based on a few major variables. In short, people’s mental models are simpler than they think, and the analyst is typically unaware not only of which variables should have the greatest influence, but also which variables actually are having the greatest influence. (Source: Psychology of Intelligence Analysis, Chapter 5)
Our judges in this study were eight individuals, carefully selected for their expertise as handicappers. Each judge was presented with a list of 88 variables culled from the past performance charts. He was asked to indicate which five variables out of the 88 he would wish to use when handicapping a race, if all he could have was five variables. He was then asked to indicate which 10, which 20, and which 40 he would use if 10, 20, or 40 were available to him.
We see that accuracy was as good with five variables as it was with 10, 20, or 40. The flat curve is an average over eight subjects and is somewhat misleading. Three of the eight actually showed a decrease in accuracy with more information, two improved, and three stayed about the same. All of the handicappers became more confident in their judgments as information increased. (Source: Behavioral Problems of Adhering to a Decision Policy)
I’m not sure to what extent this is happening with forecasters here: finding a particularly interesting and unique nugget of information and then over-updating. I’m also not sure to what extent I actually believe that this question is fuzzy and so long tails are good.
Here is my first entry to the competition.
Here is my second and last entry to the competition. The main change is that I’ve assigned some probability (5%; I’d personally assign 10%) to it having already happened.
Some notes about that distribution:
Note that this is not my actual distribution; it is my guess as to how Rohin will update.
My guess doesn’t manipulate Rohin’s distribution much; I expect that Rohin will not in fact change his mind a lot.
In fact, this is not exactly my guess as to how Rohin will update. That is, I’m not maximizing expected accuracy, I’m ~maximizing the chance of getting first place (subject to spending little time on this); the sketch below illustrates the difference.
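To illustrate the difference between those two objectives, here is a toy Monte Carlo sketch. It is entirely my own construction: the normal distributions, the number of entrants, and the jitter are made-up stand-ins, and entries are scored by plain log density rather than the competition’s actual rule.

```python
# Toy Monte Carlo: maximizing expected log score vs. maximizing P(first place).
# Illustrative setup: the target is a draw from my honest belief (a standard
# normal), the other entrants submit near-copies of that belief, and entries
# are scored by log density at the realized value.
import numpy as np

rng = np.random.default_rng(0)
n_sims, n_other = 100_000, 10

def normal_logpdf(x, loc=0.0, scale=1.0):
    """Log density of a normal distribution with the given mean and sd."""
    return -0.5 * np.log(2 * np.pi * scale ** 2) - (x - loc) ** 2 / (2 * scale ** 2)

targets = rng.standard_normal(n_sims)                      # draws from my honest belief
other_locs = rng.normal(0.0, 0.1, size=(n_sims, n_other))  # entrants clustered near the consensus
other_scores = normal_logpdf(targets[:, None], loc=other_locs)

for name, scale in [("honest N(0, 1)", 1.0), ("long-tailed N(0, 2)", 2.0)]:
    my_scores = normal_logpdf(targets, scale=scale)
    p_first = (my_scores > other_scores.max(axis=1)).mean()  # strictly best log score
    print(f"{name:>18}: expected log score {my_scores.mean():.3f}, "
          f"P(first place) {p_first:.1%}")
```

With these made-up numbers, the wide entry gets a worse expected log score but wins outright much more often, which is the sense in which the two objectives come apart.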
Some quick comments aimed at forecasters:
I think that the distinction between the forecaster’s beliefs and Rohin’s is being neglected. Some of the snapshots predict huge updates, which really don’t seem likely.
I was initially going to comment “yeah I meant to put 1% on ‘already happened’ but at the time that I made my distribution the option wasn’t there” and then I reread my prior reasoning and saw the 0.1%. Not sure what happened there, I agree that 0.1% is way too confident.
On Laplace’s rule: as with most outside views, it’s tricky to say what your reference class should be. You could go with the Dartmouth conference, but given that we’re talking about the AI safety community influencing the AI community, you could also go with the publication of Superintelligence in 2014 (which feels like the first real attempt to communicate with the AI community), and then you would be way more optimistic. (I might be neglecting lots of failed attempts by SIAI / MIRI, but my impression is that they didn’t try to engage the academic AI community very much.)
I don’t buy the point about there being good heuristics against x-risk: the premise of my reasoning was that we get warning shots, which would negate many (though not all) of the heuristics.
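On the Laplace point above: plugging both candidate start dates into the same rule of succession (again taking ~2020 as the forecast date, an assumption for concreteness) shows how much the reference class matters.

```latex
% Same rule of succession, two reference classes:
\text{Dartmouth, 1956: } \tfrac{1}{64+2} \approx 1.5\%\ \text{per year}
\qquad
\text{Superintelligence, 2014: } \tfrac{1}{6+2} \approx 12.5\%\ \text{per year}
```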
+1 for long tails