So, I notice that still doesn’t answer the actual question of what my probability should be. To make things simple, let’s assume that, if the sun exploded, I would die instantly. In practice it would take at least eight minutes, but as a simplifying assumption, let’s assume it’s instantaneous.
In the absence of relevant evidence, it seems to me like Laplace’s Law of Succession would say the odds of the sun exploding in the next hour are 1⁄2. But I could also make that argument to say the odds of the sun exploding in the next year are also 1⁄2, which is nonsensical. So... what’s my actual probability, here, if I know nothing about how the sun works except that it has not yet exploded, that the sun is very old (which shouldn’t matter, if I understand you correctly), and that if it exploded, we would all die?
In practice it would have to take at least eight minutes
We don’t need to consider that here because any evidence of the explosion would also take at least eight minutes to arrive, so there is essentially no interval during which you could observe the evidence of the explosion before being converted into a plasma that has no ability to update on anything. That is when observational selection effects are at their strongest: namely, when you are vanishingly unlikely to find yourself in one of those intervals between your having observed an event and that event’s destroying your ability to maintain any kind of mental model of reality.
We 21st-century types have so much causal information about reality that I have been unable, during this reply, to imagine any circumstance in which I would resort to Laplace’s law of succession to estimate a probability in anger while observational selection effects also need to be considered. It’s not that I doubt the validity of the law; it’s just that I have been unable to imagine a situation in which the causal information I have about an “event” does not trump the statistical information I have about how many times the event has been observed to occur in the past, and in which I also have enough causal information to entertain real doubts about my ability to survive if the event goes the wrong way while remaining confident in my survival if the event goes the right way.
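To make the bookkeeping problem in the hour-versus-year question above concrete, here is a minimal sketch (Python, with made-up trial counts): Laplace’s rule only gives a probability per trial, so what it says depends entirely on what you decide counts as a trial and on how many past trials you allow yourself to count, and the past-trial count is exactly the thing that observational selection effects poison.

```python
def laplace_rule_of_succession(successes: int, trials: int) -> float:
    """Laplace's rule: P(success on the next trial) = (successes + 1) / (trials + 2)."""
    return (successes + 1) / (trials + 2)

# With zero past trials (the "no relevant evidence" reading), the rule gives 1/2
# no matter whether a "trial" is an hour or a year -- same number, different claim:
print(laplace_rule_of_succession(0, 0))   # 0.5 for "explodes in the next hour"
print(laplace_rule_of_succession(0, 0))   # 0.5 for "explodes in the next year"

# If instead you counted every hour of the sun's ~4.6-billion-year history as a
# trial in which it did not explode (a hypothetical count), the per-hour number collapses:
hours = int(4.6e9 * 365.25 * 24)
print(laplace_rule_of_succession(0, hours))   # roughly 2.5e-14
# ...but that trial count is exactly what the selection effect forbids you from
# taking at face value: you could only ever have been around to count the
# non-exploding hours.
```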
Certainly we can imagine ourselves in the situation of the physicists of the 1800s, who had no solid guess as to the energy source keeping the sun shining steadily. But even they had the analogy with fire. (The emission spectra of the sun and of fire are both, I believe, well approximated as blackbody radiation, and the 1800s had prisms and consequently at least primitive spectrographs.) A fire doesn’t explode unless you suddenly give it fuel—and not just any fuel will do: adding logs to a fire will not cause an explosion, but adding enough gasoline will. “Where would the fuel come from that would cause the sun to explode?” the physicists of the 1800s could ask. Planets are made mostly of rock, which doesn’t burn, and comets aren’t big enough. Merely what I have written in this short paragraph would be enough, IMO, to trump statistical considerations of how many days the sun has gone without exploding.
If I found myself in a Star Trek episode in which, every night during sleep, I was transported into some bizarre realm of “almost-pure sensation” where none of my knowledge of reality seems to apply and where a sun-like thing rises and sets, then yes, I can imagine using the law of succession. But then, for observational selection effects to enter the calculation, I’d have to have enough causal information about this sun-like thing (and about my relationship to the bizarre realm) to doubt my ability to survive if it sets and never rises again, and that seems to contradict the assumption that none of my knowledge of reality applies to the bizarre realm.
My probability of the sun’s continuing to set and rise without exploding is determined exclusively by (causal) knowledge created by physicists and passed down to me in books, etc.; how many times the sun has risen so far is, in comparison, of negligible importance. This knowledge is solid and “settled” enough that it is extremely unlikely that any sane physicist would announce that, well, actually, the sun is going to explode—probably within our lifetimes! But if a sane physicist did make such an announcement, I would focus on the physicist’s argument (causal knowledge) and pay almost no attention to the statistical information of how long there have been reliable observations of the sun’s not exploding—and this is true even if I were sure I could survive the sun’s exploding—because the causal model is so solid (and the facts the model depends on, e.g., the absorption spectra of hydrogen and helium, are so easily checked). Consequently, the explosion of the sun is not a good example of a situation in which observational selection effects become important.
By the way, observational selection effects are hairy enough that I basically cannot calculate anything about them. Suppose for example that, if Russia attacked the US with nukes, I would survive with p = .4 (which seems about right). (I live in the US.) Suppose further that my causal model of Russian politics would put my probability that Russia will attack the US with nukes some time in the next 365 days at .003 if Russia had deployed nukes for the first time today (i.e., if Russia didn’t have any nukes till right now). How should I adjust that probability (i.e., the .003) to take into account the fact that Russia’s nukes were in fact deployed starting in 1953 (year?) and so far Russia has never attacked the US with nukes? I don’t know! (And I have practical reasons for wanting to do this particular calculation, so I’ve thought a lot about it over the years. I do know that my probability should be greater than it would be if I and my ability to reason were impervious to nuclear attack. In contrast to the solar-explosion situation, here is a situation in which the causal knowledge is uncertain enough that it would be genuinely useful to employ the statistical knowledge we have; it is just that I don’t know how to employ it in a calculation.) But things that are almost certain to end my life are much easier to reason about—when it comes to observational selection effects—than something I have a .4 chance of surviving.
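For what it’s worth, here is one candidate formalization of that calculation, which I do not claim is the right one: the anthropic treatment (condition on still being an observer at all) and every number in it are assumptions, placeholders for the real uncertainty. The only point of the sketch is that the selection-aware update is weaker than the naive one, so the resulting attack probability comes out higher, which matches the qualitative claim above.

```python
import numpy as np

# Toy model: each year carries an unknown attack probability q; given an attack,
# I survive (with my ability to reason intact) with probability s = 0.4.
# My evidence is "I am alive and have observed no attack over N years."
s = 0.4          # P(I survive | attack) -- the author's rough number
N = 70           # approximate years of deployment with no observed attack
q = np.linspace(1e-5, 0.2, 2000)   # grid of candidate per-year attack rates
prior = np.ones_like(q)            # flat prior over the grid, for illustration only

no_attack = (1 - q) ** N           # P(no attack in N years | q)

# Naive update: treat "no attack observed" as ordinary evidence.
post_naive = prior * no_attack
post_naive /= post_naive.sum()

# Selection-aware update: condition on still being an observer at all.
# P(alive | q) = P(no attack) + P(at least one attack) * s
alive = no_attack + (1 - no_attack) * s
post_sel = prior * (no_attack / alive)
post_sel /= post_sel.sum()

# Posterior expectation of the per-year attack probability under each treatment:
print("naive:          ", float(np.sum(post_naive * q)))
print("selection-aware:", float(np.sum(post_sel * q)))
# The selection-aware number is the larger of the two; how much larger depends
# entirely on the assumptions above.
```

Note the two limiting cases built into this setup: if s = 1 (I am impervious), the two updates coincide, and if s = 0 (an attack destroys me for certain), the selection-aware likelihood is flat and the past tells me nothing at all.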
In particular, most of the expected negative utility from AGI research stems from scenarios in which, without warning—more precisely, without anything that the average person would recognize as a warning—an AGI kills every one of us. The observational selection effects around such a happening are easier to reason about than those around a nuclear attack: specifically, the fact that the predicted event hasn’t happened yet is not evidence at all that it will not happen in the future. If a powerful magician kills everyone who tries to bring you the news that the Red Sox have won the World Series of Baseball, and if that magician is extremely effective at his task, then your having observed that the Yankees win the World Series every time it occurs (which is strangely not every year, but some years have no World Series as far as you have heard) is not evidence at all about how often the Red Sox have won the World Series.
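The “not evidence at all” claim can be restated in Bayesian terms: when the selection is perfect, the likelihood of your observation is the same whether or not the hypothesis is true, so the posterior equals the prior. A minimal sketch with made-up numbers:

```python
def posterior(prior: float, likelihood_if_true: float, likelihood_if_false: float) -> float:
    """Bayes' rule for a binary hypothesis."""
    num = prior * likelihood_if_true
    return num / (num + (1 - prior) * likelihood_if_false)

# Hypothesis: "the Red Sox sometimes win." Observation: "every World Series
# result that has reached me was a Yankees win."
# If the magician is perfectly effective, that observation is exactly what I
# would see whether or not the Red Sox ever win, so both likelihoods are 1:
prior = 0.5
print(posterior(prior, likelihood_if_true=1.0, likelihood_if_false=1.0))  # 0.5 -- unchanged

# Contrast with no selection effect, where a long run of Yankees-only news
# would be far likelier if the Red Sox in fact never win (0.05 is made up):
print(posterior(prior, likelihood_if_true=0.05, likelihood_if_false=1.0))  # about 0.048
```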
And the fact that Eliezer has been saying for at least a few months now that AGI could kill us all any day now—that the probability that it will happen 15 years from now is greater than the probability that it will happen today, but the probability that it will happen today is nothing to scoff at—is very weak evidence against what he’s been saying, if it is evidence against it at all. A sufficiently rational person will assign to what he has been saying the same, or very nearly the same, probability he or she would have assigned if Eliezer had started saying it today. In both cases, a sufficiently rational person will focus almost entirely on Eliezer’s argument (complicated though it is) and the counterarguments, and will give almost no weight to how long Eliezer has been saying it or how long AGIs have been in existence. Or more precisely, that is what a sufficiently rational person would do if he or she believed that he or she is unlikely to receive any advance warning of a deadly strike by the AGI beyond the warnings given so far by Eliezer and other AGI pessimists.
Eliezer’s argument is more complicated than the reasoning that tells us that the sun will not explode any time soon. More complicated means more likely to contain a subtle flaw. Moreover, it has been reviewed by fewer experts than the solar argument has. Consequently, here is a situation in which it would be genuinely useful to use statistical information (e.g., the fact that research labs have been running AGIs for years, ChatGPT being an AGI for example, combined with the fact that we are still alive), but that statistical information is in fact, IMO, useless because of the extremely strong observational selection effects.