I looked at the version 2017-12-30 10:48:11Z.
Overall, I think it’s a nice, systematic overview. Below are some comments.
I should note that I’m not very expert on these things. This is also why the additional literature I mention is mostly weakly related stuff from FRI, the organization I work for. Sorry about that.
An abstract would be nice.
Locators in the citations would be useful, i.e. “Beckstead (2013, sect. XYZ)” instead of just “Beckstead (2013)” when you talk about some specific section of the Beckstead paper. (Cf. the section “Pageless Documentation” of the humorous Academic Citation Practice: A Sinking Sheep? by Ole Bjørn Rekdal.)
>from a totalist, consequentialist, and welfarist (but not necessarily utilitarian) point of view
I don’t think much of your analysis assumes welfarism (as I understand it)? Q_w could easily denote things other than welfare (e.g., how virtuous, free, productive, autonomous, or natural the mean person is), right? (I guess some of the discussion sections are fairly welfarist, i.e. they talk about suffering, etc., rather than freedom and so forth.)
>an existential risk as one where an adverse outcome would either annihilate Earth-originating intelligent life or permanently and drastically curtail its potential.
Maybe some people would interpret this definition as excluding some of the “shrieks” and “whimpers”, since in some of them, “humanity’s potential is realized” in that it colonizes space, but not in accordance with, e.g., the reader’s values. Anyway, I think this definition is essentially a quote from Bostrom (maybe use quotation marks?), so it’s alright.
>The first is the probability P of reaching time t.
Maybe say more about why you separate N_w(t) (in the continuous model) into P(t) and N(t)?
I also don’t quite understand whether equation 1 is intended as the expected value of the future or as the expected value of a set of futures w that all have the same N_w(t) and Q_w(t). The problem is that if it’s the expected value of the future, I don’t get how you can simplify something like
∑_w P(w) ∑_{t=1}^∞ N_w(t) Q_w(t)
into the right side of your equation 1. (E.g., you can’t just let N(t) and Q(t) denote expected numbers of moral patients and expected mean qualities of life, because the mean qualities in larger worlds ought to count for more, right?)
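To make that concrete with made-up numbers (my own toy example, dropping the time index for simplicity): suppose there are two equally likely futures, w_1 with N = 10 and Q = 1, and w_2 with N = 1000 and Q = 2. Then E[N] = 505 and E[Q] = 1.5, so E[N]·E[Q] = 757.5, whereas ∑_w P(w) N_w Q_w = 0.5·10·1 + 0.5·1000·2 = 1005. In general E[N·Q] = E[N]·E[Q] + Cov(N, Q), so reading N(t) and Q(t) as expectations drops the covariance term whenever population size and quality of life are correlated across futures.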
I suspect that when reading the start of sect. 3.1, a lot of readers will wonder whether you endorse all the assumptions underlying your model of P(t). In particular, I would guess that people would disagree with the following two assumptions:
-> Short-term x-risk reduction (r_1) doesn’t have any effect on long-term risk (r). Perhaps this is true for some fairly specific work on preventing extinction, but it seems less likely for interventions like building up the UN (to avoid all kinds of conflict, coordinate against risks, etc.).
-> Long-term extinction risk is constant. I haven’t thought much about these issues, but I would guess that extinction risk becomes much lower once there is a self-sustaining colony on Mars (a toy numerical sketch of how much this assumption matters follows below).
Reading further, I see that you address these in sections 3.2 and 3.3. Maybe you could mention/refer to these somewhere near the start of sect. 3.1.
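To get a feel for how much the constant-risk assumption matters, here is a toy sketch (my own illustration, not the paper’s actual model; the risk numbers and the period-100 “milestone” are made up):

```python
# Toy illustration (not the paper's model): cumulative survival probability P(t)
# under (a) a constant per-period extinction risk r, versus (b) the same risk
# until a hypothetical "safe" milestone at period T (say, a self-sustaining
# off-world colony), after which the per-period risk drops to r_low.

def survival_prob(t, r=0.001, r_low=1e-6, T=None):
    """Probability of surviving through period t."""
    p = 1.0
    for period in range(1, t + 1):
        risk = r if (T is None or period <= T) else r_low
        p *= 1 - risk
    return p

print(survival_prob(10_000))         # constant risk: ~4.5e-5, decays towards zero
print(survival_prob(10_000, T=100))  # risk drops after period 100: ~0.90, plateaus
```

Under constant risk almost all of the probability mass is eventually lost, while a one-time drop in the risk level leaves P(t) roughly flat afterwards, which is why considerations like the Mars colony can change the long-run numbers a lot.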
On page 3, you say that the derivative of -P(t) w.r.t. r_1 denotes the value of reducing r_1 by one unit. This is true in this case because P(t) is linear in r_1. But in general, the value of reducing r_1 by one unit is just P(t,r_1-1)-P(t,r_1), right?
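(To spell that out under my guess at the model’s form, say P(t) = (1 − r_1)(1 − r)^{t−1}: there −∂P(t)/∂r_1 = (1 − r)^{t−1}, which coincides with the finite difference P(t, r_1 − 1) − P(t, r_1) = (1 − r)^{t−1} exactly because P(t) is linear in r_1; for a nonlinear P(t), the derivative would only give a first-order approximation to the value of a unit reduction.)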
Is equation 3, combined with the view that the cost of one unit of f_1 is constant, consistent with Ord’s “A plausible model would be that it is roughly as difficult to halve the risk per century, regardless of its starting probability, and more generally, that it is equally difficult to reduce it by some proportion regardless of its absolute value beforehand.”? With your model, it looks like bringing f_1 from 0 to 0.5 and thus halving r_1 is just as expensive as bringing f_1 from 0.5 to 1.
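To illustrate with made-up numbers, reading equation 3 as saying that the post-intervention risk is something like (1 − f_1)·r_1 (which is how I understand your halving example): at a constant cost c per unit of f_1, moving f_1 from 0 to 0.5 (risk r_1 → r_1/2) costs 0.5c, and moving it from 0.5 to 1 (risk r_1/2 → 0) also costs 0.5c. Under Ord’s model, by contrast, each successive halving of the risk is about equally difficult, so going from r_1/2 to r_1/4 would cost as much as the first halving, and eliminating the risk entirely would require unboundedly many halvings and hence unbounded cost. The two cost structures diverge most sharply near f_1 = 1.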
On p. 7, “not to far off”—probably you mean “too”?
>For example, perhaps we will inevitably develop some hypothetical weapons that give so large an advantage to offence over defence that civilisation is certain to be destroyed.
AI risk is another black ball that will become more accessible. But maybe you would rather not model it as extinction: AI risk doesn’t necessarily explain the Fermi paradox, and AIs may create sentient beings.
>Ord argues that we may be able to expect future generations to be more interested in risk reduction, implying increasing f_i
I thought f_i was meant to model the impact that we can have on r_i? So, to me it seems more sensible to model the involvement of future generations, to the extent that we can’t influence it, as “a kind of event E” (as you propose) or, more generally, as implying that the non-intervention risk levels r_i decrease.
>This would only reinforce the case for extinction risk reduction.
It seems that future generations caring about ERR makes short-term ERR more important (because the long-term future is longer and thus can contain more value). But it makes long-term ERR less important, because future generations will, e.g., do AI safety research anyway. (In section “Future resources” of my blog post Complications in evaluating neglectedness, I make the general point that for evaluating the neglectedness of an intervention, one has to look at how many resources future generations will invest into that intervention.)
>There is one case in which it clearly is not: if space colonisation is in fact likely to involve risk-independent islands. Then high population goes with low risk, increasing the value of the future relative to the basic model
(I find risk-independent islands fairly plausible.)
>The expected number of people who will live in period t is
You introduced N(t) as the number of morally relevant beings (rather than “people”).
>However, this increase in population may be due to stop soon,
Although it is well-known that some predict population to stagnate at 9 billion or so, a high-quality citation would be nice.
>The likelihood of space colonisation, a high-profile issue on which billions of dollars is spent per year (Masters, 2015), also seems relatively hard to affect. Extinction risk reduction, on the other hand, is relatively neglected (Bostrom, 2013; Todd, 2017), so it could be easier to achieve progress in this area.
I have only briefly (in part due to the lack of locators) checked the two sources, but it seems that this varies strongly between different extinction risks. For instance, according to Todd (2017), >300bn (and thus much more than on space colonization) is spent on climate change, 1-10bn on nuclear security, 1bn on extreme pandemic prevention. So, overall much more money goes into extinction risk reduction than into space colonization. (This is not too surprising. People don’t want to die, and they don’t want their children or grandchildren to die. They don’t care nearly as much about whether some elite group of people will live on Mars in 50 years.)
Of course, there are a lot of complications to this neglectedness analysis. (All three points I discuss in Complications in evaluating neglectedness seem to apply.)
>Some people believe that it’s nearly impossible to have a consistent impact on Q(t) so far into the future.
Probably a reference would be good. I guess to the extent that we can’t affect far future Q(t), we also can’t affect far future r_i.
>However, this individual may be biased against ending things, for instance because of the survival instinct, and so could individuals or groups in the future. The extent of this bias is an open question.
It’s also a bit unclear (at least based on what you write) what legitimizes calling this a bias, rather than simply a revealed preference not to die (even in cases in which you or I as outside observers might think it preferable not to live) and thus evidence that their lives are positive. Probably one has to argue via status quo bias or something like that.
>We may further speculate that if the future is controlled by altruistic values, even powerless persons are likely to have lives worth living. If society is highly knowledgeable and technologically sophisticated, and decisions are made altruistically, it’s plausible that many sources of suffering would eventually be removed, and no new ones created unnecessarily. Selfish values, on the other hand, do not care about the suffering of powerless sentients.
This makes things sound more binary than they actually are. (I’m sure you’re aware of this.) In the usual sense of the word, people could be “altruistic” but in a non-consequentialist way. There may be lots of suffering in such worlds. (E.g., some libertarians may regard intervening in the economy as unethical even if companies start creating slaves. A socialist, on the other hand, may view capitalism as fundamentally unjust, try to regulate/control the economy, and thus cause a lot of poverty.) Also, even if someone is altruistic in a fairly consequentialist way, they may still not care about all the beings that you, I, or the reader care about. E.g., economists tend to be consequentialists but rarely consider animal welfare.
Regarding animal suffering (both wild-animal suffering and factory farming), I think it is worth noting that it seems fairly unlikely that these will remain economically efficient in the long term, but that the general underlying principles (Darwinian suffering and the exploitation of the powerless) might carry over to other beings (like sentient AIs).
Another way in which the future may be negative, by the way, is the Malthusian trap. (Of course, some would regard at least some Malthusian trap scenarios as positive; see, e.g., Robin Hanson’s The Age of Em.) Presumably this belongs in 5.2.1, since it’s a kind of coordination failure.
As you say, I think the option value argument isn’t super persuasive, because it seems unlikely that the people in power in a million years share my (meta-)values (or agree with the way I do compromise).
Re 5.2.3: Another relevant reference on why one should cooperate—which is somewhat separate from the point that if mutual cooperation works out the gains from trade are great—is Brian Tomasik’s Reasons to Be Nice to Other Value Systems.
>One way to increase Q(t) is to advocate for positive value changes in the direction of greater consideration for powerless sentients, or to promote moral enhancement (Persson and Savulescu, 2008). Another approach might be to work to improve political stability and coordination, making conflict less likely as well as increasing the chance that moral progress continues.
Relevant:
https://foundational-research.org/international-cooperation-vs-ai-arms-race/
http://reducing-suffering.org/values-spreading-often-important-extinction-risk/