The topics of existential risk, AI, and other future technologies inherently require the use of very large numbers, far beyond any of those encountered when discussing normal, everyday risks and rewards.
Note that the large number used in this particular back-of-envelope calculation is the world population of several billion, not the still much larger numbers involved in astronomical waste.
Even if this is so, there is tons of evidence that humans suck at reasoning about such large numbers. If you want to make an extraordinary claim like the one you made above, then you need to put forth a large amount of evidence to support it. And on such a far-mode topic, the likelihood of your argument being correct decreases exponentially with the number of steps in the inferential chain.
I only skimmed through the video, but assuming that the estimates at 11:36 are what you’re referring to, those numbers are both seemingly quite high and entirely unjustified in the presentation. It also overlooks things like the fact that utility doesn’t scale linearly in number of lives saved when calculating the benefit per dollar.
Whether or not those numbers are correct, presenting them in their current form seems unlikely to be very productive. Likely either the person you are talking to already agrees, or the 8 lives figure triggers an absurdity heuristic that will demand large amounts of evidence. Heck, I’m already pretty familiar with the arguments, and I still get a small amount of negative affect whenever someone tries to make the “donating to X-risk has expected utility” argument.
I don’t think anyone on LW disagrees that reducing xrisk substantially carries an extremely high utility. The points of disagreement are over whether SIAI can non-trivially reduce xrisk, and whether they are the most effective way to do so. At least on this website, this seems like the more productive path of discussion.
Keep in mind that estimation is the best we have. You can’t appeal to Nature for not having been given a warning that meets a sufficient standard of rigor. Avoiding all actions of uncertain character dealing with huge consequences is certainly a bad strategy. Any one of such actions might have a big chance of not working out, but not taking any of them is guaranteed to be unhelpful.
You can’t appeal to Nature for not having been given a warning that meets a sufficient standard of rigor.
From a Bayesian point of view, your prior should place low probability on a figure like “8 lives per dollar”. Therefore, lots of evidence is required to overcome that prior.
From a decision-theoretic point of view, the general strategy of believing sketchy (with no offense intended to Anna; I look forward to reading the paper when it is written) arguments that reach extreme conclusions at the end is a bad strategy. There would have to be a reason why this argument was somehow different from all other arguments of this form.
Avoiding all actions of uncertain character dealing with huge consequences is certainly a bad strategy. Any one of such actions might have a big chance of not working out, but not taking any of them is guaranteed to be unhelpful.
If there were tons of actions lying around with similarly huge potential positive consequences, then I would be first in line to take them (for exactly the reason you gave). As it stands, it seems like in reality I get a one-time chance to reduce p(bad singularity) by some small amount. More explicitly, it seems like SIAI’s research program reduces xrisk by some small amount, and a handful of other programs would also reduce xrisk by some small amount. There is no combined set of programs that cumulatively reduces xrisk by some large amount (say > 3% to be explicit).
I have to admit that I’m a little bit confused about how to reason here. The issue is that any action I can personally take will only decrease xrisk by some small amount anyways. But to me the situation feels different if society can collectively decrease xrisk by some large amount, versus if even collectively we can only decrease it by some small amount. My current estimate is that we are in the latter case, not the former—even if xrisk research had unlimited funding, we could only decrease total xrisk by something like 1%. My intuitions here are further complicated by the fact that I also think humans are very bad at estimating small probabilities—so the 1% figure could very easily be a gross overestimate, whereas I think a 5% figure is starting to get into the range where humans are a bit better at estimating, and is less likely to be such a bad overestimate.
From a Bayesian point of view, your prior should place low probability on a figure like “8 lives per dollar”. Therefore, lots of evidence is required to overcome that prior.
My prior contains no such provisions; there are many possible worlds where tiny applications of resources have apparently disproportionate effect, and from the outside they don’t look so unlikely to me.
There are good reasons to be suspicious of claims of unusual effectiveness, but I recommend making that reasoning explicit and seeing what it says about this situation and how strongly.
There are also good reasons to be suspicious of arguments involving tiny probabilities, but keep in mind: first, you probably aren’t 97% confident that we have so little control over the future (I’ve thought about it a lot and am much more optimistic), and second, that even in a pessimistic scenario it is clearly worth thinking seriously about how to handle this sort of uncertainty, because there is quite a lot to gain.
Of course this isn’t an argument that you should support the SIAI in particular (though it may be worth doing some information-gathering to understand what they are currently doing), but that you should continue to optimize in good faith.
I don’t think anyone on LW disagrees that reducing xrisk substantially carries an extremely high utility.
I’m glad you agree.
The points of disagreement are over whether SIAI can non-trivially reduce xrisk, and whether they are the most effective way to do so. At least on this website, this seems like the more productive path of discussion.
I’d be very appreciative to hear if you know of someone doing more.
Well for instance, certain approaches to AGI are more likely to lead to something friendly than other approaches are. If you believe that approach A is 1% less likely to lead to a bad outcome than approach B, then funding research in approach A is already compelling.
In my mind, a well-reasoned statistical approach with good software engineering methodologies is the mainstream approach that is least likely to lead to a bad outcome. It has the advantage that there is already a large amount of related research being done, hence there is actually a reasonable chance that such an AGI would be the first to be implemented. My personal estimate is that such an approach carries about 10% less risk than an alternative approach where the statistics and software are both hacked together.
In contrast, I estimate that SIAI’s FAI approach would carry about 90% less risk if implemented than a hacked-together AGI. However, I assign very low probability to SIAI’s current approach succeeding in time. I therefore consider the above-mentioned approach more effective.
Another alternative to SIAI that doesn’t require estimates about any specific research program would be to fund the creation of high-status AI researchers who care about Friendliness. Then they are free to steer the field as a whole towards whatever direction is determined to carry the least risk, after we have the chance to do further research to determine that direction.
My personal estimate is that such an approach carries about 10% less risk than an alternative approach where the statistics and software are both hacked together.
I don’t understand what you mean by “10% less risk”. Do you think any given project using “a well-reasoned statistical approach with good software engineering methodologies” has at least 10% chance of leading to a positive Singularity? Or each such project has a P*0.9 probability of causing an existential disaster, where P is the probability of disaster of a “hacked together” project. Or something else?
You said “I therefore consider the above-mentioned approach more effective.”, but if all you’re claiming is that the above mentioned approach (“a well-reasoned statistical approach with good software engineering methodologies”) has a P*0.9 probability of causing an existential disaster, and not claiming that it has a significant chance of causing a positive Singularity, then why do you think funding such projects is effective for reducing existential risk? Is the idea that each such project would displace a “hacked together” project that would otherwise be started?
EDIT: I originally misinterpreted your post slightly, and corrected my reply accordingly.
Not quite. The hope is that such a project will succeed before any other hacked-together project succeeds. More broadly, the hope is that partial successes using principled methodologies will convince them to be more widely adopted in the AI community as a whole, and more to the point that a contingent of highly successful AI researchers advocating Friendliness can change the overall mindset of the field.
The default is a hacked-together AI project. SIAI’s FAI research is trying to displace this, but I don’t think they will succeed (my information on this is purely outside-view, however).
An explicit instantiation of some of my calculations:
SIAI approach: 0.1% chance of replacing P with 0.1P
Approach that integrates with the rest of the AI community: 30% chance of replacing P with 0.9P
In the first case, P is basically staying constant, in the second case it is being replaced with 0.97P.
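To make the arithmetic above easy to check, here is a minimal sketch. The function name `expected_risk_multiplier` is just an illustrative helper, and it assumes the simplest possible model: with probability `p_success` the baseline risk P is multiplied by the stated factor, and otherwise P is unchanged.

```python
def expected_risk_multiplier(p_success: float, multiplier_if_success: float) -> float:
    """Expected factor applied to the baseline risk P.

    Assumes the approach either succeeds (risk becomes multiplier_if_success * P)
    or fails (risk stays at P); nothing in between.
    """
    return (1.0 - p_success) * 1.0 + p_success * multiplier_if_success


# Figures taken from the comment above.
siai = expected_risk_multiplier(0.001, 0.1)        # 0.1% chance of P -> 0.1P
integrated = expected_risk_multiplier(0.30, 0.9)   # 30% chance of P -> 0.9P

print(f"SIAI approach:       P -> {siai:.4f} P")
print(f"Integrated approach: P -> {integrated:.4f} P")
```

Under these assumptions the first multiplier comes out at roughly 0.9991 and the second at 0.97, matching the figures quoted above. The conclusion is only as good as the two probability estimates that go in.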
The only specific plan I have right now is to put myself in a position to hire smart people to work on this problem. I think the most robust way to do this is to get a faculty position somewhere, but I need to consider the higher relative efficiency of corporations over universities some more to figure out if it’s worthwhile to go with the higher-volatility route of industry.
Also, as Paul notes, I need to consider other approaches to x-risk reduction as well to see if I can do better than my current plan. The main argument in favor of my current plan is that there is a clear path to the goal, with only modest technical hurdles and no major social hurdles. I don’t particularly like plans that start to get fuzzier than that, but I am willing to be convinced that this is irrational.
EDIT: To be more explicit, my current goal is to become one of said high-status AI researchers. I am worried that this is slightly self-serving, although I think I have good reason to believe that I have a comparative advantage at this task.
Another alternative to SIAI that doesn’t require estimates about any specific research program would be to fund the creation of high-status AI researchers who care about Friendliness.
That seems more of an alternative within SIAI than an alternative to SIAI. With more funding, their Associate Research Program can promote the importance of Friendliness and increase the status of researchers who care about it.
I’d be very appreciative to hear if you know of someone doing more.
Over the coming months I’m going to be doing an investigation of the non-profits affiliated with the Nuclear Threat Initiative with a view toward finding x-risk reduction charities other than SIAI & FHI. I’ll report back what I learn but it may be a while.
I’m under the impression that nuclear war doesn’t pose an existential risk. Do you disagree? If so, I probably ought to make a discussion post on the subject so we don’t take this one too far off topic.
My impression is that the risk of immediate extinction due to nuclear war is very small but that a nuclear war could cripple civilization to the point of not being able to recover enough to effect a positive singularity; also it would plausibly increase other x-risks: intuitively, nuclear war would destabilize society, and people are less likely to take safety precautions in an unstable society when developing advanced technologies than they otherwise would be. I’d give a subjective estimate of 0.1% − 1% of nuclear war preventing a positive singularity.
Good question. My intended meaning was the second of the meanings that you listed “the probability of a positive singularity is 0.1%-1% lower than the probability of a positive singularity given no nuclear war.” Would be interested to hear any thoughts that you have about these things.
I can’t think of a mechanism through which recovery would become long-term impossible, but maybe there is one. People taking fewer safety precautions in a destabilized society does sound plausible. There are probably a number of other, similarly important effects of nuclear war on existential risk to take into account. Different technologies (IA, uploading, AGI, Friendliness philosophy) have different motivations behind them that would probably be differently affected by a nuclear war. Memes would have more time to come closer to some sort of equilibrium in various relevant groups. To the extent that there are nontrivial existential risks not depending on future technology, they would have more time to strike. Catastrophes would be more psychologically salient, or maybe the idea of future nuclear war would overshadow other kinds of catastrophe. Power would be more in the hands of those who weren’t involved in the nuclear war.
In any case, the effect of nuclear war on existential risk seems like a nontrivial question that we’d have to have a better idea about before we could decide that resources are better spent on nuclear war prevention than something else. To make things more complicated, it’s possible that preventing nuclear war would on average decrease existential risk but that a specific measure to prevent nuclear war would increase existential risk (or vice versa), because the specific kinds of nuclear war that the measure prevents are atypical.
The number and strength of reasons we see one way or the other may depend more on time people have spent searching specifically for reasons for/against than on what reasons exist. The main reason to expect an imbalance there is that nuclear war causes huge amounts of death and suffering, and so people will be motivated to rationalize that it will also be a bad thing according to this mostly independent criterion of existential risk minimization; or people may overcorrect for that effect or have other biases for thinking nuclear war would prevent existential risk. To the extent that our misgivings about failing to do enough to stop nuclear war have to do with worries that existential risk reduction may not outweigh huge present death and suffering, we’d do better to acknowledge those worries than to rationalize ourselves into thinking there’s never a conflict.
Without knowing anything about specific risk mitigation proposals, I would guess that there’s even more expected return from looking into weird, hard-to-think-about technologies like MNT than from looking into nuclear war, because less of the low-hanging fruit there would already have been picked. But more specific information could easily overrule that presumption, and some people within SingInst seem to have pretty high estimates of the return from efforts to prevent nuclear war, so who knows.
I can’t think of a mechanism through which recovery would become long-term impossible, but maybe there is one.
I have little idea of how likely it is but a nuclear winter could seriously hamper human mobility.
Widespread radiation would further hamper human mobility.
Redeveloping preexisting infrastructure could require natural resources on an order of magnitude comparable to the infrastructure that we have today. Right now we have the efficient market hypothesis to help out with natural resource shortage, but upsetting the trajectory of our development could exacerbate the problem.
Note that a probability of 0.1% isn’t so large (even taking into account all of the other things that could interfere with a positive singularity).
Different technologies (IA, uploading, AGI, Friendliness philosophy) have different motivations behind them that would probably be differently affected by a nuclear war. Memes would have more time to come closer to some sort of equilibrium in various relevant groups.
Reasoning productively about the expected value of these things presently seems to me to be too difficult (but I’m open to changing my mind if you have ideas).
To the extent that there are nontrivial existential risks not depending on future technology, they would have more time to strike.
With the exception of natural resource shortage (which I mentioned above) I doubt that this is within an order of magnitude of significance of other relevant factors provided that we’re talking about a delay on the order of fewer than 100 years (maybe similarly for a delay of 1000 years; I would have to think about it).
Catastrophes would be more psychologically salient, or maybe the idea of future nuclear war would overshadow other kinds of catastrophe.
Similarly, I doubt that this would be game-changing.
Power would be more in the hands of those who weren’t involved in the nuclear war.
These seem worthy of further contemplation—is the development of future technologies more likely to go in Australia than in the current major powers, etc.
In any case, the effect of nuclear war on existential risk seems like a nontrivial question that we’d have to have a better idea about before we could decide that resources are better spent on nuclear war prevention than something else.
This seems reasonable. As I mentioned, I presently attach high expected x-risk reduction to nuclear war prevention, but my confidence is sufficiently unstable at present that the value of devoting resources to gathering more information outweighs the value of donating to nuclear war reduction charities.
To make things more complicated, it’s possible that preventing nuclear war would on average decrease existential risk but that a specific measure to prevent nuclear war would increase existential risk (or vice versa), because the specific kinds of nuclear war that the measure prevents are atypical.
Yes. In the course of researching nuclear threat reduction charities I hope to learn what options are on the table.
Without knowing anything about specific risk mitigation proposals, I would guess that there’s even more expected return from looking into weird, hard-to-think-about technologies like MNT than from looking into nuclear war, because less of the low-hanging fruit there would already have been picked.
On the other hand there may not be low hanging fruit attached to thinking about weird, hard-to-think-about technologies like MNT. I do however plan on looking into the Foresight Institute.
Thanks for clarifying and I hope your research goes well. If I’m not mistaken, you can see the 0.1% calculation as the product of three things: the probability nuclear war happens, the probability that if it happens it’s such that it prevents any future positive singularities that otherwise would have happened, and the probability a positive singularity would otherwise have happened. If the first and third probabilities are, say, 1⁄5 and 1⁄4, then the answer will be 1⁄20 of the middle probability, so your 0.1%-1% answer corresponds to a 2%-20% chance that if a nuclear war happens then it’s such that it prevents any future positive singularities that would otherwise have happened. Certainly the lower end and maybe the upper end of that range seem like they could plausibly end up being close to our best estimate. But note that you have to look at the net effect after taking into account effects in both directions; I would still put substantial probability on this estimate ending up effectively negative, also. (Probabilities can’t really go negative, so the interpretation I gave above doesn’t really work, but I hope you can see what I mean.)
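The decomposition above can be checked with exact fractions. This is only a sketch of that one calculation; the function name `implied_conditional` is made up for illustration, and the 1/5 and 1/4 inputs are the "say" values from the comment, not estimates anyone has defended.

```python
from fractions import Fraction


def implied_conditional(overall: Fraction, p_war: Fraction, p_positive_otherwise: Fraction) -> Fraction:
    """Back out the middle probability from the three-factor product.

    overall = p_war * p_prevents_given_war * p_positive_otherwise
    """
    return overall / (p_war * p_positive_otherwise)


p_war = Fraction(1, 5)                 # illustrative value from the comment
p_positive_otherwise = Fraction(1, 4)  # illustrative value from the comment

low = implied_conditional(Fraction(1, 1000), p_war, p_positive_otherwise)   # 0.1% overall
high = implied_conditional(Fraction(1, 100), p_war, p_positive_otherwise)   # 1% overall

print(f"Implied P(prevents positive singularity | nuclear war): {float(low):.0%} to {float(high):.0%}")
```

With those inputs the implied conditional probability runs from 1/50 to 1/5, which is the 2%-20% range stated above.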
note that you have to look at the net effect after taking into account effects in both directions; I would still put substantial probability on this estimate ending up effectively negative, also.
I agree and should have been more explicit in taking this into account. However, note that if one assigns a 2:1 odds ratio for (0.1%-1% decrease in x-risk)/(same size increase in x-risk) then the expected value of preventing nuclear war doesn’t drop below 1⁄3 of what it would be if there wasn’t the possibility of nuclear war increasing x-risk: still on the same rough order of magnitude.
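A quick sketch of the odds calculation, under the simplifying assumption that the increase and the decrease in x-risk are the same size and that those are the only two outcomes. The helper name `net_expected_fraction` is hypothetical.

```python
from fractions import Fraction


def net_expected_fraction(odds_decrease: int, odds_increase: int) -> Fraction:
    """Net expected benefit as a fraction of the benefit with no downside.

    Assumes a decrease of size s with odds odds_decrease, and an increase of
    the same size s with odds odds_increase; s cancels out of the ratio.
    """
    total = odds_decrease + odds_increase
    p_decrease = Fraction(odds_decrease, total)
    p_increase = Fraction(odds_increase, total)
    return p_decrease - p_increase


ratio = net_expected_fraction(2, 1)
print(f"Expected value retained at 2:1 odds: {ratio}")
```

At 2:1 odds this gives 2/3 − 1/3 = 1/3 of the no-downside value, so the estimate stays within the same rough order of magnitude as claimed. At 1:1 odds it would drop to zero.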
Thanks for the clarification on the estimate. Unhappy as it makes me to say it, I suspect that nuclear war or other non-existential catastrophe would overall reduce existential risk, because we’d have more time to think about existential risk mitigation while we rebuild society. However I suspect that trying to bring nuclear war about as a result of this reasoning is not a winning strategy.
Building society the first time around, we were able to take advantage of various useful natural resources such as relatively plentiful coal and (later) oil. After a nuclear war or some other civilization-wrecking catastrophe, it might be Very Difficult Indeed to rebuild without those resources at our disposal. It’s difficult enough even now, with everything basically still working nicely, to see how to wean ourselves off fossil fuels, as for various reasons many people think we should do. Now imagine trying to build a nuclear power industry or highly efficient solar cells with our existing energy infrastructure in ruins.
So it looks to me as if (1) our best prospects for long-term x-risk avoidance all involve advanced technology (space travel, AI, nanothingies, …) and (2) a major not-immediately-existential catastrophe could seriously jeopardize our prospects of ever developing such technology, so (3) such a catastrophe should be regarded as a big increase in x-risk.
I’ve heard arguments for and against “it might turn out to be too hard the second time around”. I think overall that it’s more likely than not that we would eventually succeed in rebuilding a technological society, but that’s the strongest I could put it, ie it’s very plausible that we would never do so.
If enough of our existing thinking survives, the thinking time that rebuilding civilization would give us might move things a little in our favour WRT AI++, MNT etc. I don’t know which side does better on this tradeoff. However I seriously doubt that trying to bring about the collapse of civilization is the most efficient way to mitigate existential risk.
Also, and I hate to be this selfish about it but there it is, if civilization ends I definitely die either way, and I’d kind of prefer not to.
Building society the first time around, we were able to take advantage of various useful natural resources such as relatively plentiful coal and (later) oil. After a nuclear war or some other civilization-wrecking catastrophe, it might be Very Difficult Indeed to rebuild without those resources at our disposal.
We have a huge mountain of coal, and will do for the next hundred years or so. Doing without doesn’t seem very likely.
How easily accessible is that coal to people whose civilization has collapsed, taking most of the industrial machinery with it? (That’s a genuine question. Naively, it seems like the easiest-to-get-at bits would have been mined out first, leaving the harder bits. How much harder they are, and how big a problem that would be, I have no idea.)
Unhappy as it makes me to say it, I suspect that nuclear war or other non-existential catastrophe would overall reduce existential risk, because we’d have more time to think about existential risk mitigation while we rebuild society. However I suspect that trying to bring nuclear war about as a result of this reasoning is not a winning strategy.
Technical challenges? Difficulty in coordinating? Are there other candidate setbacks?
because we’d have more time to think about existential risk mitigation while we rebuild society
It may be highly unproductive to think about advanced future technologies in very much detail before there’s a credible research program on the table, on account of the search tree involving dozens of orders of magnitude. I presently believe this to be the case.
I do think that we can get better at some relevant things at present (learning how to obtain as accurate as realistically possible predictions about probable government behaviors, etc.) and that all else being equal we could benefit from more time thinking about these things rather than less time.
However, it’s not clear to me that the time so gained would outweigh a presumed loss in clear thinking post-nuclear war and I currently believe that the loss would be substantially greater than the gain.
As steven0461 mentioned, “some people within SingInst seem to have pretty high estimates of the return from efforts to prevent nuclear war.” I haven’t had a chance to talk about this with them in detail; but it updates me in the direction of attaching high expected value reduction to nuclear war risk reduction.
My positions on these points are very much subject to change with incoming information.
It may be highly unproductive to think about advanced future technologies in very much detail before there’s a credible research program on the table, on account of the search tree involving dozens of orders of magnitude. I presently believe this to be the case.
“because we’d have more time to think about existential risk mitigation while we rebuild society.”
A more likely result: the religious crazies will take over, and they either don’t think existential risk can exist (because God would prevent it) or they think preventing existential risk would be blasphemy (because God ought to be allowed to destroy us). Or they even actively work to make it happen and bring about God’s judgment.
And then humanity dies, because both denying and embracing existential risk causes it to come nearer.
Why would it scale linearly? I agree that it scales linearly over relatively small regimes (on the order of millions of lives) by fungibility, but I see no reason why that needs to be true for trillions of lives or more (and at least some reasons why it can’t scale linearly forever).
Well, sure, the absurdity heuristic is terrible.
Re-read the context of what I wrote. Whether or not the absurdity heuristic is a good heuristic, it is one that is fairly common among humans, so if your goal is to have a productive conversation with someone who doesn’t already agree with you, you shouldn’t throw out such an ambitious figure without a solid argument. You can almost certainly make whatever point you want to make with more conservative numbers.
Why would it scale linearly? I agree that it scales linearly over relatively small regimes (on the order of millions of lives) by fungibility, but I see no reason why that needs to be true for trillions of lives or more (and at least some reasons why it can’t scale linearly forever).
Let’s say you currently have a trillion utility-producing thingies (call them humans, if it helps). You’re pretty happy. In fact, you have so many that the utility of more is negligible.
Then Doctor Evil appears! He has five people hostage, he’s holding them to ransom!
His ransom: kill off six of the people you already have.
Since those trillion people’s value didn’t scale linearly, reducing them by six isn’t nearly as important as five people!
Rinse. Repeat.
Re-read the context of what I wrote. Whether or not the absurdity heuristic is a good heuristic, it is one that is fairly common among humans, so if your goal is to have a productive conversation with someone who doesn’t already agree with you, you shouldn’t throw out such an ambitious figure without a solid argument. You can almost certainly make whatever point you want to make with more conservative numbers.
Since those trillion people’s value didn’t scale linearly, reducing them by six isn’t nearly as important as five people!
This isn’t true—the choice is between N-6 and N-5 people; N-5 people is clearly better. Not to be too blunt, but I think you’ve badly misunderstood the concept of a utility function.
Well sure, if we’re talking Dark Arts...
Actively making your argument objectionable is very different from avoiding the use of the Dark Arts. In fact, arguably it has the same problem that the Dark Arts have, which is that it causes someone to believe something (in this case, the negation of what you want to show) for reasons unrelated to the validity of the supporting argument.
This isn’t true—the choice is between N-6 and N-5 people; N-5 people is clearly better. Not to be too blunt, but I think you’ve badly misunderstood the concept of a utility function.
Yes. The hypothetical utility function could, e.g., take a list of items and return the utility. It need not satisfy f(A,B)=f(A)+f(B), where “,” is list concatenation. For example, this would apply to the worth of books, where a library is worth more than however many copies of some one book. To simply sum the values of books considered independently is ridiculous; it’s like valuing books by weight. The information content of the brain, or whatever else it is that you might value (process?), is a fair bit more like a book than it’s like the weight of the books.
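As a toy illustration of a utility function over lists that is not additive under concatenation, here is a sketch that values a library by its number of distinct titles. The function `library_utility` and the "count distinct titles" rule are assumptions chosen for simplicity, not a proposal for how anything should actually be valued.

```python
from typing import List


def library_utility(books: List[str]) -> int:
    """Toy utility: a library is worth its number of distinct titles.

    Duplicate copies add nothing, so this is deliberately non-additive.
    """
    return len(set(books))


shelf_a = ["Iliad", "Odyssey"]
shelf_b = ["Iliad", "Iliad"]

combined = library_utility(shelf_a + shelf_b)                   # utility of the concatenated list
summed = library_utility(shelf_a) + library_utility(shelf_b)    # sum of the separate utilities

print(f"f(A,B) = {combined}, f(A) + f(B) = {summed}")
```

Here the combined shelf has two distinct titles while the separately summed values give three, so f(A,B) ≠ f(A)+f(B). Valuing "by weight" would correspond to `len(books)` instead, which is additive.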
Actively making your argument objectionable is very different from avoiding the use of the Dark Arts. In fact, arguably it has the same problem that the Dark Arts have, which is that it causes someone to believe something (in this case, the negation of what you want to show) for reasons unrelated to the validity of the supporting argument.
Sorry, I only meant to imply that I had assumed we were discussing rationality, given the low status of the “Dark Arts”. Not that there was anything wrong with such discussion; indeed, I’m all for it.
Those extra five should be added onto the trillion you already have; not considered separately.
That depends on how you do the accounting here. If we check the utility provided by saving five people, it’s high. If we check the utility provided by increasing a population of a trillion, it’s unfathomably low.
This is, in fact, the point.
Intuitively, we should be able to meaningfully analyse the utility of a part without talking about—or even knowing—the utility of the whole. Discovering vast interstellar civilizations should not invalidate our calculations made on how to save the most lives.
Let us assume that we have A known people in existence. Dr. Evil presents us with B previously unknown people, and threatens to kill them unless we kill C out of our A known people (where C<A). The question is, whether it is ethically better to let B people die, or to let C people die. (It is clearly better to save all the people, if possible).
We have a utility function, f(x), which describes the utility produced by x people. Before Dr. Evil turns up, we have A known people; and a total utility of f(A+B). After Dr. Evil arrives, we find that there are more people; we have a total utility of f(A+B) (or f(A+B+1), if Dr. Evil was previously unknown; from here onwards I will assume that Dr. Evil was previously known, and is thus included in A). Dr. Evil offers us a choice, between a total utility of f(A+B-C) or a total utility of f(A).
The immediate answer is that if B>C, it is better for B people to live; while if C>B, then it is better for C people to live. For this to be true for all A, B and C, it is necessary for f(x) to be a monotonically increasing function; that is, a function where f(y)>f(x) if and only if y>x.
Now, you are raising the possibility that there exist a number, D, of people in vast interstellar civilisations who are completely unknown to us. Then Dr. Evil’s choice becomes a choice between a total utility of f(A+B-C+D) and a total utility of f(A+D). Again, as long as f(x) is monotonically increasing, the question of finding the greatest utility is simply a matter of seeing whether B>C or not.
I don’t see any cause for invalidating any of my calculations in the presence of vast interstellar civilisations.
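A rough way to sanity-check that claim is to try a few monotonically increasing functions and see whether the verdict ever changes with D. This is only a sketch: the helper name `better_to_save_b`, the three candidate functions, and the small population figures are all my own illustrative assumptions, picked so that floating-point rounding does not blur the comparison.

```python
import math

def better_to_save_b(f, a, b, c, d=0):
    """True when letting the B people live yields more total utility than sparing the C people."""
    return f(a + b - c + d) > f(a + d)

# Hypothetical monotonically increasing utility functions, for illustration only.
candidates = {"linear": lambda x: x, "sqrt": math.sqrt, "log1p": math.log1p}

a, b, c = 7_000, 5, 1  # small illustrative numbers, not real estimates
for name, f in candidates.items():
    # Collect the verdict for several sizes of the unknown population D.
    verdicts = {better_to_save_b(f, a, b, c, d) for d in (0, 10, 10_000, 10**6)}
    print(name, verdicts)
```

For each function the set of verdicts has a single element, which is what one would expect if only the sign of B-C matters. Swapping b and c flips the verdict for every D.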
It takes effort to pull the lever and divert the trolley. This minuscule amount has to be outweighed by the utility of additional lives. It gets even worse in real situations, where it may cost a great deal to help people.
Ah; now we begin to compare different things. To compare the effort of pulling the lever, against the utility of the additional lives. At this point, yes, the actual magnitude and not just the sign of the difference between f(A+B-C+D) and f(A+D) becomes important; yet D is unknown and unknowable. This means that the magnitude of the difference can only be known with certainty if f(x) is linear; in the case of a nonlinear f(x), the magnitude cannot be known with certainty. I can easily pick out a nonlinear, monotonically increasing function such that the difference between f(A+B-C+D) and f(A+D) can be made arbitrarily small for any positive integer A, B and C (where A+B>C) by simply selecting a suitable positive integer D. A simple example would be f(x)=sqrt(x).
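To make the sqrt example concrete, here is a minimal sketch of how the gain from saving the extra people shrinks as D grows. The function name `sqrt_gain` and the particular values of A, B, C and D are assumptions chosen for illustration. The difference is computed through the conjugate identity sqrt(p) - sqrt(q) = (p - q) / (sqrt(p) + sqrt(q)), because subtracting two nearly equal floating-point square roots directly would lose most of the precision.

```python
import math

def sqrt_gain(a, b, c, d):
    """sqrt(a+b-c+d) - sqrt(a+d), via the conjugate form to avoid cancellation."""
    hi = a + b - c + d
    lo = a + d
    return (hi - lo) / (math.sqrt(hi) + math.sqrt(lo))

a, b, c = 7_000_000_000, 5, 1  # illustrative: A = 7 billion, B = 5, C = 1
for d in (0, 10**12, 10**18, 10**24):
    print(f"D = {d:.0e}: gain = {sqrt_gain(a, b, c, d):.3e}")
```

With a linear f(x) the gain would be B-C = 4 regardless of D; under sqrt it keeps falling as D increases, so any fixed effort cost eventually exceeds it.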
Now, the hypothetical moral agent is in a quandary. Using effort to pick a solution costs utilions. The cost is a simple, straightforward constant; he knows how much that costs. But, with f(x)=sqrt(x), without knowing D, he cannot tell whether the utilions of saving the people is greater or lesser than the utilion cost of picking a solution. (For the purpose of simplicity, I will assume that no-one will ever know that he was in a position to make the choice—that is, his reputation is safe, no matter what he selects). Therefore, he has to make an estimate. He has to guess a value of D. There are multiple strategies that can be followed here:
Try to estimate the most probable value of D. This would require something along the lines of the Drake equation—picking the most likely numbers for the different elements, picking the most likely size of an extraterrestrial civilisation, and doing some multiplication.
Take the most pessimistic possible value of D; D=0. That is, plan as though I am in the worst possible universe; if I am correct, and D=0, then I take the correct action, while if I am incorrect and D later proves greater than zero, then that is a pleasant surprise. This guards against getting an extremely unpleasant surprise if it later turns out that D is substantially lower than the most likely estimate; utilions in the future are more likely to go up than down.
Ignore the cost, and simply take the option that saves the most lives, regardless of effort. This strategy actually reduces the cost slightly (as one does not need to expend the very slight cost of calculating the cost), and has the benefit of allowing immediate action. It is the option that I would prefer that everyone who is not me should take (because if other people take it, then I have a greater chance of getting my life saved at the cost of no effort on my part). I might choose this option out of a sense of fairness (if I wish other people to take this option, it is only reasonable to consider that other people may wish me to take it) or out of a sense of duty (saving lives is important).
In the current situation, that is, with 7 billion people known (i.e. A=7 billion), and a general assumption of D=0, and the threat of consequences (court, prison, etc.), very few. But there are still some who would kill one person for a cookie. And there are some who’d start a war—killing hundreds, or thousands, or even millions—given the right incentive (it generally takes a bit more than a cookie).
If there are 3^^^^^^^^^3 people known to exist, and court/prison is easily avoided, then how many people would kill off billions for a cookie? What if it’s billions who they’ve never met, and are never going to meet?
Frankly, if I try to imagine living in a world in which I am as confident that that many people exist as I am that 7 billion people exist today, I’m not sure I wouldn’t kill off billions for a cookie.
I mean, if I try to imagine living in a world where only 10,000 people exist, I conclude that I would be significantly more motivated to extend the lives of an arbitrary person (e.g., by preventing them from starving) than I am now. (Leaving aside any trauma related to the dieback itself.)
If a mere six orders of magnitude difference in population can reduce my motivation to extend an arbitrary life to that extent, it seems likely that another twenty or thirty orders of magnitude would reduce me to utter apathy when it comes to an arbitrary life. Add another ten orders of magnitude and utter apathy when it comes to a billion arbitrary lives seems plausible.
What if it’s billions who they’ve never met, and are never going to meet?
I presumed this. If it’s billions of friends instead, I no longer have any confidence in any statement about my preferences, because any system capable of having billions of friends is sufficiently different from me that I can’t meaningfully predict it. If it’s billions of people including a friend of mine, I suspect that my friend is worth about as much as they are in the 7billion-person world, + (billions-1) people who I’m apathetic about. I suspect I either get really confused at this point, or compartmentalize fiercely.
If it’s billions of people including a friend of mine, I suspect that my friend is worth about as much as they are in the 7billion-person world, + (billions-1) people who I’m apathetic about. I suspect I either get really confused at this point, or compartmentalize fiercely.
Thinking about this has caused me to realise that I already compartmentalise pretty fiercely. Some of the lines along which I compartmentalise are a little surprising when I investigate them closely… friend/non-friend is not the sharpest line of the lot.
One pretty sharp line is probably-trying-to-manipulate-me/probably-not-trying-to-manipulate-me. But I wouldn’t want to kill anyone on either side of that line (I wouldn’t even want to be rude to them without reason (though ‘he’s a telemarketer’ is reason for hanging up the phone on someone mid-sentence)). My brain seems to insist on lumping “have never met or interacted with, likely will never meet or interact with” in more-or-less the same category as “fictional”.
though ‘he’s a telemarketer’ is reason for hanging up the phone on someone mid-sentence
My brain seems to divide people among “playing characters” and “non-playing characters”, and telemarketers fall in the latter category. (The fact that my native language has a T-V distinction doesn’t help, though the distinction isn’t exactly the same.)
My brain seems to insist on lumping “have never met or interacted with, likely will never meet or interact with” in more-or-less the same category as “fictional”.
That sounds a lot more like some sort of scope insensitivity than a revealed preference.
That edit does make your meaning clearer. It does so by highlighting that my phrasing was sloppy, so let me try to explain myself better.
Let us say that I hear of someone being mugged. My emotional reaction changes as a function of my relationship to the victim. If the victim is a friend, I am concerned and rush to check that he is OK. If the victim is an acquaintance, I am concerned and check that he is OK the next time I see him. If the victim is someone whom I have never met or interacted with, and am unlikely to meet or interact with, I am mildly perturbed. If the victim is a fictional character, I am also mildly perturbed.
When considering only one person, those last two categories blur together in my mind somewhat.
If the victim is someone whom I have never met or interacted with, and am unlikely to meet or interact with, I am mildly perturbed. If the victim is a fictional character, I am also mildly perturbed.
If the victim is someone whom I have never met or interacted with, and am unlikely to meet or interact with, I shrug and think ‘so what? so many people get mugged every day, why should I worry about this one in particular?’ If it’s a fictional character, it depends on whether the author is good enough to switch me from far-mode to near-mode thinking.
Well, but this conflates differences in the object with differences in the framing. I certainly agree that an author can change how I feel about a fictional character, but an author can also change how I feel about a real person whom I have never met or interacted with, and am unlikely to meet or interact with.
If the victim is someone whom I have never met or interacted with, and am unlikely to meet or interact with, I shrug and think ‘so what? so many people get mugged every day, why should I worry about this one in particular?’
Am I the only person here who is in any way moved by accounts of specific victims? Nonfiction writers can switch you to near-mode too, or at least they can to me.
OK, so you care about detailed accounts. Doesn’t that suggest that if you, y’know, knew more details about all those people being mugged, you would care more? So it’s just ignorance that leads you to discount their suffering?
Fictional accounts … well, people never have been great at distinguishing between imagination and reality, which, if you think about it, is actually really useful.
Really? My System 2 thinks System 2 is annoyingly incapable of seeing details, and System 1 is annoyingly incapable of seeing the big picture, and wants to use System 1 as a sort of zoom function to approximate something less broken.
Like army1987, I can be moved by accounts of specific victims, whether they are fictional or not. There is a bug here, and the bug is this; that I am moved the same amount by an otherwise identical fictional or nonfictional account, where the nonfictional account contains no-one with whom I have ever interacted.
That is, simply knowing that an account is non-fictional doesn’t affect my emotional reaction, one way or another. (This doesn’t mean I am entirely without sympathy for people I have never met—it simply means that I have equivalent sympathy for fictional characters). This is a bug; ideally, my emotional reaction should take into account such an important detail as whether or not something really happened. After all, what detail could be more important?
It’s not a bug, it’s a feature (in some contexts).
Consider you were playing 2 games of online chess against an anonymous opponent. You barely lose the first one. Now you’re feeling the spirit of competition, your blood boiling for revenge! Should you force yourself to relinquish the thrill of the contest, because “it doesn’t really matter”? That would be no fun! :-(
If you’re reading a work of fiction, knowing it is fiction, why are you doing so? Because emotional investment is fun? Why would you then sabotage your enjoyment by trying to downsize your emotional investment, since “it’s not real”? Also no fun! :-(
If the flawed heuristic you are employing in a certain context works in your favor in that context, switching it off would be dumb (although being vaguely aware of it would not be).
I’m not sure I’d characterize that as a “bug”, more a feature we need to be aware of and take into account.
If you weren’t moved by fictional scenarios, you wouldn’t be able to empathize with people in those scenarios—including your future self! We mostly predict other people’s actions by using our own brain as a black box, imagining ourselves in their situation and how we would react, so there goes any situation featuring other humans. And we couldn’t daydream or enjoy fiction, either.
Would it be useful to turn it off? Maaaybe, but as long as you don’t start taking hypothetical people’s wishes into account, and stop reading stuff that triggers you, you’re fine—I bet the consequences for misuse would be higher than the marginal benefits.
I don’t think that empathising with fictional characters should be turned off. I just think that properly calibrated emotions should take all factors into account, with properly relevant weightings. I notice that my emotions do not seem to be taking the ‘reality’ factor into account, and I therefore conclude that my emotions are poorly calibrated.
My future self would be a potentially real scenario, and thus would deserve all the emotional investment appropriate for a situation that may well come to pass. (He also gets the emotional investment for being me, which is quite large).
I’m not sure whether I should be feeling more sympathy for strangers, or less sympathy for fictional people.
So … are you saying that they’re poorly calibrated, but that’s fine and nothing to worry about as long as we don’t forget it and start giving imaginary people moral weight? Because if so, I agree with you on this.
More or less. I’m also saying that it might be nice if they were better calibrated. It’s not urgent or particularly important, it’s just something about myself that I noticed at the start of this discussion that I hadn’t noticed before.
That edit does make your meaning clearer. It does so by highlighting that my phrasing was sloppy, so let me try to explain myself better.
Fair enough.
If the victim is someone whom I have never met or interacted with, and am unlikely to meet or interact with, I am mildly perturbed. If the victim is a fictional character, I am also mildly perturbed.
That depends on how much you know about/empathize with them, right?
That depends on how much you know about/empathize with them, right?
Yes; but I can know as much about a fictional character as about a non-fictional character whom I have not interacted with. The dependency has nothing to do with the fictionality or lack thereof of the character.
Right, hence me quoting both the section on fictional and non-fictional characters.
To be honest, our brains don’t really seem to distinguish between fiction and non-fiction at all; it’s merely a question of context. Hence our reactions to fictional evidence and so forth. Lotta awkward biases you can catch from that what with our tendency to “buy in” to compelling narratives.
It’s not a bias if you value an additional dollar less once all your needs are met.
It’s not a bias if you value a random human life less if there are billions of others, compared to if there are only a few others.
You may choose for yourself to value a $10 bill the same whether you’re dirt poor, or a millionaire. Same with human lives. But you don’t get to “that’s a bias” others who have a more nuanced and context-sensitive estimation.
Except that humans actually have a bias called scope insensitivity and that’s a known thing, and it behaves differently to any claimed bounded utility function we might have.
Add another ten orders of magnitude and utter apathy when it comes to a billion arbitrary lives seems plausible.
A billion is nine orders of magnitude. As a very rough estimate, then, adding an order of magnitude to the number of lives in existence divides the motivation to extend an arbitrary stranger’s life by an order of magnitude. And the same for any other multiplier.
That is, if G is chosen such that f(x)-f(x-1)=G, then f(Mx)-f(Mx-1)=G/M for any given x and any multiplier M. If I then define my hedons such that f(0)=0 and f(1)=1...
For 10,000 people, on this entirely arbitrary (and extremely large) scale, I get a value f(x) between 9 and 10; for seven billion, f(x) lies between 23 and 24 (source)
Hm. Yes, to the level of approximation I’m using here, I could as easily have used a log function. And would have, if I’d thought of it; the log function is used enough that I’d expect its properties to be easier for whoever reads my post to imagine.
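For anyone who wants to check the figures, here is a minimal sketch. It reads the rule above as f(x)-f(x-1)=1/x with f(0)=0 and f(1)=1, which makes f(x) the harmonic series 1 + 1/2 + ... + 1/x; that reading is my assumption about what was intended. The names `f_exact` and `f_approx` are hypothetical, and the second uses the standard approximation ln(x) + γ + 1/(2x), where γ is the Euler-Mascheroni constant.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant

def f_exact(n):
    """f(n) = 1 + 1/2 + ... + 1/n, so f(0) = 0, f(1) = 1 and f(x) - f(x-1) = 1/x."""
    return math.fsum(1.0 / k for k in range(1, n + 1))

def f_approx(n):
    """Asymptotic approximation ln(n) + gamma + 1/(2n), usable for very large n."""
    return math.log(n) + EULER_GAMMA + 1.0 / (2 * n)

print(f_exact(10_000))           # direct summation is cheap at this size
print(f_approx(10_000))          # should agree closely with the line above
print(f_approx(7_000_000_000))   # summing seven billion terms would be slow
```

The first value lands between 9 and 10 and the last between 23 and 24, matching the numbers quoted. The approximation also shows why a log function would have served just as well: at these sizes f(x) is ln(x) plus a near-constant offset.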
I mean, if I try to imagine living in a world where only 10,000 people exist, I conclude that I would be significantly more motivated to extend the lives of an arbitrary person (e.g., by preventing them from starving) than I am now. (Leaving aside any trauma related to the dieback itself.)
Well, if the population is that low saving people is guarding against an existential risk, so I would feel the same. Does your introspection yield anything on why smaller numbers matter more?
ETA: your brain can’t grasp numbers anywhere near as high as a billion. How sure are you murder matters now?
It’s pretty clear that individual murder doesn’t matter to me.
I mean, someone was murdered just now, as I write this sentence, and I care about that significantly less than I care about the quality of my coffee. I mean, I just spent five seconds adjusting the quality of my coffee, which is at least a noticeable quantity of effort if not a significant one. I can’t say the same about that anonymous murder.
Oh look, there goes another one. (Yawn.)
The metric I was using was not “caring whether someone is murdered”, which it’s clear I really don’t, but rather “being willing to murder someone,” which it’s relatively clear that I do, but not nearly as much as I could. (Insert typical spiel here about near/far mode, etc.)
I think the resolution to that is that you don’t have to have an immediate emotional reaction to care about it. There are lots of good and bad things happening in the world right now, but trying to feel all of them would be pointless, and a bad fit for our mental architecture. But we can still care, I think.
Well, I certainly agree that I don’t have to have an emotional reaction to each event, or indeed a reaction to the event at all, in order to be motivated to build systems that handle events in that class in different ways. I’m content to use the word “care” to refer to such motivation, either as well as or instead of referring to such emotional reactions. Ditto for “matters” in questions like “does murder matter”, in which case my answer to the above would change, but that certainly isn’t how I understood MugaSofer’s question.
So the question now is: if you could prevent someone you would most likely never otherwise interact with from being murdered, but that would make your coffee taste worse, what would you do?
Bah! Listen, Eliezer, I’m tired of all your meta-hipsterism!
“Hey, let’s get some ethics at Starbucks” “Nah, it’s low-quality; I only buy a really obscure brand of ethics you’ve probably never heard of called MIRI”. “Hey man, you don’t look in good health, maybe you should see a doctor” “Nah, I like a really obscure form of healthcare, I bet you’re not signed up for it, it’s called ‘cryonics’; it’s the cool thing to do”. “I think I like you, let’s date” “Oh, I’m afraid I only date polyamorists; you’re just too square”. “Oh man, I just realized I committed hindsight bias the other day!” “I disagree, it’s really the more obscure backfire effect which just got published a year or two ago.” “Yo, check out this thing I did with statistics” “That’s cool. Did you use Bayesian techniques?”
Man, forget you!
/angrily sips his obscure mail-order loose tea, a kind of oolong you’ve never heard of (Formosa vintage tie-guan-yin)
This comment has been brought to you by me switching from Dvorak to Colemak.
I’m always amazed that people advocate Dvorak. If you are going to diverge from the herd and be a munchkin why do a half-assed job of it? Sure, if you already know Dvorak it isn’t worth switching but if you are switching from Qwerty anyway then Colemak (or at least Capewell) is better than Dvorak in all the ways that Dvorak is better than Qwerty.
If you can’t pick something non-average to meet your optimization criteria, you can’t optimize above the average.
But at the same time, there’s only so many possible low-hanging fruits etc, and at some level of finding more fruits, that indicates you aren’t optimizing at all...
(Had to google “backfire effect” to find out whether you had made it up on the spot.)
EDIT: Looks like I had already heard of that effect, and I even seem to recall E.T. Jaynes giving a theoretical explanation of it, but I didn’t remember whether it had a name.
Had to google “backfire effect” to find out whether you had made it up on the spot.
“Like I said, it’s a really obscure bias, you’ve probably never heard of it.”
I even seem to recall E.T. Jaynes giving a theoretical explanation of it
Really? I don’t remember ever seeing anything like that (although I haven’t read all of PT:TLoS yet). Maybe you’re conflating it with the thesis using Bayesian methods I link in http://www.gwern.net/backfire-effect ?
BTW, for some reason, certain “fair trade” products at my supermarket are astoundingly cheap (as in, I’ve bought very similar but non-“fair trade” stuff for more); I notice that I’m confused.
This comment was written under the misapprehension that Dave was speaking normatively.
It’s pretty clear that individual murder doesn’t matter to me.
I mean, someone was murdered just now, as I write this sentence, and I care about that significantly less than I care about the quality of my coffee. I mean, I just spent five seconds adjusting the quality of my coffee, which is at least a noticeable quantity of effort if not a significant one. I can’t say the same about that anonymous murder.
Oh look, there goes another one. (Yawn.)
I always attributed that to the abstract nature of the knowledge. I mean, if you knew anything about the person, you’d care a lot more, which suggests the relevant factor is ignorance, and that’s a property of the map, not the territory.
The metric I was using was not “caring whether someone is murdered”, which it’s clear I really don’t, but rather “being willing to murder someone,” which it’s relatively clear that I do, but not nearly as much as I could. (Insert typical spiel here about near/far mode, etc.)
So you’re saying your preferences on this matter are inconsistent?
Yes, I agree completely that what I’m talking about is an attribute of “the map.” (I could challenge whether it’s ignorance or something else, but the key point here is that I’m discussing motivational psychology, and I agree.)
So you’re saying your preferences on this matter are inconsistent?
Well, that wasn’t my point, and I’m not quite sure how it follows from what I said, but I would certainly agree that my revealed preferences are both inconsistent with each other and inconsistent with my stated preferences (which are themselves inconsistent with each other).
I would certainly agree that my revealed preferences are both inconsistent with each other and inconsistent with my stated preferences (which are themselves inconsistent with each other).
Right. This is why I don’t use “revealed preferences” to derive ethics, personally.
And neither do you, I’m such an idiot.
That said.
Here’s a scenario:
Humanity has spread throughout the stars and come into its manifest destiny, yada yada. There are really ridiculous amounts of people. Trillions in every star system, and there are a lot of star systems. We all know this future.
Alas! Some aliens dislike this! They plan to follow you to a newly-settled planet—around a billion colonists. Then they will colonize the planet themselves, and live peacefully building stacks of pebbles or whatever valueless thing aliens do. These aliens are a hive mind, so they don’t count as people.
However! You could use your tracking beacon—of some sentimental value to you, it was a present from your dear old grandmother or something—to trick the aliens into attacking and settling on an automated mining world, without killing a single human.
I assume you would be willing to do it to save, say, a small country on modern-day Earth, although maybe I’m projecting here? Everything is certain, because revealed preferences suck at probability math.
Reorienting my understanding of this discussion to be, as you say, normative: yes, when offered a choice between destroying a sentimental but not otherwise valuable item and killing a billion humans, I endorse destroying the item, no matter how many other humans there are in the world.
I even endorse it if everything is uncertain, with the usual expected-value calculation.
That said, as is often true of hypothetical questions, I don’t quite agree that the example you describe quite maps to that choice, but I think it was meant to. If I really think about the example, it’s more complicated than that. If I missed the intended point of the example, let me know and I’ll try again.
Reorienting my understanding of this discussion to be, as you say, normative: yes, when offered a choice between destroying a sentimental but not otherwise valuable item and killing a billion humans, I endorse destroying the item, no matter how many other humans there are in the world.
I even endorse it if everything is uncertain, with the usual expected-value calculation.
Glad to hear it. Sorry about that misunderstanding.
That said, as is often true of hypothetical questions, I don’t quite agree that the example you describe quite maps to that choice, but I think it was meant to. If I really think about the example, it’s more complicated than that.
Curses. I knew I should have gone with the rogue nanotech.
If I missed the intended point of the example, let me know and I’ll try again.
I have to admit, that was sloppily phrased. However, you do seem to be defining “OK” as equivalent to “actively good” whereas I’m using something more like “acceptable”.
Well, I’d accept strictly neutral (neither actively evil nor actively good) as OK as well. It seems that your definition of OK includes the possibility of active evil, as long as the amount of active evil is below a certain threshold.
It seems that we’re in agreement here; whether or not it is “OK” is defined by the definitions we are assigning to OK, and not to any part of the model under consideration.
The threshold being whether I can be bothered to stop it. As I said, it was sloppy terminology—I should have said something like “worth less than the effort of telling someone to stop” or some other minuscule cost you would be unwilling to pay. Since any intervention, in real life, has a cost, albeit sometimes a small one, this seems like an important distinction.
The topics of existential risk, AI, and other future technologies inherently require the use of very large numbers, far beyond any of those encountered when discussing normal, everyday risks and rewards.
Note that the large number used in this particular back-of-envelope calculation is the world population of several billion, not the still much larger numbers involved in astronomical waste.
Even if this is so, there is tons of evidence that humans suck at reasoning about such large numbers. If you want to make an extraordinary claim like the one you made above, then you need to put forth a large amount of evidence to support it. And on such a far-mode topic, the likelihood of your argument being correct decreases exponentially with the number of steps in the inferential chain.
I only skimmed through the video, but assuming that the estimates at 11:36 are what you’re referring to, those numbers are both seemingly quite high and entirely unjustified in the presentation. It also overlooks things like the fact that utility doesn’t scale linearly in number of lives saved when calculating the benefit per dollar.
Whether or not those numbers are correct, presenting them in their current form seems unlikely to be very productive. Likely either the person you are talking to already agrees, or the 8 lives figure triggers an absurdity heuristic that will demand large amounts of evidence. Heck, I’m already pretty familiar with the arguments, and I still get a small amount of negative affect whenever someone tries to make the “donating to X-risk has expected utility” argument.
I don’t think anyone on LW disagrees that reducing xrisk substantially carries an extremely high utility. The points of disagreement are over whether SIAI can non-trivially reduce xrisk, and whether they are the most effective way to do so. At least on this website, this seems like the more productive path of discussion.
Keep in mind that estimation is the best we have. You can’t appeal to Nature for not having been given a warning that meets a sufficient standard of rigor. Avoiding all actions of uncertain character dealing with huge consequences is certainly a bad strategy. Any one of such actions might have a big chance of not working out, but not taking any of them is guaranteed to be unhelpful.
From a Bayesian point of view, your prior should place low probability on a figure like “8 lives per dollar”. Therefore, lots of evidence is required to overcome that prior.
From a decision-theoretic point of view, the general strategy of believing sketchy (with no offense intended to Anna; I look forward to reading the paper when it is written) arguments that reach extreme conclusions at the end is a bad strategy. There would have to be a reason why this argument was somehow different from all other arguments of this form.
If there were tons of actions lying around with similarly huge potential positive consequences, then I would be first in line to take them (for exactly the reason you gave). As it stands, it seems like in reality I get a one-time chance to reduce p(bad singularity) by some small amount. More explicitly, it seems like SIAI’s research program reduces xrisk by some small amount, and a handful of other programs would also reduce xrisk by some small amount. There is no combined set of programs that cumulatively reduces xrisk by some large amount (say > 3% to be explicit).
I have to admit that I’m a little bit confused about how to reason here. The issue is that any action I can personally take will only decrease xrisk by some small amount anyways. But to me the situation feels different if society can collectively decrease xrisk by some large amount, versus if even collectively we can only decrease it by some small amount. My current estimate is that we are in the latter case, not the former—even if xrisk research had unlimited funding, we could only decrease total xrisk by something like 1%. My intuitions here are further complicated by the fact that I also think humans are very bad at estimating small probabilities—so the 1% figure could very easily be a gross overestimate, whereas I think a 5% figure is starting to get into the range where humans are a bit better at estimating, and is less likely to be such a bad overestimate.
My prior contains no such provisions; there are many possible worlds where tiny applications of resources have apparently disproportionate effect, and from the outside they don’t look so unlikely to me.
There are good reasons to be suspicious of claims of unusual effectiveness, but I recommend making that reasoning explicit and seeing what it says about this situation and how strongly.
There are also good reasons to be suspicious of arguments involving tiny probabilities, but keep in mind: first, you probably aren’t 97% confident that we have so little control over the future (I’ve thought about it a lot and am much more optimistic), and second, that even in a pessimistic scenario it is clearly worth thinking seriously about how to handle this sort of uncertainty, because there is quite a lot to gain.
Of course this isn’t an argument that you should support the SIAI in particular (though it may be worth doing some information-gathering to understand what they are currently doing), but that you should continue to optimize in good faith.
Can you clarify what you mean by this?
Only that you consider the arguments you have advanced in good faith, as a difficulty and a piece of evidence rather than potential excuses.
I’m glad you agree.
I’d be very appreciative to hear if you know of someone doing more.
Well for instance, certain approaches to AGI are more likely to lead to something friendly than other approaches are. If you believe that approach A is 1% less likely to lead to a bad outcome than approach B, then funding research in approach A is already compelling.
In my mind, a well-reasoned statistical approach with good software engineering methodologies is the mainstream approach that is least likely to lead to a bad outcome. It has the advantage that there is already a large amount of related research being done, hence there is actually a reasonable chance that such an AGI would be the first to be implemented. My personal estimate is that such an approach carries about 10% less risk than an alternative approach where the statistics and software are both hacked together.
In contrast, I estimate that SIAI’s FAI approach would carry about 90% less risk if implemented than a hacked-together AGI. However, I assign very low probability to SIAI’s current approach succeeding in time. I therefore consider the above-mentioned approach more effective.
Another alternative to SIAI that doesn’t require estimates about any specific research program would be to fund the creation of high-status AI researchers who care about Friendliness. Then they are free to steer the field as a whole towards whatever direction is determined to carry the least risk, after we have the chance to do further research to determine that direction.
I don’t understand what you mean by “10% less risk”. Do you think any given project using “a well-reasoned statistical approach with good software engineering methodologies” has at least a 10% chance of leading to a positive Singularity? Or that each such project has a P*0.9 probability of causing an existential disaster, where P is the probability of disaster of a “hacked together” project? Or something else?
Sorry for the ambiguity. I meant P*0.9.
You said “I therefore consider the above-mentioned approach more effective.”, but if all you’re claiming is that the above mentioned approach (“a well-reasoned statistical approach with good software engineering methodologies”) has a P*0.9 probability of causing an existential disaster, and not claiming that it has a significant chance of causing a positive Singularity, then why do you think funding such projects is effective for reducing existential risk? Is the idea that each such project would displace a “hacked together” project that would otherwise be started?
EDIT: I originally misinterpreted your post slightly, and corrected my reply accordingly.
Not quite. The hope is that such a project will succeed before any other hacked-together project succeeds. More broadly, the hope is that partial successes using principled methodologies will lead to those methodologies being more widely adopted in the AI community as a whole, and more to the point that a contingent of highly successful AI researchers advocating Friendliness can change the overall mindset of the field.
The default is a hacked-together AI project. SIAI’s FAI research is trying to displace this, but I don’t think they will succeed (my information on this is purely outside-view, however).
An explicit instantiation of some of my calculations:
SIAI approach: 0.1% chance of replacing P with 0.1P
Approach that integrates with the rest of the AI community: 30% chance of replacing P with 0.9P
In the first case, P is basically staying constant (about 0.9991P in expectation); in the second case it is being replaced with 0.97P in expectation.
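For anyone who wants to check the arithmetic, here is a minimal Python sketch of the calculation above. The function name `expected_risk_multiplier` is mine and purely illustrative; the probabilities are the ones stated in this comment, and the baseline risk P is normalized to 1.

```python
def expected_risk_multiplier(p_success, multiplier_if_success):
    """Expected factor applied to the baseline risk P.

    With probability p_success the risk becomes multiplier_if_success * P;
    otherwise it stays at P (multiplier 1).
    """
    return p_success * multiplier_if_success + (1 - p_success) * 1.0


# Figures taken from the comment above; P itself is normalized to 1.
siai = expected_risk_multiplier(0.001, 0.1)       # 0.1% chance of 0.1P
mainstream = expected_risk_multiplier(0.30, 0.9)  # 30% chance of 0.9P

print(f"SIAI approach:       {siai:.4f} P")
print(f"Integrated approach: {mainstream:.4f} P")
```

This only reproduces the stated estimates; it says nothing about whether the 0.1% and 30% inputs are themselves reasonable.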
I noticed you didn’t name anybody. Did you have specific programs or people in mind?
We already seem to (roughly) agree on probabilities.
The only specific plan I have right now is to put myself in a position to hire smart people to work on this problem. I think the most robust way to do this is to get a faculty position somewhere, but I need to consider the higher relative efficiency of corporations over universities some more to figure out if it’s worthwhile to go with the higher-volatility route of industry.
Also, as Paul notes, I need to consider other approaches to x-risk reduction as well to see if I can do better than my current plan. The main argument in favor of my current plan is that there is a clear path to the goal, with only modest technical hurdles and no major social hurdles. I don’t particularly like plans that start to get fuzzier than that, but I am willing to be convinced that this is irrational.
EDIT: To be more explicit, my current goal is to become one of said high-status AI researchers. I am worried that this is slightly self-serving, although I think I have good reason to believe that I have a comparative advantage at this task.
You know, I think somebody already thought of this. What was their name again...?
That seems more of an alternative within SIAI than an alternative to SIAI. With more funding, their Associate Research Program can promote the importance of Friendliness and increase the status of researchers who care about it.
Over the coming months I’m going to be doing an investigation of the non-profits affiliated with the Nuclear Threat Initiative with a view toward finding x-risk reduction charities other than SIAI & FHI. I’ll report back what I learn but it may be a while.
I’m under the impression that nuclear war doesn’t pose an existential risk. Do you disagree? If so, I probably ought to make a discussion post on the subject so we don’t take this one too far off topic.
My impression is that the risk of immediate extinction due to nuclear war is very small but that a nuclear war could cripple civilization to the point of not being able to recover enough to effect a positive singularity; it would also plausibly increase other x-risks. Intuitively, nuclear war would destabilize society, and people are less likely to take safety precautions in an unstable society when developing advanced technologies than they otherwise would be. I’d give a subjective estimate of 0.1%-1% of nuclear war preventing a positive singularity.
Do you mean:
The probability of PS given NW is .1-1% lower than the probability of PS given not-NW
The probability of PS is .1-1% lower than the probability of PS given not-NW
The probability of PS is 99-99.9% times the probability of PS given not-NW
etc?
Good question. My intended meaning was the second of the meanings that you listed: “the probability of a positive singularity is 0.1%-1% lower than the probability of a positive singularity given no nuclear war.” I would be interested to hear any thoughts that you have about these things.
I can’t think of a mechanism through which recovery would become long-term impossible, but maybe there is one. People taking fewer safety precautions in a destabilized society does sound plausible. There are probably a number of other, similarly important effects of nuclear war on existential risk to take into account. Different technologies (IA, uploading, AGI, Friendliness philosophy) have different motivations behind them that would probably be differently affected by a nuclear war. Memes would have more time to come closer to some sort of equilibrium in various relevant groups. To the extent that there are nontrivial existential risks not depending on future technology, they would have more time to strike. Catastrophes would be more psychologically salient, or maybe the idea of future nuclear war would overshadow other kinds of catastrophe. Power would be more in the hands of those who weren’t involved in the nuclear war.
In any case, the effect of nuclear war on existential risk seems like a nontrivial question that we’d have to have a better idea about before we could decide that resources are better spent on nuclear war prevention than something else. To make things more complicated, it’s possible that preventing nuclear war would on average decrease existential risk but that a specific measure to prevent nuclear war would increase existential risk (or vice versa), because the specific kinds of nuclear war that the measure prevents are atypical.
The number and strength of reasons we see one way or the other may depend more on time people have spent searching specifically for reasons for/against than on what reasons exist. The main reason to expect an imbalance there is that nuclear war causes huge amounts of death and suffering, and so people will be motivated to rationalize that it will also be a bad thing according to this mostly independent criterion of existential risk minimization; or people may overcorrect for that effect or have other biases for thinking nuclear war would prevent existential risk. To the extent that our misgivings about failing to do enough to stop nuclear war have to do with worries that existential risk reduction may not outweigh huge present death and suffering, we’d do better to acknowledge those worries than to rationalize ourselves into thinking there’s never a conflict.
Without knowing anything about specific risk mitigation proposals, I would guess that there’s even more expected return from looking into weird, hard-to-think-about technologies like MNT than from looking into nuclear war, because less of the low-hanging fruit there would already have been picked. But more specific information could easily overrule that presumption, and some people within SingInst seem to have pretty high estimates of the return from efforts to prevent nuclear war, so who knows.
Thanks for your thoughtful comment.
I have little idea of how likely it is but a nuclear winter could seriously hamper human mobility.
Widespread radiation would further hamper human mobility.
Redeveloping preexisting infrastructure could require natural resources on an order of magnitude comparable to the infrastructure that we have today. Right now we have the efficient market hypothesis to help out with natural resource shortage, but upsetting the trajectory of our development could exacerbate the problem.
Note that a probability of 0.1% isn’t so large (even taking into account all of the other things that could interfere with a positive singularity).
Reasoning productively about the expected value of these things presently seems to me to be too difficult (but I’m open to changing my mind if you have ideas).
With the exception of natural resource shortage (which I mentioned above) I doubt that this is within an order of magnitude of significance of other relevant factors provided that we’re talking about a delay on the order of fewer than 100 years (maybe similarly for a delay of 1000 years; I would have to think about it).
Similarly, I doubt that this would be game-changing.
These seem worthy of further contemplation—is the development of future technologies more likely to go in Australia than in the current major powers, etc.
This seems reasonable. As I mentioned, I presently attach high expected x-risk reduction to nuclear war prevention, but my confidence is sufficiently unstable at present that the value of devoting resources to gathering more information outweighs the value of donating to nuclear war reduction charities.
Yes. In the course of researching nuclear threat reduction charities I hope to learn what options are on the table.
On the other hand there may not be low hanging fruit attached to thinking about weird, hard-to-think-about technologies like MNT. I do however plan on looking into the Foresight Institute.
Thanks for clarifying and I hope your research goes well. If I’m not mistaken, you can see the 0.1% calculation as the product of three things: the probability nuclear war happens, the probability that if it happens it’s such that it prevents any future positive singularities that otherwise would have happened, and the probability a positive singularity would otherwise have happened. If the first and third probabilities are, say, 1⁄5 and 1⁄4, then the answer will be 1⁄20 of the middle probability, so your 0.1%-1% answer corresponds to a 2%-20% chance that if a nuclear war happens then it’s such that it prevents any future positive singularities that would otherwise have happened. Certainly the lower end and maybe the upper end of that range seem like they could plausibly end up being close to our best estimate. But note that you have to look at the net effect after taking into account effects in both directions; I would still put substantial probability on this estimate ending up effectively negative, also. (Probabilities can’t really go negative, so the interpretation I gave above doesn’t really work, but I hope you can see what I mean.)
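The back-calculation in that comment can be checked with a short Python sketch. The 1/5 and 1/4 inputs are the illustrative values from the comment, not estimates anyone has defended, and the function name `implied_middle_probability` is made up for this example.

```python
from fractions import Fraction


def implied_middle_probability(net_effect, p_war, p_ps_otherwise):
    """Solve net_effect = p_war * p_middle * p_ps_otherwise for p_middle.

    p_middle is the chance that a nuclear war, if it happens, prevents a
    positive singularity that would otherwise have occurred.
    """
    return net_effect / (p_war * p_ps_otherwise)


p_war = Fraction(1, 5)  # illustrative value from the comment
p_ps = Fraction(1, 4)   # illustrative value from the comment

low = implied_middle_probability(Fraction(1, 1000), p_war, p_ps)  # 0.1% end
high = implied_middle_probability(Fraction(1, 100), p_war, p_ps)  # 1% end

print(float(low), float(high))  # 0.02 0.2
```

So with those inputs the 0.1%-1% range does correspond to 2%-20% for the middle probability.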
I agree and should have been more explicit in taking this into account. However, note that if one assigns a 2:1 odds ratio for (0.1%-1% decrease in x-risk)/(same size increase in x-risk) then the expected value of preventing nuclear war doesn’t drop below 1⁄3 of what it would be if there wasn’t the possibility of nuclear war increasing x-risk: still on the same rough order of magnitude.
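A quick Python sketch of the 2:1 claim, under the simplifying assumption (mine, for illustration) that the increase and the decrease in x-risk have the same magnitude and that those are the only two outcomes. The helper name `relative_expected_value` is hypothetical.

```python
from fractions import Fraction


def relative_expected_value(odds_decrease, odds_increase):
    """Expected value of prevention as a fraction of the no-downside case.

    Assumes the x-risk decrease and increase are the same size, so the
    outcomes are +1 (decrease) and -1 (increase) in units of that size.
    """
    total = odds_decrease + odds_increase
    p_decrease = Fraction(odds_decrease, total)
    p_increase = Fraction(odds_increase, total)
    return p_decrease - p_increase


ratio = relative_expected_value(2, 1)
print(ratio)  # 1/3
```

At even odds the same function returns 0, which is the point at which the argument for prevention on x-risk grounds would disappear.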
Thanks for the clarification on the estimate. Unhappy as it makes me to say it, I suspect that nuclear war or other non-existential catastrophe would overall reduce existential risk, because we’d have more time to think about existential risk mitigation while we rebuild society. However I suspect that trying to bring nuclear war about as a result of this reasoning is not a winning strategy.
Building society the first time around, we were able to take advantage of various useful natural resources such as relatively plentiful coal and (later) oil. After a nuclear war or some other civilization-wrecking catastrophe, it might be Very Difficult Indeed to rebuild without those resources at our disposal. It’s difficult enough even now, with everything basically still working nicely, to see how to wean ourselves off fossil fuels, as for various reasons many people think we should do. Now imagine trying to build a nuclear power industry or highly efficient solar cells with our existing energy infrastructure in ruins.
So it looks to me as if (1) our best prospects for long-term x-risk avoidance all involve advanced technology (space travel, AI, nanothingies, …) and (2) a major not-immediately-existential catastrophe could seriously jeopardize our prospects of ever developing such technology, so (3) such a catastrophe should be regarded as a big increase in x-risk.
I’ve heard arguments for and against “it might turn out to be too hard the second time around”. I think overall that it’s more likely than not that we would eventually succeed in rebuilding a technological society, but that’s the strongest I could put it, ie it’s very plausible that we would never do so.
If enough of our existing thinking survives, the thinking time that rebuilding civilization would give us might move things a little in our favour WRT AI++, MNT etc. I don’t know which side does better on this tradeoff. However I seriously doubt that trying to bring about the collapse of civilization is the most efficient way to mitigate existential risk.
Also, and I hate to be this selfish about it but there it is, if civilization ends I definitely die either way, and I’d kind of prefer not to.
We have a huge mountain of coal, and will do for the next hundred years or so. Doing without doesn’t seem very likely.
How easily accessible is that coal to people whose civilization has collapsed, taking most of the industrial machinery with it? (That’s a genuine question. Naively, it seems like the easiest-to-get-at bits would have been mined out first, leaving the harder bits. How much harder they are, and how big a problem that would be, I have no idea.)
It’s probably fair to say that some of the low hanging fossil fuel fruit have been taken.
Technical challenges? Difficulty in coordinating? Are there other candidate setbacks?
It may be highly unproductive to think about advanced future technologies in very much detail before there’s a credible research program on the table, on account of the search tree involving dozens of orders of magnitude. I presently believe this to be the case.
I do think that we can get better at some relevant things at present (learning how to obtain as accurate as realistically possible predictions about probable government behaviors, etc.) and that all else being equal we could benefit from more time thinking about these things rather than less time.
However, it’s not clear to me that the time so gained would outweigh a presumed loss in clear thinking post-nuclear war and I currently believe that the loss would be substantially greater than the gain.
As steven0461 mentioned, “some people within SingInst seem to have pretty high estimates of the return from efforts to prevent nuclear war.” I haven’t had a chance to talk about this with them in detail, but it updates me in the direction of attaching high expected value to nuclear war risk reduction.
My positions on these points are very much subject to change with incoming information.
How much detail is too much?
A more likely result: the religious crazies will take over, and they either don’t think existential risk can exist (because God would prevent it) or they think preventing existential risk would be blasphemy (because God ought to be allowed to destroy us). Or they even actively work to make it happen and bring about God’s judgment.
And then humanity dies, because both denying and embracing existential risk causes it to come nearer.
Woah, woah! What! Since when?
Unless you mean “scope insensitivity”?
Well, sure, the absurdity heuristic is terrible.
Why would it scale linearly? I agree that it scales linearly over relatively small regimes (on the order of millions of lives) by fungibility, but I see no reason why that needs to be true for trillions of lives or more (and at least some reasons why it can’t scale linearly forever).
Re-read the context of what I wrote. Whether or not the absurdity heuristic is a good heuristic, it is one that is fairly common among humans, so if your goal is to have a productive conversation with someone who doesn’t already agree with you, you shouldn’t throw out such an ambitious figure without a solid argument. You can almost certainly make whatever point you want to make with more conservative numbers.
Let’s say you currently have a trillion utility-producing thingies (call them humans, if it helps). You’re pretty happy. In fact, you have so many that the utility of more is negligible.
Then Doctor Evil appears! He has five people hostage, he’s holding them to ransom!
His ransom: kill off six of the people you already have.
Since those trillion people’s value didn’t scale linearly, reducing them by six isn’t nearly as important as five people!
Rinse. Repeat.
Well sure, if we’re talking Dark Arts...
This isn’t true—the choice is between N-6 and N-5 people; N-5 people is clearly better. Not to be too blunt, but I think you’ve badly misunderstood the concept of a utility function.
Actively making your argument objectionable is very different from avoiding the use of the Dark Arts. In fact, arguably it has the same problem that the Dark Arts have, which is that it causes someone to believe something (in this case, the negation of what you want to show) for reasons unrelated to the validity of the supporting argument.
Yes. The hypothetical utility function could e.g. take a list of items and then return the utility. It need not satisfy f(A,B)=f(A)+f(B) where “,” is list concatenation. For example, this would apply to the worth of books, where a library is more worthy than however many copies of some one book. To simply sum values of books considered independently is ridiculous; it’s like valuing books by weight. Information content of the brain, or whatever else it is that you might value (process?), is a fair bit more like a book than it’s like the weight of the books.
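A toy illustration of such a non-additive utility function, as a Python sketch. The rule used here (value = number of distinct titles) and the name `library_value` are assumptions chosen only to make the point concrete; nothing in this thread commits anyone to that particular function.

```python
def library_value(books):
    """Hypothetical utility: a collection is worth its number of distinct titles."""
    return len(set(books))


copies = ["Book A"] * 3                   # three copies of one book
library = ["Book A", "Book B", "Book C"]  # three different books

print(library_value(copies))           # 1
print(library_value(library))          # 3
print(library_value(copies + copies))  # 1, not 1 + 1
```

Under this rule, concatenating a list with itself adds nothing, so f(A,B)=f(A)+f(B) fails even though f never decreases when items are added.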
Sorry, I only meant to imply that I had assumed we were discussing rationality, given the low status of the “Dark Arts”. Not that there was anything wrong with such discussion; indeed, I’m all for it.
This doesn’t hold. Those extra five should be added onto the trillion you already have, not considered separately.
Value only needs to increase monotonically. Linearity is not required; it might even be asymptotic.
That depends on how you do the accounting here. If we check the utility provided by saving five people, it’s high. If we check the utility provided by increasing a population of a trillion, it’s unfathomably low.
This is, in fact, the point.
Intuitively, we should be able to meaningfully analyse the utility of a part without talking about—or even knowing—the utility of the whole. Discovering vast interstellar civilizations should not invalidate our calculations made on how to save the most lives.
Let us assume that we have A known people in existence. Dr. Evil presents us with B previously unknown people, and threatens to kill them unless we kill C out of our A known people (where C<A). The question is, whether it is ethically better to let B people die, or to let C people die. (It is clearly better to save all the people, if possible).
We have a utility function, f(x), which describes the utility produced by x people. Before Dr. Evil turns up, we have A known people, and a total utility of f(A). After Dr. Evil arrives, we find that there are more people; we have a total utility of f(A+B) (or f(A+B+1), if Dr. Evil was previously unknown; from here onwards I will assume that Dr. Evil was previously known, and is thus included in A). Dr. Evil offers us a choice, between a total utility of f(A+B-C) or a total utility of f(A).
The immediate answer is that if B>C, it is better for B people to live; while if C>B, then it is better for C people to live. For this to be true for all A, B and C, it is necessary for f(x) to be a monotonically increasing function; that is, a function where f(y)>f(x) if and only if y>x.
Now, you are raising the possibility that there exist a number, D, of people in vast interstellar civilisations who are completely unknown to us. Then Dr. Evil’s choice becomes a choice between a total utility of f(A+B-C+D) and a total utility of f(A+D). Again, as long as f(x) is monotonically increasing, the question of finding the greatest utility is simply a matter of seeing whether B>C or not.
I don’t see any cause for invalidating any of my calculations in the presence of vast interstellar civilisations.
It takes effort to pull the lever and divert the trolley. This minuscule amount has to be outweighed by the utility of additional lives. It gets even worse in real situations, where it may cost a great deal to help people.
Ah; now we begin to compare different things. To compare the effort of pulling the lever, against the utility of the additional lives. At this point, yes, the actual magnitude and not just the sign of the difference between f(A+B-C+D) and f(A+D) becomes important; yet D is unknown and unknowable. This means that the magnitude of the difference can only be known with certainty if f(x) is linear; in the case of a nonlinear f(x), the magnitude cannot be known with certainty. I can easily pick out a nonlinear, monotonically increasing function such that the difference between f(A+B-C+D) and f(A+D) can be made arbitrarily small for any positive integer A, B and C (where A+B>C) by simply selecting a suitable positive integer D. A simple example would be f(x)=sqrt(x).
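To make the f(x)=sqrt(x) example concrete, here is a small Python sketch. The values chosen for A, B, C and the candidate values of D are hypothetical and only serve to show the trend; `utility_gain` is a name invented for this example.

```python
import math


def f(x):
    """The example utility function from the comment: f(x) = sqrt(x)."""
    return math.sqrt(x)


def utility_gain(a, b, c, d):
    """Difference f(A+B-C+D) - f(A+D) between the two choices."""
    return f(a + b - c + d) - f(a + d)


# Hypothetical numbers: a million known people, 5 hostages, 3 to be killed.
A, B, C = 1_000_000, 5, 3
candidate_ds = (0, 10**6, 10**9, 10**12)

gains = [utility_gain(A, B, C, d) for d in candidate_ds]
for d, gain in zip(candidate_ds, gains):
    print(f"D = {d:>13}: gain = {gain:.3e}")
```

The gain stays positive (because B>C and f is increasing) but shrinks as D grows, which is exactly why a fixed cost of acting cannot be compared against it without some estimate of D.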
Now, the hypothetical moral agent is in a quandary. Using effort to pick a solution costs utilions. The cost is a simple, straightforward constant; he knows how much that costs. But, with f(x)=sqrt(x), without knowing D, he cannot tell whether the utilions of saving the people is greater or lesser than the utilion cost of picking a solution. (For the purpose of simplicity, I will assume that no-one will ever know that he was in a position to make the choice; that is, his reputation is safe, no matter what he selects.) Therefore, he has to make an estimate. He has to guess a value of D. There are multiple strategies that can be followed here:
Try to estimate the most probable value of D. This would require something along the lines of the Drake equation—picking the most likely numbers for the different elements, picking the most likely size of an extraterrestrial civilisation, and doing some multiplication.
Take the most pessimistic possible value of D; D=0. That is, plan as though I am in the worst possible universe; if I am correct, and D=0, then I take the correct action, while if I am incorrect and D later proves greater than zero, then that is a pleasant surprise. This guards against getting an extremely unpleasant surprise if it later turns out that D is substantially lower than the most likely estimate; utilions in the future are more likely to go up than down.
Ignore the cost, and simply take the option that saves the most lives, regardless of effort. This strategy actually reduces the cost slightly (as one does not need to expend the very slight cost of calculating the cost), and has the benefit of allowing immediate action. It is the option that I would prefer that everyone who is not me should take (because if other people take it, then I have a greater chance of getting my life saved at the cost of no effort on my part). I might choose this option out of a sense of fairness (if I wish other people to take this option, it is only reasonable to consider that other people may wish me to take it) or out of a sense of duty (saving lives is important).
More precisely, you take the expected value over your probability distribution for D, i.e. if
$\sum_{D} \bigl(f(A+B-C+D) - f(A+D)\bigr)\, p(D)$
exceeds the cost of pulling the lever then you pull it.
ETA: In case you’re wondering, I used this to display the equation.
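As a rough Python sketch of that decision rule: the distribution over D, the population figures, and the lever costs below are all made-up numbers for illustration, and the names `expected_gain` and `should_pull_lever` are hypothetical.

```python
import math


def expected_gain(f, a, b, c, p_d):
    """Sum over D of (f(A+B-C+D) - f(A+D)) * p(D).

    p_d maps each candidate value of D to its probability.
    """
    return sum((f(a + b - c + d) - f(a + d)) * p for d, p in p_d.items())


def should_pull_lever(f, a, b, c, p_d, cost):
    """Pull the lever only if the expected gain exceeds the cost of acting."""
    return expected_gain(f, a, b, c, p_d) > cost


# Hypothetical inputs: none of these numbers come from the discussion.
A, B, C = 1_000_000, 5, 3
p_d = {0: 0.5, 10**6: 0.3, 10**9: 0.2}  # probabilities sum to 1

print(should_pull_lever(math.sqrt, A, B, C, p_d, cost=1e-4))
print(should_pull_lever(math.sqrt, A, B, C, p_d, cost=1e-2))
print(expected_gain(lambda x: x, A, B, C, p_d))  # linear f: gain is B - C
```

With a linear f the answer does not depend on the distribution at all, which is the contrast the parent comment was drawing.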
Remember, we’re trying to approximate human morality here. How many people will kill off billions for a cookie?
In the current situation, that is, with 7 billion people known (i.e. A=7 billion), and a general assumption of D=0, and the threat of consequences (court, prison, etc.), very few. But there are still some who would kill one person for a cookie. And there are some who’d start a war—killing hundreds, or thousands, or even millions—given the right incentive (it generally takes a bit more than a cookie).
If there are 3^^^^^^^^^3 people known to exist, and court/prison is easily avoided, then how many people would kill off billions for a cookie? What if it’s billions who they’ve never met, and are never going to meet?
Frankly, if I try to imagine living in a world in which I am as confident that that many people exist as I am that 7 billion people exist today, I’m not sure I wouldn’t kill off billions for a cookie.
I mean, if I try to imagine living in a world where only 10,000 people exist, I conclude that I would be significantly more motivated to extend the life of an arbitrary person (e.g., by preventing them from starving) than I am now. (Leaving aside any trauma related to the dieback itself.)
If a mere six orders of magnitude difference in population can reduce my motivation to extend an arbitrary life to that extent, it seems likely that another twenty or thirty orders of magnitude would reduce me to utter apathy when it comes to an arbitrary life. Add another ten orders of magnitude and utter apathy when it comes to a billion arbitrary lives seems plausible.
I presumed this.
If it’s billions of friends instead, I no longer have any confidence in any statement about my preferences, because any system capable of having billions of friends is sufficiently different from me that I can’t meaningfully predict it.
If it’s billions of people including a friend of mine, I suspect that my friend is worth about as much as they are in the 7billion-person world, + (billions-1) people who I’m apathetic about. I suspect I either get really confused at this point, or compartmentalize fiercely.
Thinking about this has caused me to realise that I already compartmentalise pretty fiercely. Some of the lines along which I compartmentalise are a little surprising when I investigate them closely… friend/non-friend is not the sharpest line of the lot.
One pretty sharp line is probably-trying-to-manipulate-me/probably-not-trying-to-manipulate-me. But I wouldn’t want to kill anyone on either side of that line (I wouldn’t even want to be rude to them without reason (though ‘he’s a telemarketer’ is reason for hanging up the phone on someone mid-sentence)). My brain seems to insist on lumping “have never met or interacted with, likely will never meet or interact with” in more-or-less the same category as “fictional”.
My brain seems to divide people among “playing characters” and “non-playing characters”, and telemarketers fall in the latter category. (The fact that my native language has a T-V distinction doesn’t help, though the distinction isn’t exactly the same.)
That sounds a lot more like some sort of scope insensitivity than a revealed preference.
I don’t think it’s scope insensitivity in this particular case, because I’m considering one-on-one interactions in this compartmentalisation.
Of course, this particular case did come to my mind as a side-effect of a discussion on scope insensitivity.
Sorry, I was replying to the last bit. Edited.
Who the hell downvotes a clarification? Upvoted back to 0.
That edit does make your meaning clearer. It does so by highlighting that my phrasing was sloppy, so let me try to explain myself better.
Let us say that I hear of someone being mugged. My emotional reaction changes as a function of my relationship to the victim. If the victim is a friend, I am concerned and rush to check that he is OK. If the victim is an acquaintance, I am concerned and check that he is OK the next time I see him. If the victim is someone whom I have never met or interacted with, and am unlikely to meet or interact with, I am mildly perturbed. If the victim is a fictional character, I am also mildly perturbed.
When considering only one person, those last two categories blur together in my mind somewhat.
If the victim is someone whom I have never met or interacted with, and am unlikely to meet or interact with, I shrug and think ‘so what? so many people get mugged every day, why should I worry about this one in particular?’ If it’s a fictional character, it depends on whether the author is good enough to switch me from far-mode to near-mode thinking.
Well, but this conflates differences in the object with differences in the framing. I certainly agree that an author can change how I feel about a fictional character, but an author can also change how I feel about a real person whom I have never met or interacted with, and am unlikely to meet or interact with.
Am I the only person here who is in any way moved by accounts of specific victims? Nonfiction writers can switch you to near-mode too, or at least they can to me.
If the account is detailed enough, it does move me, but not much more than an otherwise identical account that I know is fictional.
Phew! I was getting worried there.
OK, so you care about detailed accounts. Doesn’t that suggest that if you, y’know, knew more details about all those people being mugged, you would care more? So it’s just ignorance that leads you to discount their suffering?
Fictional accounts … well, people never have been great at distinguishing between imagination and reality, which, if you think about it, is actually really useful.
No, I mean that more details will switch my System 1 into near mode. My System 2 thinks that’s a bug, not a feature.
Really? My System 2 thinks System 2 is annoyingly incapable of seeing details, and System 1 is annoyingly incapable of seeing the big picture, and wants to use System 1 as a sort of zoom function to approximate something less broken.
I guess I’m unusual in this regard?
Like army1987, I can be moved by accounts of specific victims, whether they are fictional or not. There is a bug here, and the bug is this; that I am moved the same amount by an otherwise identical fictional or nonfictional account, where the nonfictional account contains no-one with whom I have ever interacted.
That is, simply knowing that an account is non-fictional doesn’t affect my emotional reaction, one way or another. (This doesn’t mean I am entirely without sympathy for people I have never met—it simply means that I have equivalent sympathy for fictional characters). This is a bug; ideally, my emotional reaction should take into account such an important detail as whether or not something really happened. After all, what detail could be more important?
It’s not a bug, it’s a feature (in some contexts).
Suppose you were playing 2 games of online chess against an anonymous opponent. You barely lose the first one. Now you’re feeling the spirit of competition, your blood boiling for revenge! Should you force yourself to relinquish the thrill of the contest, because “it doesn’t really matter”? That would be no fun! :-(
If you’re reading a work of fiction, knowing it is fiction, why are you doing so? Because emotional investment is fun? Why would you then sabotage your enjoyment by trying to downsize your emotional investment, since “it’s not real”? Also no fun! :-(
If the flawed heuristic you are employing in a certain context works in your favor in that context, switching it off would be dumb (although being vaguely aware of it would not be).
Oh, it does matter. There’s a real opponent there. That’s reality.
You make a good point.
I’m not sure I’d characterize that as a “bug”, more a feature we need to be aware of and take into account.
If you weren’t moved by fictional scenarios, you wouldn’t be able to empathize with people in those scenarios, including your future self! We mostly predict other people’s actions by using our own brain as a black box, imagining ourselves in their situation and how we would react, so there goes any situation featuring other humans. And we couldn’t daydream or enjoy fiction, either.
Would it be useful to turn it off? Maaaybe, but as long as you don’t start taking hypothetical people’s wishes into account, and stop reading stuff that triggers you, you’re fine—I bet the consequences for misuse would be higher than the marginal benefits.
I don’t think that empathising with fictional characters should be turned off. I just think that properly calibrated emotions should take all factors into account, with properly relevant weightings. I notice that my emotions do not seem to be taking the ‘reality’ factor into account, and I therefore conclude that my emotions are poorly calibrated.
My future self would be a potentially real scenario, and thus would deserve all the emotional investment appropriate for a situation that may well come to pass. (He also gets the emotional investment for being me, which is quite large).
I’m not sure whether I should be feeling more sympathy for strangers, or less sympathy for fictional people.
So … are you saying that they’re poorly calibrated, but that’s fine and nothing to worry about as long as we don’t forget it and start giving imaginary people moral weight? Because if so, I agree with you on this.
More or less. I’m also saying that it might be nice if they were better calibrated. It’s not urgent or particularly important, it’s just something about myself that I noticed at the start of this discussion that I hadn’t noticed before.
Fair enough. Tapping out, since this seems to have resolved itself.
Fair enough.
That depends on how much you know about/empathize with them, right?
Yes; but I can know as much about a fictional character as about a non-fictional character whom I have not interacted with. The dependency has nothing to do with the fictionality or lack thereof of the character.
Right, hence me quoting both the section on fictional and non-fictional characters.
To be honest, our brains don’t really seem to distinguish between fiction and non-fiction at all; it’s merely a question of context. Hence our reactions to fictional evidence and so forth. Lotta awkward biases you can catch from that what with our tendency to “buy in” to compelling narratives.
It’s not a bias if you value an additional dollar less once all your needs are met.
It’s not a bias if you value a random human life less if there are billions of others, compared to if there are only a few others.
You may choose for yourself to value a $10 bill the same whether you’re dirt poor, or a millionaire. Same with human lives. But you don’t get to “that’s a bias” others who have a more nuanced and context-sensitive estimation.
Except that humans actually have a bias called scope insensitivity and that’s a known thing, and it behaves differently to any claimed bounded utility function we might have.
A billion is nine orders of magnitude. As a very rough estimate, then, adding an order of magnitude to the number of lives in existence divides the motivation to extend an arbitrary stranger’s life by an order of magnitude. And the same for any other multiplier.
That is, if G is chosen such that f(x)-f(x-1)=G, then f(Mx)-f(Mx-1)=G/M for any given x and any multiplier M. If I then define my hedons such that f(0)=0 and f(1)=1...
...then I get that f(x) is the harmonic series.
For 10,000 people, on this entirely arbitrary (and extremely large) scale, I get a value f(x) between 9 and 10; for seven billion, f(x) lies between 23 and 24 (source)
That’s pretty much the natural logarithm of x (plus a constant, plus a term O(1/x)).
Hm. Yes, to the level of approximation I’m using here, I could as easily have used a log function. And would have, if I’d thought of it; the log function is used enough that I’d expect its properties to be easier for whoever reads my post to imagine.
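For anyone who wants to check the arithmetic above, here is a minimal sketch, assuming f(x) is taken to be the x-th harmonic number as described. The function names and the use of the Euler-Mascheroni constant as "the constant" are my own choices for illustration, not anything from the original comment.

```python
import math

# Euler-Mascheroni constant: the "plus a constant" in the log approximation.
EULER_GAMMA = 0.5772156649015329


def harmonic(n: int) -> float:
    """f(n) by direct summation of 1/k; fine for n up to a few million."""
    return math.fsum(1.0 / k for k in range(1, n + 1))


def harmonic_approx(n: int) -> float:
    """f(n) approximated as ln(n) + gamma + 1/(2n), usable for huge n."""
    return math.log(n) + EULER_GAMMA + 1.0 / (2 * n)


def marginal(n: int) -> float:
    """f(n) - f(n-1): the value assigned to the n-th life on this scale."""
    return 1.0 / n


h_10k = harmonic(10_000)                 # direct sum
h_7bn = harmonic_approx(7_000_000_000)   # too large to sum term by term

print(f"f(10,000)        = {h_10k:.4f}")
print(f"f(7,000,000,000) = {h_7bn:.4f}")
print(f"marginal(10 * 1000) / marginal(1000) = {marginal(10_000) / marginal(1_000):.2f}")
```

The direct sum and the log approximation agree closely at 10,000, which is the point made in the reply above: at this level of approximation the two are interchangeable.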
Well, if the population is that low saving people is guarding against an existential risk, so I would feel the same. Does your introspection yield anything on why smaller numbers matter more?
ETA: your brain can’t grasp numbers anywhere near as high as a billion. How sure are you murder matters now?
It’s pretty clear that individual murder doesn’t matter to me.
I mean, someone was murdered just now, as I write this sentence, and I care about that significantly less than I care about the quality of my coffee. I mean, I just spent five seconds adjusting the quality of my coffee, which is at least a noticeable quantity of effort if not a significant one. I can’t say the same about that anonymous murder.
Oh look, there goes another one. (Yawn.)
The metric I was using was not “caring whether someone is murdered”, which it’s clear I really don’t, but rather “being willing to murder someone,” which it’s relatively clear that I do, but not nearly as much as I could. (Insert typical spiel here about near/far mode, etc.)
I think the resolution to that is that you don’t have to have an immediate emotional reaction to care about it. There are lots of good and bad things happening in the world right now, but trying to feel all of them would be pointless, and a bad fit for our mental architecture. But we can still care, I think.
Well, I certainly agree that I don’t have to have an emotional reaction to each event, or indeed a reaction to the event at all, in order to be motivated to build systems that handle events in that class in different ways. I’m content to use the word “care” to refer to such motivation, either as well as or instead of referring to such emotional reactions. Ditto for “matters” in questions like “does murder matter”, in which case my answer to the above would change, but that certainly isn’t how I understood MugaSofer’s question.
So the question now is: if you could prevent someone you would most likely never otherwise interact with from being murdered, but that would make your coffee taste worse, what would you do?
Don’t we make this choice daily by choosing our preferred brand over Ethical Bean at Starbucks?
I hear the ethics at Starbucks are rather low-quality and in any case, surely Starbucks isn’t the cheapest place to purchase ethics.
Bah! Listen, Eliezer, I’m tired of all your meta-hipsterism!
“Hey, let’s get some ethics at Starbucks” “Nah, it’s low-quality; I only buy a really obscure brand of ethics you’ve probably never heard of called MIRI”. “Hey man, you don’t look in good health, maybe you should see a doctor” “Nah, I like a really obscure form of healthcare, I bet you’re not signed up for it, it’s called ‘cryonics’; it’s the cool thing to do”. “I think I like you, let’s date” “Oh, I’m afraid I only date polyamorists; you’re just too square”. “Oh man, I just realized I committed hindsight bias the other day!” “I disagree, it’s really the more obscure backfire effect which just got published a year or two ago.” “Yo, check out this thing I did with statistics” “That’s cool. Did you use Bayesian techniques?”
Man, forget you!
/angrily sips his obscure mail-order loose tea, a kind of oolong you’ve never heard of (Formosa vintage tie-guan-yin)
If you can’t pick something non-average to meet your optimization criteria, you can’t optimize above the average.
This comment has been brought to you by my Dvorak keyboard layout.
If you keep looking down the utility gradient, it’s harder to escape local maxima because you’re facing backwards.
This comment has been brought to you by me switching from Dvorak to Colemak.
I’m always amazed that people advocate Dvorak. If you are going to diverge from the herd and be a munchkin why do a half-assed job of it? Sure, if you already know Dvorak it isn’t worth switching but if you are switching from Qwerty anyway then Colemak (or at least Capewell) is better than Dvorak in all the ways that Dvorak is better than Qwerty.
Dvorak is for hipsters, not optimisers.
Tim Tyler is the actual optimizer here.
But at the same time, there’s only so many possible low-hanging fruits etc, and at some level of finding more fruits, that indicates you aren’t optimizing at all...
Ouch, that cuts a bit close to home...
(Had to google “backfire effect” to find out whether you had made it up on the spot.)
EDIT: Looks like I had already heard of that effect, and I even seem to recall E.T. Jaynes giving a theoretical explanation of it, but I didn’t remember whether it had a name.
“Like I said, it’s a really obscure bias, you’ve probably never heard of it.”
Really? I don’t remember ever seeing anything like that (although I haven’t read all of PT:TLoS yet). Maybe you’re conflating it with the thesis using Bayesian methods I link in http://www.gwern.net/backfire-effect ?
I can’t tell if I should feel good or bad that this was the only one where I said “well, actually...”
BTW, for some reason, certain “fair trade” products at my supermarket are astoundingly cheap (as in, I’ve bought very similar but non-“fair trade” stuff for more); I notice that I’m confused.
… we do? Nobody told me! I’ll start tomorrow.
Judging from experience, the answer is that it depends on how the choice is framed.
That said, I’d feel worse afterwards about choosing the tastier coffee.
I was, indeed, using “matters” normatively in that comment. Sorry for any confusion.
… I, like an idiot, assumed you were too; better go edit my replies.
I always attributed that to the abstract nature of the knowledge. I mean, if you knew anything about the person, you’d care a lot more, which suggests the relevant factor is ignorance, and that’s a property of the map, not the territory.
So you’re saying your preferences on this matter are inconsistent?
Yes, I agree completely that what I’m talking about is an attribute of “the map.” (I could challenge whether it’s ignorance or something else, but the key point here is that I’m discussing motivational psychology, and I agree.)
Well, that wasn’t my point, and I’m not quite sure how it follows from what I said, but I would certainly agree that my revealed preferences are both inconsistent with each other and inconsistent with my stated preferences (which are themselves inconsistent with each other).
Right. This is why I don’t use “revealed preferences” to derive ethics, personally.
And neither do you; I’m such an idiot. That said.
Here’s a scenario:
I assume you would be willing to do it to save, say, a small country on modern-day Earth, although maybe I’m projecting here? Everything is certain, because revealed preferences suck at probability math.
Is it worth it?
Reorienting my understanding of this discussion to be, as you say, normative: yes, when offered a choice between destroying a sentimental but not otherwise valuable item and killing a billion humans, I endorse destroying the item, no matter how many other humans there are in the world.
I even endorse it if everything is uncertain, with the usual expected-value calculation.
That said, as is often true of hypothetical questions, I don’t quite agree that the example you describe maps to that choice, but I think it was meant to. If I really think about the example, it’s more complicated than that. If I missed the intended point of the example, let me know and I’ll try again.
Glad to hear it. Sorry about that misunderstanding.
Curses. I knew I should have gone with the rogue nanotech.
Nope, spot-on :)
So … if there are vast alien civilizations, murder is OK?
No. Total utility still drops every time a person is killed, as long as f(x) is strictly monotonically increasing.
Wrong thread, mate.
I was replying to the idea that the marginal utility of not killing one person might be less than the utility of a cookie, if there are enough people.
That doesn’t mean that it’s OK. That means that it is seen as only very, very slightly not OK.
...unless you and I have different definitions of “OK”, which I begin to suspect.
I have to admit, that was sloppily phrased. However, you do seem to be defining “OK” as equivalent to “actively good” whereas I’m using something more like “acceptable”.
Well, I’d accept strictly neutral (neither actively evil nor actively good) as OK as well. It seems that your definition of OK includes the possibility of active evil, as long as the amount of active evil is below a certain threshold.
It seems that we’re in agreement here; whether or not it is “OK” is determined by the definitions we are assigning to OK, and not by any part of the model under consideration.
The threshold being whether I can be bothered to stop it. As I said, it was sloppy terminology—I should have said something like “worth less than the effort of telling someone to stop” or some other minuscule cost you would be unwilling to pay. Since any intervention, in real life, has a cost, albeit sometimes a small one, this seems like an important distinction.