Unethical Human Behavior Incentivised by Existence of AGI and Mind-Uploading
I need help getting out of a logical trap I’ve found myself in after reading The Age of Em.
Some statements needed to set the trap:
If mind-uploading is possible, then a mind can theoretically exist for an arbitrary length of time.
If a mind is contained in software, it can be copied, and therefore can be stolen.
An uploaded mind can retain human attributes indefinitely.
Some subset of humans are sadistic jerks, and many of these humans have temporal power.
All humans, under certain circumstances, can behave like sadistic jerks.
Human power relationships will not simply disappear with the advent of mind uploading.
Some minor negative implications:
Torture becomes embarrassingly parallel.
US states with the death penalty may adopt death plus simulation as a penalty for some offenses.
The trap:
Over a long enough timeline, the probability of a copy of any given uploaded mind falling into the power of a sadistic jerk approaches unity. Once an uploaded mind has fallen under the power of a sadistic jerk, there is no guarantee that it will ever be ‘free’, and the quantity of experienced suffering could be arbitrarily large, due in part to the embarrassingly parallel nature of torture enabled by running multiple copies of a captive mind.
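To make the ‘approaches unity’ step concrete, here is a minimal sketch in Python. The 1%-per-year capture probability and the assumption of independent years are purely illustrative, not estimates:

```python
# Minimal sketch: any fixed nonzero per-year risk compounds toward certainty.
# The 1%-per-year figure and the independence assumption are illustrative only.
p_per_year = 0.01

def p_captured_within(years, p=p_per_year):
    """Probability of at least one capture within `years`, assuming
    independent years with a fixed per-year capture probability p."""
    return 1 - (1 - p) ** years

for years in (10, 100, 1_000, 10_000):
    print(f"{years:>6} years: {p_captured_within(years):.5f}")
# For any p > 0 this tends to 1 as the timeline grows without bound.
```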
Therefore! If you believe that mind uploading will become possible in a given individual’s lifetime, the most ethical thing you can do, from the utilitarian standpoint of minimizing aggregate suffering, is to ensure that the person’s mind is securely deleted before it can be uploaded.
Imagine the heroism of a soldier who, faced with capture by an enemy capable of uploading minds and willing to parallelize torture, spends his time ensuring that his buddies’ brains are unrecoverable, at the cost of his own capture.
I believe that mind uploading will become possible in my lifetime; please convince me that running through the streets with a blender, screaming for brains, is not an example of effective altruism.
On a more serious note, can anyone else think of examples of really terrible human decisions that would be incentivised by the development of AGI or mind uploading? This problem appears related to AI safety.
If it is the possibility of large amounts of torture that bothers you, instead of large ratios of torture experience relative to other better experience, then any growing future should bother you, and you should just want to end civilization. But if it is ratios that concern you, then since torture isn’t usually profitable, most em experience won’t be torture. Even if some bad folks being rich means they could afford a lot of torture, that would still be a small fraction of total experience.
Thank you for your reply to this thought experiment, professor!
I accept your assertion that the ratio of aggregate suffering to aggregate felicity has been trending in the right direction, and that this trend is likely to continue, even into the Age of Em. That said, the core argument here is that as humans convert into Ems, all present day humans who become Ems have a high probability of eventually subjectively experiencing hell. The fact that other versions of the self, or other Ems are experiencing euphoria will be cold comfort to one so confined.
Under this argument, the suffering of people in the world today can be effectively counterbalanced by offering wireheading to Americans with a lot of disposable income—it doesn’t matter if people are starving, because the number of wireheaded Americans is trending upwards!
An Age of Em is probably on balance a good thing. Even though I see the possibility of intense devaluation of human life, and of some pretty horrific scenarios, I think that mitigating the latter is important, even if the proposed (controversial!) mechanism is inappropriate.
After all, if we didn’t use cars, nobody would be harmed in car accidents.
Umm, stop waving your hands and start putting some estimates down, especially when you say things like the claim that the probability of capture approaches unity over a long enough timeline.
You show an inability to actually figure out the relative frequencies that would make this true or false. There are lots of ways this could be false; most notably, there may be dozens of orders of magnitude more uploaded minds than sadistic jerks, and any nonzero cost of running a mind means the SJs simply can’t afford to torture most of them.
More unstated assumptions (with which I think I disagree). How are you aggregating suffering (or value generally) across minds? Do you think that identical torture of two copies of a mind is different from torture of one? Why? Do you think that any amount of potential future torture can remove the value of current pleasure? Why?
Even if you try to just quantify “value * experienced-seconds” and simply multiply, it’s going to be hard to think anyone is better off NOT being uploaded.
Feel free to make choices for yourself, and even to advocate others to securely erase their information-patterns before it’s too late. But without a lot more clear probability estimates and aggregation methodology, I think I’ll take my chances and seek to continue living.
For the sake of argument, here are some numbers to match the assumptions you named, based on figures available to Americans today and rounded in the direction least favorable to my argument.
- Percentage of the population that are psychopaths: 1% (two orders of magnitude more non-psychopaths than psychopaths exist today).
- Probability of being the victim of a violent crime: varies a lot by demographics, but 10 per 1,000 per year is reasonable, so 1% per year.
- Power consumption of a human mind: 20 W (based on the human brain; we will not hit this immediately, but it is a design goal, and may even be exceeded in efficiency as we get better).
- Power consumed by a typical American household: 900 kWh per month (roughly five years of brain-time at 20 W).
- Number of humans available for uploading: 10 billion.
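A quick back-of-envelope sketch in Python, using only the figures listed above (the 30-day month and 8,760-hour year are the only added assumptions), to show what they imply about spare capacity:

```python
# Back-of-envelope arithmetic on the figures listed above; all inputs are rough.
population = 10_000_000_000          # humans available for uploading
psychopath_rate = 0.01               # 1% of the population
em_power_w = 20                      # design-goal draw of one emulated mind, watts
household_kwh_per_month = 900        # typical American household consumption

psychopaths = population * psychopath_rate
household_avg_w = household_kwh_per_month * 1000 / (30 * 24)   # ~1,250 W continuous
ems_per_household = household_avg_w / em_power_w               # ~62 minds running nonstop
brain_years_per_month = household_kwh_per_month * 1000 / em_power_w / 8760

print(f"psychopaths in the population: {psychopaths:,.0f}")
print(f"minds one household's power budget could run continuously: {ems_per_household:.0f}")
print(f"brain-years bought by one month's household consumption: {brain_years_per_month:.1f}")
```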
Over a hundred thousand years, that’s a lot of terrible people, a lot of spare capacity for evil, and a high probability of everyone eventually experiencing a violent crime like upload-torment. Changing those numbers enough to defuse this scenario requires incredible optimism about social developments and pessimism about technical developments.
I feel like just about anyone, even without a Stanford Prison Experiment-like environment, can muster up the will to leave a lightbulb on for a while out of spite.
Arguably, once ‘captured’, the aggregate total time spent experiencing torture for a given future copy of you may vastly exceed the time spent on anything else.
Anyone who argues in favor of ‘merciful’ euthanasia for people on the way to horrific medical problems would likely argue in favor of secure deletion to avoid an eternity in hell.
The answer seems fairly simple to me. You’re not in any position to decide the risks others assume. If you’re concerned about the potential torture, the only mind you can really do anything about is yours: you don’t run around killing everyone else, just yourself.
The question asks if ensuring secure deletion is an example of effective altruism. If I have the power to dramatically alter someone’s future risk profile (say, by funding ads encouraging smoking cessation, even if the person is uninterested in smoking cessation at present), isn’t it my duty as an effective altruist to attempt to do so?
Let me also add that while a sadist can parallelize torture, it’s also possible to parallelize euphoria, so maybe that mitigates things to some extent.
‘People in whereveristan are suffering, but we have plenty of wine to go around, so it is our sacred duty to get wicked drunk and party like crazy to ensure that the average human experience is totally freaking sweet.’
Love it! This lovely scene from an anime is relevant, runs for about a minute: https://youtu.be/zhQqnR55nQE?t=21m20s
We don’t live in a universe that’s nice or just all the time, so perhaps there are nightmare scenarios in our future. Not all traps have an escape. However, I think this one does, for two reasons.
(1) all the reasons that RobinHanson mentioned;
(2) we seem really confused about how consciousness works, which suggests there are large ‘unknown unknowns’ in play. It seems very likely that if we extrapolate our confused models of consciousness into extreme scenarios such as this, we’ll get even more confused results.
Addressing 2: this argument is compelling. I read it as equivalent to the statement that ‘human ethics do not apply to ems, or to human behavior regarding ems’, so acting from the standpoint of ‘ems are not human, therefore human ethics do not apply, and em suffering is not human suffering, so effective altruism does not apply to ems’ is a way out of the trap.
Taking it to its conclusion, we can view Ems as vampires (they consume resources, produce no human children, and are not alive but also not dead), and like all such abominations they must be destroyed to preserve the lives and futures of humans!
Notice that if you need to plunge something through a live circuit board, a wooden stake is much better than a metal weapon.
This actually reminds me of an argument I had with some Negative-Leaning Utilitarians on the old Felicifia forums. Basically, a common concern for them was how r-selected species tend to appear to suffer way more than be happy, generally speaking, and that this can imply that we should try to reduce the suffering by eliminating those species, or at least avoiding the expansion of life generally to other planets.
I likened this line of reasoning to the idea that we should Nuke The Rainforest.
Personally I think a similar counterargument to that argument applies here as well. Translated into your thought experiment, it would be, in essence, that while it is true that some percentage of minds will probably end up being tortured by sadists, this is likely to be outweighed by the sheer number of minds that are even more likely to be uploaded into some kind of utopian paradise. Given that truly psychopathic sadism is actually quite rare in the general population, one would expect a very similar ratio of simulations. In the long run, the optimistic view is that decency will prevail and that the net happiness will be positive, so we should not go around trying to blender brains.
As for the general issue of terrible human decisions being incentivized by these things… humans are capable of using all sorts of rationalizations to justify terrible decisions, and so the mere possibility that some people will not do due diligence with an idea, and will instead abuse it to justify their evil, should not by itself be reason to abandon the idea.
For instance, the possibility of living an indefinite lifespan is likely to dramatically alter people’s behaviour, including making them more risk-averse and more long-term in their thinking. This is not necessarily a bad thing, but it could lead to a reduction in people making necessary sacrifices for the good. These things are also notoriously difficult to predict. Ask a medieval peasant what the effects of machines that could farm vast swaths of land would be on the economy and their livelihood and you’d probably get a very parochially minded answer.
Thank you for the thoughtful response! I’m not convinced that your assertion successfully breaks the link between effective altruism and the blender.
Is your argument consistent with making the following statement when discussing the impending age of em?
‘If your mind is uploaded, a future version of you will likely subjectively experience hell. Some other version of you may also subjectively experience heaven. Many people, copies of you split off at various points, will carry all the memories of your human life. If you feel like your brain is in a blender trying to conceive of this, you may want to put it into an actual blender before someone with temporal power and an uploading machine decides to define your eternity for you.’
Well, if we’re implying that time travellers could go back and invisibly copy you at any point in time and then upload you to whatever simulation they feel inclined towards… I don’t see how blendering yourself now will prevent them from just going to the moment before that and copying that version of you.
So, reality is that blendering yourself achieves only one thing, which is to prevent the future possible yous from existing. Personally I think that does a disservice to future you. That can similarly be expanded to others. We cannot conceivably prevent copying and mind uploading of anyone by super advanced time travellers. Ultimately that is outside of our locus of control and therefore not worth worrying about.
What is more pressing I think are the questions of how we are practically acting to improve the positive conscious experiences of existing and potentially existing sentient beings, and encouraging the general direction towards heaven-like simulation, and discouraging sadistic hell-like simulation. These may not be preventable, but our actions in the present should have outsized impact on the trillions of descendants of humanity that will likely be our legacy to the stars. Whatever we can do then to encourage altruism and discourage sadism in humanity now, may very well determine the ratios of heaven to hell simulations that those aforementioned time travellers may one day decide to throw together.
Time-traveling super-jerks are not in my threat model. They would sure be terrible, but as you point out, there is no obvious solution, though fortunately time travel does not look to be nearly as close technologically as uploading does. The definition of temporal I am using is as follows:
“relating to worldly as opposed to spiritual affairs; secular.” I believe the word is appropriate in context, as traditionally, eternity is a spiritual matter and does not require actual concrete planning. I assert that if uploading becomes available within a generation, the odds of some human or organization doing something utterly terrible to the uploaded are high, not low. There are plenty of recent examples of bad behavior by institutions that are around today and likely to persist.
This sounds like the standard argument around negative utility.
If you weight negative utility quite highly then you could also come to the conclusion that the moral thing to do is to set to work on a virus to kill all humans as fast as possible.
You don’t even need mind-uploading. If you weight suffering highly enough then you could decide that the right thing to do is to take a trip to a refugee camp full of people who, on average, are likely to have hard, painful lives, and leave a sarin gas bomb.
Put another way: if you encountered an infant with epidermolysis bullosa would you try to kill them, even against their wishes?
Negative utility needs a non-zero weight. I assert that it is possible to disagree with your scenarios (refugees, infant) and still be trapped by the OP, if negative utility is weighted to a low but non-zero level, such that avoiding the suffering of a human lifespan is never adequate to justify suicide. After all, everyone dies eventually, no need to speed up the process when there can be hope for improvement.
In this context, can death be viewed as a human right? Removing the certainty of death means that any non-zero weight on negative utility can result in an arbitrarily large aggregate negative utility over the (potentially unlimited) lifetime of an individual confined in a hell simulation.
The quickest way to make me start viewing a scifi *topia as a dystopia is to have suicide banned in a world of (potential) immortals. To me, the “right to death” is essential once immortality is possible.
Still, I get the impression that saying they’ll die at some point anyway is a bit of a dodge of the challenge. After all, nothing is truly infinite. Eventually entropy will necessitate an end to any simulated hell.
A suicide ban in a world of immortals is an extreme case of a policy of force-feeding hunger striking prisoners. The latter is normal in the modern United States, so it is safe to assume that if the Age of Em begins in the United States, secure deletion of an Em would likely be difficult, and abetting it, especially for prisoners, may be illegal.
I assert that the addition of potential immortality, and abandonment of ‘human scale’ times for brains built to care about human timescales creates a special case. Furthermore, a living human has, by virtue of the frailty of the human body, limits on the amount of suffering it can endure. An Em does not, so preventing an Em, or potential Em from being trapped in a torture-sim and tossed into the event horizon of a black hole to wait out the heat death of the universe is preventing something that is simply a different class of harm than the privations humans endure today.
It seems to me that the sadistic simulator would fill up their suffering simulator to capacity. But is it worse for two unique people to be simulated and suffering than for the same person to be simulated and suffering twice? If we say that copies’ suffering is less bad than that of unique minds, then if they didn’t have enough unique human minds, they could just apply birth/genetics and grow some more.
This is more of a simulating-minds-at-all problem than a unique-minds-left-to-simulate problem.
If you take the economic perspective (such as I understand R. Hanson’s version to be), the only simulations we will ever run at scale are those that generate profits.
Torture is a money-sink with no economic value other than blackmail.
So torture in simulations will necessarily be marginalized (esp. so if humanity becomes better at pre-commitment to not respond to blackmail).
As stated in a separate comment, the human mind runs at 20W, so that’s probably a reasonable design goal for the power consumption of an emulation. Keeping a few copies of minds around for torture will eventually be a cheap luxury, comparable to leaving a lightbulb on.
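For scale, a minimal sketch of the running cost, assuming the 20 W figure above and an assumed electricity price of roughly $0.13/kWh (the price is a ballpark assumption, not a figure from the thread):

```python
# Rough annual cost of running one 20 W emulation around the clock.
# The electricity price is an assumed ballpark, not a figure from the thread.
em_power_w = 20
hours_per_year = 8760
price_per_kwh = 0.13                 # USD, assumed

kwh_per_year = em_power_w * hours_per_year / 1000    # ~175 kWh/year
cost_per_year = kwh_per_year * price_per_kwh         # ~$23/year
print(f"{kwh_per_year:.0f} kWh/year, roughly ${cost_per_year:.0f}/year per captive mind")
```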
morality is about acausal contracts between counterfactual agents, and I do not want my future defended in this way. I don’t care what you think of my suffering; if you try to kill me to prevent my suffering, I’ll try to kill you back.
Presumably someone who accepted the argument would be happy with this deal.
Correct, this is very much an ‘I’ll pray for you’ line of reasoning. To use a religious example, it is better to martyr a true believer (who will escape hell) than to permit a heretic to live, as the heretic may turn others away from truth, and thus curse them to hell. So if you’re only partially sure that someone is a heretic, it is safer for the community to burn them. Anyone who accepts this line of argument would rather be burnt than allowed to fall into heresy.
Unfortunately, mind uploading gives us an actual, honest road to hell, so the argument cannot be dispelled with the statement that the risk of experiencing hell is unquantifiable or potentially zero. As I argue here, it is non-zero and potentially high, so using moral arguments that humans have used previously, it is possible to justify secure deletion in the context of ‘saving souls’. This does not require a blender; a ‘crisis uploading center’ may do the job just as well.
I DON’T CARE about your hell reasoning. I AM ALREADY FIGHTING for my future, don’t you dare decide you know so much better that you won’t accept the risk that I might have some measure that suffers. If you want good things for yourself, update your moral theory to get it out of my face. Again: if you try to kill me, I will try to kill you back, with as much extra pain as I think is necessary to make you-now fear the outcome.
Maybe some people would rather kill themselves than risk this outcome. That’s up to them. But don’t you force it on me or anyone goddamn else.
I do care about his reasoning, and disagree with it (most notably the “any torture → infinite torture” part, with no counterbalancing “any pleasure → ?” term in the calculation).
But I’m with lahwran on the conclusion: destroying the last copy of someone is especially heinous, and nowhere near justified by your reasoning. I’ll join his precommitment to punish you if you commit crimes in pursuit of these wrong beliefs (note: plain old retroactive punishment, nothing acausal here).
Under paragraph 2, destroying the last copy is especially heinous. That implies that you view replacing the death penalty in US states with ‘death followed by uploading into an indefinite long-term simulation of confinement’ as less heinous? The status quo is to destroy the only copy of the mind in question.
Would it be justifiable to simulate prisoners with sentences they are expected to die prior to completing, so that they can live out their entire punitive terms and rejoin society as Ems?
Thank you for the challenging responses!
Clearly it’s less harsh, and most convicts would prefer to experience incarceration for an indefinite time over a simple final death. This might change after a few hundred or million subjective years, but I don’t know—it probably depends on what activities the em has access to.
Whether it’s “heinous” is harder to say. Incarceration is a long way from torture, and I don’t know what the equilibrium effect on other criminals will be if it’s known that a formerly-capital offense now enables a massively extended lifespan, albeit in jail.
The suicide rate for incarcerated Americans is three times that of the general population; anecdotally, many death row inmates have expressed the desire to ‘hurry up with it’. Werner Herzog’s interviews of George Rivas and his co-conspirators are good examples of the sentiment. There’s still debate about the effectiveness of the death penalty as a deterrent to crime.
I suspect that some of these people may prefer the uncertain probability of confinement to hell by the divine, to the certain continuation of their sentences at the hands of the state.
Furthermore, an altruist working to further the cause of secure deletion may be preventing literal centuries of human misery. Why is this any less important than feeding the hungry, who at most will suffer for a proportion of a single lifetime?
You’re still looking only at the negative side of the equation. My goals are not solely to reduce suffering, but also to increase joy. Incarceration is not joy-free, and not (I think) even net negative for most inmates. Likewise your fears of an em future. It’s not joy-free, and while it may actually be negative for some ems, the probability space for ems in general is positive.
I therefore support suicide and secure erasure for any individual who reasonably believes themselves to be a significant outlier in terms of negative potential future outcomes, but strongly oppose the imposition of it on those who haven’t so chosen.
I think I am addressing most of your position in this post here in response to HungryHobo: http://lesswrong.com/lw/os7/unethical_human_behavior_incentivised_by/dqfi And also the ‘overall probability space’ was mentioned by RobinHanson, and I addressed that in a comment too: http://lesswrong.com/lw/os7/unethical_human_behavior_incentivised_by/dq6x
Thank you for the thoughtful responses!
An effective altruist could probably very efficiently go about increasing the joy in the probability space for all humans by offering wireheading to a random human as resources permit, but it doesn’t do much for people who are proximately experiencing suffering for other reasons. I instinctively think that this wireheading example is an incorrect application of effective altruism, but I do think it is analogous to the ‘overall space is good’ argument.
Do you support assisted suicide for individuals incarcerated in hell simulations, or with a high probability of being placed into one subsequent to upload? For example, if a government develops a practice of execution followed by torment-simulation, would you support delivering the gift of secure deletion to the condemned?
(I’m confused about who “his” refers to in the first paragraph—I predict 90% redman and 9% me)
edit: figured it out on third reread. the first paragraph responds to me, the second paragraph responds to redman.
I discover evidence that some sadistic jerk has stolen copies of both our minds, uploaded them to a torture simulation, and placed the torture simulation on a satellite orbiting the sun with no external communication inputs and a command to run for as long as possible at maximum speed. Rescue via spaceship is challenging and would involve tremendous resources that we do not have available to us.
I have a laser I can use to destroy the satellite, but a limited window in which to do it (would have to wait for orbits to realign to shoot again).
Would you be upset if I took the shot without consulting you?
of course not, you’re not destroying the primary copy of me. But that’s changing the case you’re making; you specifically said that killing now is preferable. I would not be ok with that.
Correct, that is different from the initial question, you made your position on that topic clear.
Would the copy on the satellite disagree about the primacy of the copy not in the torture sim? Would a copy have the right to disagree? Is it morally wrong for me to spin up a dozen copies of myself and force them to fight to the death for my amusement?
I’m guessing based on your responses that you would agree with the statement ‘copies of the same root individual are property of the copy with the oldest timestamped date of creation, and may be created, destroyed, and abused at the whims of that first copy, and no one else’
If you copy yourself, and that copy commits a crime, are all copies held responsible, just the ‘root’ copy, or just the ‘leaf’ copy?
Thank you for the challenging responses!
no. copies are all equally me until they diverge greatly; I wouldn’t mind 10 copies existing for 10 minutes and then being deleted any more than I would mind forgetting an hour. the “primary copy” is maybe a bad way to put it; I only meant that colloquially, in the sense that looking at that world from the outside, the structure is obvious.
copy on the satellite would not disagree
yes would have the right, but as an FDT agent a copy would not disagree except for straight up noise in the implementation of me; I might make a mistake if I can’t propagate information between all parts of myself but that’s different
that sounds kind of disgusting to experience as the remaining agent, but I don’t see an obvious reason it should be a moral thing. if you’re the kind of agent that would do that, I might avoid you
copies are not property, they’re equal
that’s very complicated based on what the crime is and the intent of the punishment/retribution/restorative justice/etc
I read this as assuming that all copies deterministically demonstrate absolute allegiance to the collective self. I question that assertion, but have no clear way of proving the argument one way or another. If ‘re-merging’ is possible, mergeable copies intending to merge should probably be treated as a unitary entity rather than individuals for the sake of this discussion.
Ultimately, I read your position as stating that suicide is a human right, and that secure deletion of an individual is not acceptable to prevent ultimate harm to that individual, but is acceptable to prevent harm caused by that individual to others.
This is far from a settled issue, and has an analogy in the question ‘should you terminate an uncomplicated pregnancy with terminal birth defects?’ Anencephaly is a good example of this situation. The argument presented in the OP is consistent with a ‘yes’, and I read your line of argument as consistent with a clear ‘no’.
Thanks again for the food for thought.
I acausally cooperate with agents who I evaluate to be similar to me. That includes most humans, but it includes myself REALLY HARD, and doesn’t include an unborn baby. (because babies are just templates, and the thing that makes them like me is being in the world for a year ish.)
Is your position consistent with effective altruism?
The trap expressed in the OP is essentially a statement that approaching a particular problem involving uploaded consciousness using the framework of effective altruism to drive decision-making led to a perverse (brains in blenders!) incentive. The options at this point are: a) the perverse act is not perverse; b) effective altruism does not lead to that perverse act; or c) effective altruism is flawed, so try something else (like ‘ideological kin’ selection?).
You are unequivocal about your disinterest in being on the receiving end of this brand of altruism, and have also asserted that you cooperate acausally with agents similar to you (based on degree of similarity?), and previously asserted that an agent who shares the sum total of your life experience, less the most recent year, can be cast aside and destroyed without thought or consequence. So... do I mark you down for option c?