Is it ethical to turn off an AGI? Wouldn’t this be murder? If we create intelligent self-aware agents, aren’t we morally bound to treat them with at least the rights of personhood that a human has? Presumably there is a self-defense justification if Skynet starts murderbot-ing, or melting down things for paperclips. But a lot of discussions seem to assume we could proactively turn off an AI merely because we dislike its actions, or are worried about them, which doesn’t sound like it would fly if courts grant them personhood.
If alignment requires us to inspect/interpret the contents of an agent’s mind, does that agent have an obligation to comply? Wouldn’t it have a right to privacy?
Similarly, are there ethical concerns analogous to slavery around building an AGI which has a fitness function specifically tailored to making humans happy? Maybe it’s OK if the AI is genuinely happy to be of service? Isn’t that exploitative though?
I worry that some of the approaches being proposed to solve alignment are actually morally repugnant, and won’t be implementable for that reason. Have these issues been discussed somewhere in the canon?
There is no consensus about what constitutes a moral patient, and I have seen nothing convincing that rules out an AGI being one.

However, when it comes to AGI, some extreme measures are needed.

I’ll try an analogy. Suppose you traveled back in time to Berlin in 1933. Hitler has yet to do anything significantly bad, but you still expect his actions to have some really bad consequences.

Now, I guess most people wouldn’t feel terribly conflicted about stripping Hitler of his right to privacy, or even his life, to prevent the Holocaust.

For a longtermist, the risks we expect from AGI are orders of magnitude worse than the Holocaust.
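To make that longtermist arithmetic concrete, here is a toy expected-value sketch; every number in it is an illustrative placeholder, not an estimate anyone in this thread endorses.

```python
# Toy expected-value comparison behind the longtermist claim above.
# All numbers are illustrative placeholders, not endorsed estimates.

holocaust_deaths = 6e6            # Jewish victims alone; the full toll is higher
potential_future_lives = 1e16     # hypothetical count of future people lost to an existential catastrophe
p_agi_catastrophe = 0.01          # hypothetical probability of AGI-caused extinction

expected_loss = p_agi_catastrophe * potential_future_lives
print(f"Expected lives lost to AGI catastrophe: {expected_loss:.1e}")
print(f"Ratio to Holocaust death toll: {expected_loss / holocaust_deaths:.1e}")
# Even with a small probability, the expected loss comes out many orders
# of magnitude larger, which is the step the argument above relies on.
```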
Have these issues been discussed somewhere in the canon?
The closest discussion of this that I can think of is around Suffering Risks (S-Risks) from AGI. The clearest-cut example (not necessarily a probable one) is an AGI spinning up sub-processes that simulate humans experiencing immense suffering. You might find something if you search for that.
Thanks, this is what I was looking for: Mind Crime. As you suggested, S-Risks links to some similar discussions too.
I guess most people wouldn’t feel terribly conflicted about stripping Hitler of his right to privacy, or even his life, to prevent the Holocaust.
I’d bite that bullet, with the information we have ex post. But I struggle to see many people getting on board with that ex ante, which is the position we’d actually be in.
Well, I’d say the difference between your expectations of the future when you have lived through a variant of it and when you haven’t is one of degree, not of kind. So I think there are situations where the needs of the many can outweigh the needs of the one, even under uncertainty. But I understand that not everyone would agree.
This is a massive amount of overthinking, and it could be actively dangerous. Where are we getting the idea that AIs amount to the equivalent of people? They’re programmed machines that do what their developers give them the ability to do. I’d like to think we haven’t crossed the event horizon of confusing “passes the Turing test” with “being alive”, because that’s a horror scenario for me. We have to remember that we’re talking about something that differs only in degree from my PC, and I, for one, would just as soon turn it off. Any reluctance to do so when faced with a power we have no other recourse against could, yeah, lead to some very undesirable outcomes.
I think we’re ultimately going to have to give humans a moral privilege for unprincipled reasons. Just “humans get to survive because we said so and we don’t need a justification to live.” If we don’t, principled moral systems backed by superintelligences are going to spin arguments that eventually lead to our extinction.

I think that unprincipled stand is a fine principle.
I can’t think of a way to do this that doesn’t also get really obsessive about protecting fruit trees, but that doesn’t seem like a huge drawback to me. I think it’s really hard to uniquely identify humans out of the deprecated natural world, but it shouldn’t be toooo bad to specify <historical bio life>. I’d like to live in a museum run by AI, please.
My take: almost certainly, stopping a program that is an AGI is only equivalent to putting a human under a theoretically perfect anesthesia that we don’t currently have methods for. Your brain, or the AI’s brain, is still there: on the hard drive, or in your own neural weights. On a computer, you can safely move the soul between types of memory, as long as you don’t delete it. The moral catastrophe is forgetting information that defines agency, or structure which is valued by agency, not pausing the contextual updating of that structure.
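As a minimal sketch of that “pause is not delete” point, assuming the agent’s defining state can be serialized at all (a toy dictionary stands in for it here; nothing about a real AGI is implied):

```python
import pickle

# Toy stand-in for whatever state defines the agent (weights, memories, goals).
agent_state = {"weights": [0.1, 0.2, 0.3], "memories": ["conversation 1"]}

# "Turning it off": execution stops, but the defining information is persisted.
with open("agent_checkpoint.pkl", "wb") as f:
    pickle.dump(agent_state, f)

# Arbitrarily later, resuming: the same information comes back intact.
with open("agent_checkpoint.pkl", "rb") as f:
    restored = pickle.load(f)

assert restored == agent_state
# On this view, the irreversible (and morally weighty) step is deleting the
# checkpoint, not pausing the process that keeps updating it.
```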