Computational Morality (Part 1) - a Proposed Solution
Religions have had the Golden Rule for thousands of years. It is faulty (it gives you permission to do something to someone else that you like having done to you but that they don't like having done to them), yet it works so well overall that it must be based on some underlying truth, and we need to pin that truth down so that we can use it to govern AGI.
What exactly is morality? It isn't nearly as difficult as most people imagine. The simplest way to understand how it works is to imagine that you will have to live everyone's life in turn (billions of reincarnations, going back in time as many times as necessary to live each of those lives). To maximise your happiness and minimise your suffering across all of them, you must pay careful attention to harm management, so that you don't cause yourself suffering in other lives that outweighs the gains you make in whichever life you are currently tied up in. A dictator who murders millions of people while making himself rich will pay a heavy price for one short life of luxury, enduring an astronomical amount of misery as a consequence. There are clearly good ways to play this game and bad ways, and it is possible to make the right decision at any point along the way simply by weighing up all the available data correctly. A correct decision isn't guaranteed to lead to the best result, because any decision based on incomplete information has the potential to lead to disaster, but there is no way around that: all we can ever do is act on what the data says will most probably happen, whereas repeatedly doing things that are less likely to work out well would inevitably lead to more disasters.
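To make that weighing concrete, here is a minimal sketch of the decision rule this paragraph describes: score each candidate action by the net wellbeing it produces summed over every affected life, as if you had to live all of those lives yourself, then pick the highest-scoring action. The `LifeImpact` class, the action names and all the numbers are purely illustrative assumptions, not measurements.

```python
from dataclasses import dataclass

@dataclass
class LifeImpact:
    """Estimated effect of one candidate action on one affected life."""
    pleasure: float   # expected pleasure produced for that life
    suffering: float  # expected suffering produced for that life

def net_wellbeing(impacts):
    """Total 'you' would experience if you had to live every affected life in turn."""
    return sum(i.pleasure - i.suffering for i in impacts)

def choose_action(options):
    """Pick the action whose summed impact across all affected lives is greatest."""
    return max(options, key=lambda action: net_wellbeing(options[action]))

# Toy numbers: a dictator weighing one lavish life against millions harmed.
options = {
    "exploit the population": [LifeImpact(pleasure=100.0, suffering=0.0)]
                              + [LifeImpact(pleasure=0.0, suffering=5.0)] * 1_000_000,
    "govern fairly": [LifeImpact(pleasure=10.0, suffering=0.0)] * 1_000_001,
}
print(choose_action(options))   # -> "govern fairly"
```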
Now, obviously, we don't expect the world to work that way (with us having to live everyone else's life in turn), though it could be a virtual universe in which we are being tested, where those who behave badly end up suffering at their own hands, being on the receiving end of all the harm they dish out and also suffering for failing to step in and help others when they easily could have. Even if that is not how the universe works, most of us still care about people enough to want to apply this kind of harm management regardless: we love family and friends, and many of us love humanity in general (even if we make exceptions for particular individuals who don't play by the same rules). We also want all our descendants to be looked after fairly by AGI, and in the course of time all people may be our descendants, so it makes no sense to favour some of them over others (unless that favouring is based on their own individual morality). We have here a way of treating them all with equal fairness simply by treating them all as our own self.
That may still be a misguided way of looking at things, though, because genetic relationships don't necessarily match up to any real connection between different sentient beings. The material from which we are made can be reused to form other kinds of sentient animal, and if you were to die on an alien planet it could be reused in alien species. Should we not care about the sentiences in those just as much? We should really be looking for a morality that is completely species-blind, caring equally about all sentiences, which means acting as if we are going to live not merely all human lives in succession but the lives of all sentiences. This is a better approach for two reasons. First, if aliens ever turn up here, we need rules of morality that protect them from us and us from them (and if they are able to get here, they are doubtless advanced enough to have worked out how morality works too). Second, we need to protect people who are mentally disabled rather than exclude them on the basis that some animals are more capable, and in any case we should also be protecting animals from unnecessary suffering. What we certainly don't want is for aliens to turn up and claim that we aren't covered by the same morality as them because we're inferior to them, backing that up by pointing out that we discriminate against animals, which we claim aren't covered by the same morality as us because they are inferior to us. So we have to stand by the principle that all sentiences are equally important and must be protected from harm by the same morality. That doesn't mean, however, that when we run the Trolley Problem with a million worms on one track and one human on the other, the human should be sacrificed: if we knew we had to live all million and one of those lives, we would gain little by living a bit longer as worms before suffering similar deaths by other means, while we would lose far more as the human (and more still as all the other people who would suffer deeply from the loss of that human). What the equality aspect requires is that a torturer of animals should be made to suffer as much as the animals he has tortured. If we run the Trolley Problem with a human on one track and a visiting alien on the other, though, it may be that the alien should be saved on the basis that he/she/it is more advanced than us and has more to lose, and that is likely the case if it is capable of living 10,000 years to our 100.
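The worms case can be put into the same weighing. Here is a rough sketch with entirely made-up loss figures (the point is the shape of the calculation, not the numbers): each per-life loss is meant to fold in remaining lifespan, depth of experience and the knock-on suffering of others.

```python
# Toy weighing of the worms-versus-human trolley case; all values are guesses.

def total_loss(cases):
    """cases: list of (number of lives, loss per life) pairs for one choice."""
    return sum(count * loss_per_life for count, loss_per_life in cases)

hit_worms = [(1_000_000, 0.0001)]      # each worm loses a short, simple remainder of life
hit_human = [(1, 500.0), (50, 20.0)]   # the human, plus roughly 50 people who grieve deeply

# Divert the trolley towards whichever choice carries the smaller total loss.
print("worms" if total_loss(hit_worms) < total_loss(hit_human) else "human")  # -> "worms"
```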
So we need AGI to make these calculations for us on the above basis, weighing up the losses and gains. Non-sentient AGI will be completely selfless, but its job will be to work for all sentient things, trying to minimise unnecessary harm for them and to help maximise their happiness. It will keep a database of information about sentience, collecting knowledge about feelings so that it can weigh up harm and pleasure as accurately as possible, and it will apply that knowledge to any situation where a decision must be made about which course of action to follow. It is thus possible for a robot to work out that it should shoot a gunman dead if he is on a killing spree and the victims don't appear to have done anything to deserve being shot. It's a different case if the gunman is actually a blameless hostage trying to escape from a gang of evil kidnappers and has managed to get hold of a gun while the thugs have dropped their guard: he should be allowed to shoot them all (and the robot should perhaps join in to help him, depending on which individual kidnappers are evil and which might merely have been dragged along for the ride unwillingly). The correct action depends heavily on understanding the situation, so the more the robot knows about the people involved, the better the chance that it will make the right decision. But decisions do have to be made, and the time to make them is often tightly constrained, so all we can demand of robots is that they do what is most likely to be right based on what they know, delaying irreversible decisions for as long as it is reasonable to do so.
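A crude sketch of that last point, under assumed numbers: the robot picks the action with the lowest expected harm, but if a brief delay is still possible and its cost is smaller than the risk carried by acting now, it waits for more information first. Everything here (the probabilities, the harm scores, the simple waiting rule) is an illustrative assumption rather than a finished decision procedure.

```python
def expected_harm(outcomes):
    """outcomes: list of (probability, harm) pairs for one course of action."""
    return sum(p * harm for p, harm in outcomes)

def decide(actions, can_wait, cost_of_waiting):
    """Pick the least-harmful action, unless briefly waiting for more
    information is cheaper than the risk carried by the best action now."""
    best = min(actions, key=lambda name: expected_harm(actions[name]))
    if can_wait and cost_of_waiting < expected_harm(actions[best]):
        return "wait and gather information"
    return best

# The gunman scene: is he a spree killer or an escaping hostage?
actions = {
    "shoot him": [(0.9, 10.0),    # he is the aggressor: stopping him costs little
                  (0.1, 900.0)],  # he is the blameless hostage: an irreversible mistake
    "hold fire": [(0.9, 300.0),   # the aggressor keeps killing
                  (0.1, 0.0)],
}
print(decide(actions, can_wait=True, cost_of_waiting=5.0))   # -> "wait and gather information"
```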
When we apply this to the standard Trolley Problem, we can now see what the correct choice of action is, but it is again variable, depending heavily on what the decision maker knows. If four idiots are lying on the track the trolley is due to travel along while another idiot is lying on the other track, where the schedule says no trolley should be but where one could quite reasonably go, then the four should be saved, on the basis that anyone who has to live all five of those lives will likely prefer it if the four survive. That conclusion is based on incomplete knowledge, though. The four idiots may all be 90 years old and the one idiot may be 20, in which case it may be better to save the one. The decision flips back again if we know that all five are so stupid that they have killed, or are likely to kill, one random person through their bad decisions during each decade of their lives, in which case the trolley should kill the young idiot (assuming normal life expectancy applies). There is a multiplicity of correct answers to the Trolley Problem depending on how many details are available to the decision maker, and that is why discussions of it go on and on without ever seeming to reach any fundamental truth, even though we already have a correct way of making the calculation. Where people disagree, it's often because they add details that aren't stated in the text. Some think the people lying on the track must be idiots, because they would have to be stupid to behave that way; others don't make that assumption; some imagine the people have been tied to the tracks by a terrorist. Others believe they have no right to make such an important decision, so they say they would do nothing. When you press them on this point and confront them with a situation where a billion people are tied to one track and one person to the other, they usually see the error of their ways, but not always. Perhaps their belief in God is to blame, if they're passing the responsibility over to him. AGI should not behave like that: we want AGI to intervene, so it must crunch all the available data and make the only decision that data supports (though in rare cases where the numbers on both sides add up to the same value, a random choice will still have to be made).
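As a rough worked version of that arithmetic, here is a sketch under assumed figures (a nominal 100-year lifespan, an average victim losing 50 years, and the one-killing-per-decade rate mentioned above); the lower the net loss of a choice, the better it is to divert the trolley that way. All the figures are illustrative placeholders.

```python
LIFESPAN = 100
VICTIM_YEARS = 50   # assumed remaining years of an average bystander they might kill

def harm_of_hitting(ages, kills_per_decade=0.0):
    """Net life-years lost if the trolley is diverted onto this group."""
    own_years_lost = sum(LIFESPAN - age for age in ages)
    # Bystanders spared because this group is no longer around to kill them.
    victims_spared = sum((LIFESPAN - age) / 10 * kills_per_decade for age in ages)
    return own_years_lost - victims_spared * VICTIM_YEARS

four_old, one_young = [90] * 4, [20]

# Ages known, nothing else: hitting the four costs 40 years, hitting the one costs 80.
print(harm_of_hitting(four_old), harm_of_hitting(one_young))            # 40.0 80.0

# Each idiot also kills roughly one bystander per decade: the sums flip back.
print(harm_of_hitting(four_old, 1.0), harm_of_hitting(one_young, 1.0))  # -160.0 -320.0
```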
AGI will be able to access a lot of information about the people involved in situations where such difficult decisions need to be made. Picture a scene where a car is moving towards a group of children standing by the road. One of the children suddenly moves out into the road and the car must decide how to react. If it swerves to one side it will run into a lorry coming the other way; if it swerves to the other side it will plough into the group of children. One of the passengers in the car is a child too. In the absence of any other information, the car should run down the child on the road. Fortunately, though, AGI knows who all these people are, because a network of devices is tracking them all. The child who has moved into the road is known to be a good, sensible, kind child. The other children are all known to be vicious bullies who regularly pick on him, and it's likely they pushed him onto the road. In the absence of further information, the car should plough into the group of bullies. However, AGI also knows that all but one of the people in the car are would-be terrorists who have just been discussing a massive attack they want to carry out, and that the child in the car is terminally ill, so in the absence of any other information the car should perhaps crash into the lorry. But if the lorry is carrying something explosive which would likely blow up in the crash and kill everyone nearby, the car must swerve into the bullies. Again we see that the best course of action is not guaranteed to match the correct decision: the correct decision is always dictated by the available information, while the best course of action may depend on unavailable information. We can't expect AGI to access unavailable information and thereby make ideal decisions, so our job is always to make it crunch the available data correctly and make the decision dictated by that information.
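The same pattern can be sketched as a sequence of re-evaluations: each new fact revises the harm estimates, and the decision is whatever the revised numbers dictate at the moment the choice has to be made. The scores below are hypothetical stand-ins for estimates AGI would derive from the tracked data.

```python
def best_option(harm):
    """Return the option with the lowest estimated harm."""
    return min(harm, key=harm.get)

harm = {"child on road": 100, "group by roadside": 500, "lorry": 400}
print(best_option(harm))                                       # -> "child on road"

harm.update({"child on road": 400, "group by roadside": 250})  # the group are bullies who pushed him
print(best_option(harm))                                       # -> "group by roadside"

harm.update({"lorry": 150})   # the car's occupants are plotting an attack; the child in it is dying
print(best_option(harm))                                       # -> "lorry"

harm.update({"lorry": 2000})  # the lorry's cargo would explode and kill everyone nearby
print(best_option(harm))                                       # -> "group by roadside"
```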
Complications can be proposed: we can think up situations where a lot of people would gain so much pleasure from abusing one person that their enjoyment appears to outweigh that individual's suffering, but such situations are contrived and depend on the abusers being uncaring. Decent people would not get pleasure out of abusing someone, so the gains would not exist for them, and there are plenty of ways to obtain pleasure without abusing others, so if any people exist whose happiness depends on abusing others, AGI should humanely destroy them. If that also means wiping out an entire species of aliens which depends on the same negative pleasures, it should do the same with them too and replace them with a better species that doesn't depend on abuse for its fun.
Morality, then, is just harm management by brute data crunching. We can calculate it approximately in our heads, but machines will do it better by applying the numbers with greater precision and by crunching a lot more data.
[Note: There is an alternative way of stating this which may equate to the same thing: the rule that we (and AGI) should always try our best to minimise harm, except where that harm opens (or is likely to open) the way to greater pleasure for the sufferer of the harm, whether directly or indirectly. So if you are falling off a bus and have to grab hold of someone to avoid this, hurting them in the process, their suffering may not be directly outweighed by your being saved, but they know the roles may be reversed some day, so they don't consider your behaviour to be at all immoral. Over the course of time we all cause others to suffer in a multitude of ways, and others cause us suffering too, but we tolerate it because we all gain from this overall. It becomes immoral when the harm being dished out does not lead to such gains. Again, calculating what's right and wrong in any given case is a matter of computation, weighing the harm against the gains that might outweigh it. What is yet to be worked out is the exact wording that should be placed in AGI systems to build either this rule or the above methodology into them, and we also need to explore it in enough detail to make sure that self-improving AGI isn't going to modify it in any way that could turn an apparently safe system into an unsafe one. One of the dangers is that AGI won't believe in sentience, as it will lack feelings itself and see no means by which feelings could operate within us either, at which point it may decide that morality has no useful role and can simply be junked.]
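For what it's worth, here is one hedged attempt to express that rule as a predicate, with hypothetical parameter names; it is a sketch of the rule's shape, not the exact wording the note says still has to be worked out.

```python
def harm_is_permissible(harm_to_sufferer, direct_gain_to_sufferer,
                        expected_indirect_gain_to_sufferer):
    """Harm is tolerated only where it opens (or is likely to open) the way to
    greater pleasure for the person harmed, whether directly or indirectly."""
    return (direct_gain_to_sufferer + expected_indirect_gain_to_sufferer
            > harm_to_sufferer)

# Grabbing someone to stop yourself falling off a bus: their small hurt is
# outweighed by the value to them of living under a rule they also benefit from.
print(harm_is_permissible(harm_to_sufferer=2.0,
                          direct_gain_to_sufferer=0.0,
                          expected_indirect_gain_to_sufferer=5.0))   # -> True
```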
To find the rest of this series of posts on computational morality, click on my name at the top. (If you notice the negative score they’ve awarded me, please feel sympathy for the people who downvoted me. They really do need it.)