because otherwise, to an AI, it would be vastly preferable for the whole of Earth to be blown up than for 3^^^3 people to each suffer a mild slap to the face.
It would be utterly disastrous to create an AI which would allow someone to be slapped in the face to avoid a 1/3↑↑↑3 probability of destroying the Earth.
Suppose that I would tentatively choose to torture one person to save a googolplex people from dust specks, and that additionally I would choose torture to save only a googol people from a papercut each. Do I have circular preferences if I would be much, much more willing to save a googolplex people from dust specks by giving papercuts to a googol people than to save either group from specks or papercuts by torturing one person?
I can achieve the exact same total utility by giving specks to a googolplex people, giving papercuts to a googol people, or torturing one person. If I had to save 3^^^3 people from dust specks, I’d give 3^^^3·(googol/googolplex) people papercuts instead of torturing anyone. I’d much prefer saving 3^^^3 people from dust specks by subjecting perhaps 2^^^2 people to a relatively troublesome dust speck. So why exactly do I prefer troublesome dust specks over papercuts over torture even when total utility comes out the same either way? I think I’m probably doing utilitarianism as more of a maximin calculation, maximizing the minimum individual utility in some way. I can’t maximize total utility in cases where additional utility for some people must be bought at the cost of negative utility for others; it takes more of a fair exchange between individuals to increase total utility.
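To spell out the exchange-rate arithmetic in that paragraph, here’s a rough sketch in Python. The conversion rates (one torture ≈ a googol papercuts ≈ a googolplex dust specks) are the ones assumed above; everything else is purely illustrative:

```python
# Work entirely in log10 so the googolplex never has to be materialized.
LOG10_GOOGOL = 100            # googol      = 10^100
LOG10_GOOGOLPLEX = 10**100    # googolplex  = 10^(10^100)

# Assumed exchange rates, taken from the comment above (not canon):
#   torturing 1 person  ~  a googolplex dust specks  ~  a googol papercuts
# so offsetting N dust specks needs N * googol / googolplex papercuts.
def log10_papercuts_needed(log10_n_specks):
    """log10 of the papercut count equivalent to 10^log10_n_specks dust specks."""
    return log10_n_specks + LOG10_GOOGOL - LOG10_GOOGOLPLEX

# For any humanly writable N this is hugely negative (i.e. "less than one papercut");
# only something like N = 3^^^3, whose log10 dwarfs 10^100, pushes it positive.
print(log10_papercuts_needed(1_000_000))   # a number around -10^100
```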
2^^^2 is 4, so I’d choose that in a heartbeat. 2^^^3 is the kind of number you were probably thinking about. Though, if we’re choosing fair-sounding situations, I’d like to cut one of my fingernails too short to generate an MJ/K of negentropy.
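For anyone who wants to check the small cases of Knuth’s up-arrow notation themselves, here’s a quick sketch (a hypothetical helper, not from the thread):

```python
def up_arrow(a, n, b):
    """Knuth's up-arrow a ↑^n b.  n=1 is ordinary exponentiation."""
    if n == 1:
        return a ** b
    if b == 0:
        return 1
    return up_arrow(a, n - 1, up_arrow(a, n, b - 1))

print(up_arrow(2, 3, 2))  # 2↑↑↑2 = 2↑↑2 = 2^2 = 4
print(up_arrow(2, 3, 3))  # 2↑↑↑3 = 2↑↑4 = 2^2^2^2 = 65536
# 3↑↑↑3 = 3↑↑(3↑↑3) is a power tower of 3s roughly 7.6 trillion levels high;
# don't try to evaluate that one.
```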
I’ve got one way of thinking this problem through that seems to fit with what you’re saying – though of course, it has its own flaws: represent each person’s utility (is that the right word in this case?) such that 0 is the maximum possible utility they can have, then map each individual’s utility with x ⟼ -e^(-x), so that a lot of harm to one person is weighted more heavily than tiny harms to many people. This is almost certainly a case of forcing the model to say what we want it to say.
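If it helps to see the transform in action, here’s a minimal sketch. Nothing here comes from the thread except the mapping x ⟼ -e^(-x); the magnitudes for “torture” and “dust speck” are invented for illustration:

```python
import math

def transformed(x):
    """Map raw utility x (0 = best possible, more negative = worse off) to -e^(-x),
    so the penalty grows exponentially as a single individual gets worse off."""
    return -math.exp(-x)

def delta(x):
    """Loss relative to the perfect state x = 0, i.e. transformed(x) - transformed(0)."""
    return -math.expm1(-x)

# Illustrative numbers only -- none of these magnitudes are from the thread:
TORTURE = -300.0        # one person, very badly off
SPECK = -1e-9           # a barely noticeable harm
N_SPECKED = 1e100       # a googol people

print(delta(TORTURE))             # ≈ -1.9e130
print(N_SPECKED * delta(SPECK))   # ≈ -1e91: far smaller in magnitude
# Under this transform (and these made-up numbers) the single torture dominates,
# which is the behaviour the comment above is aiming for.
```

Of course, crank the number of specked people high enough (or make the speck harsher) and the aggregate term wins again, which is presumably part of what “forcing the model to say what we want it to say” is admitting.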
Er… wouldn’t it be vastly preferable for the AI to /not/ slap people in the face to avoid 1/3↑↑↑3-probability events, for reasons beyond the multiplication? Building an AGI that acts on 1/3↑↑↑3 probabilities is to make a god that, to outsiders, comes across as both arbitrarily capricious and overwhelmingly interventionist. Even if the AGI’s interventions come out net-positive on its well-defined utility function, I’d wager modern humans, or even extrapolated transhumans, wouldn’t like being slapped in the face just because they were considering running the next-next-gen version of the LHC in their AGI’s favorite universe. You don’t even need Knuth notation for the problem to show up: a 1/10^100^100 or even 1/10^100 event quickly gets you to the same place.
Even from a practical viewpoint, that seems incredibly prone to oscillation. There’s a reason we don’t set air conditioners to keep local temperatures to five nines, and 1/3↑↑↑3 is, to fall to understatement, a lot of sensitivity past that.
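The air-conditioner point is the usual argument for hysteresis in control; here’s a toy sketch of the deadband idea, with made-up numbers:

```python
def thermostat_step(temp, cooling_on, setpoint=21.0, deadband=0.5):
    """Toy hysteresis controller: only switch state when the temperature leaves
    the deadband, instead of reacting to every microscopic deviation."""
    if temp > setpoint + deadband:
        return True            # start cooling
    if temp < setpoint - deadband:
        return False           # stop cooling
    return cooling_on          # inside the band: leave the state alone
```

Shrink the deadband toward zero (the “five nines” regime) and the unit chatters on every flicker of sensor noise; an agent that updates its behavior on 1/3↑↑↑3 shifts in probability is the same failure mode taken to an absurd extreme.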
This is incoherent because qualitative boundaries are naturally incoherent: why is one 2/10^100 risk worth processing power, but six separate 1/10^100 risks (which sum to three times as much) not worth processing, to give the blunt version of the sorites paradox? That’s a major failing from a philosophical standpoint, where incoherence is functionally incorrect. But AGIs aren’t pure philosophy: there are strong secondary benefits to an underinterventionist AGI, and human neurological bias trends toward underintervention.
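To make the threshold incoherence concrete, here’s a deliberately dumb sketch of the kind of per-risk cutoff being criticized; the cutoff and the probabilities are invented for illustration:

```python
ATTENTION_THRESHOLD = 1.5e-100   # made-up cutoff below which a risk is "ignored"

def worth_processing(p):
    """Naive per-risk rule: only attend to risks above the cutoff."""
    return p >= ATTENTION_THRESHOLD

single_risk = 2e-100
six_risks = [1e-100] * 6

print(worth_processing(single_risk))                 # True
print(any(worth_processing(p) for p in six_risks))   # False: each one alone is below the cutoff
print(sum(six_risks) > single_risk)                  # True: collectively the bigger hazard
```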
Of course, in an AGI situation you have to actually program it, and actually defining how bad slapping one person 50 times is versus slapping 100 people once each is programmatically difficult enough on its own.
Edit: never mind; retracting this as off topic and a misunderstanding of the question.
Violating a coherence theorem always carries with it an appropriate penalty of incoherence. What is your reply to the obvious argument from circular preference?
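For anyone who hasn’t met it, the “obvious argument” usually gets cashed out as a money pump; here’s a toy sketch with invented preferences and fees, not anyone’s actual position:

```python
# Suppose an agent strictly prefers world A over B, B over C, and C over A,
# and will pay a small fee for each "upgrade" to a preferred world.
FEE = 0.01
money, state = 0.0, "A"

# One lap around the cycle: from A pay to reach C (C > A), from C pay to reach B
# (B > C), from B pay to reach A (A > B).
for start, end in [("A", "C"), ("C", "B"), ("B", "A")]:
    assert state == start
    state = end
    money -= FEE

print(state, round(money, 2))   # 'A' -0.03: back where it started, strictly poorer
```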