Ming the Merciless offers you a choice that you cannot refuse. Either (a) his torturer will rip one of your fingernails off, or (b) his torturer will inflict pain more intense than you can imagine, continuously for the next 24 hours, without otherwise harming you. But in case (b) only, his evil genius neuroscientists will cause you to afterwards completely forget the experience, and any other aftereffects from the stress will be put right as well. If you refuse to make a choice, you will get (b) without the amnesia.
What do you choose?
If you choose (a), how much worse would (a) have to be, for you to choose (b)? If you choose (b), how much less bad would (a) have to be, for you to choose (a)?
I choose (b) for instrumental reasons, but would much prefer (a) if I didn’t have to worry so much about preserving my mental equilibrium.
My interpretation of this scenario flip-flops.
Since I will forget the experience (b), I sometimes interpret this question as being equivalent to whether I prefer having something minor happen to me (a) or something more serious happen to someone else (b). Then deciding how bad (a) would need to be before I choose (b) becomes a squirm-worthy ethical question.
Yet in other moments, I realize the choice is not as bad as choosing whether case (a) happens to me or case (b) happens to someone else, because I still need to factor in that the other person will forget the torture after it happens, so it doesn’t stay with them either. I might as well say it’s happening to me. But this feels like rationalization to avoid having my fingernail taken off, which I really don’t want either.
In the end, I’m just confused about it.
I choose (b) without the amnesia. Why? Because fuck Ming, that’s why!
Or more seriously, by refusing to play Ming’s bizarre little game you deny him the utility he gets from watching people agonise about what the best choice is. Turn it up to 11, Ming, you pussy!
Or maybe I already chose (b) and can’t remember...
I choose (b) without hesitation. There is not some counter or accumulator somewhere that is incremented any time someone has a positive experience and decremented every time someone has a negative experience.
EDIT. To answer Kennaway’s second question, there is no way to attenuate (a) to make me prefer it to (b). I’d choose (b) even if the alternative was a dust speck in my eye or a small scratch on my skin because the dust speck and the scratch have a nonzero probability of negatively affecting my vision or my health.
To the best of our knowledge, the universe will run down one day, and all our struggling will come to nothing.
A meta-ethics which says “nothing temporary can matter” means all utilities come to zero in such a universe.
I am in essential agreement with MBlume. It is more likely than not that the space-time continuum we find ourselves in will support life and intelligence for only a finite length of time. But even if that is the case, there might be another compartment of reality beyond our space-time continuum that can support life or intelligence indefinitely. If I affect that other compartment (even if I merely influence someone who influences someone who communicates with the other compartment) then my struggling comes to more than nothing.
If on the other hand, there really is no way for me or my friends to have a permanent effect on reality, then I have no preference for what happens.
People use the word “preference” to mean many things, including:
Felt emotional preference;
Descriptive model of the preferences an outside observer could use to predict one’s actual behavior;
Intellectual framework that has an xml tag “preference”, that accords with some other xml tag “the right thing to do”, and perhaps with what one verbally advocates;
Intellectual framework that a particular verbal portion of oneself, in practice, tries to manipulate the rest of oneself into better following.
I take it you mean “preference” in senses 3 and 4, but not in sense 1 or 2?
Anna, you are incorrect in guessing that my statement of preference is less than extremely useful for an outside observer to predict my actual behavior.
In other words, the part of me that is loyal to the intellectual framework is very good at getting the rest of me to serve the framework.
The rest of this comment consists of more than most readers probably want to know about my unusual way of valuing things.
I am indifferent to impermanent effects. Internal experiences, mine and yours, certainly qualify as impermanent effects. Note though that internal experiences correlate with things I assign high instrumental value to.
OK, so I care only about permanent effects. I still have not said which permanent effects I prefer. Well, I value the ability to predict and control reality. Whose ability to predict and control? I am indifferent about that: what I want to maximize is reality’s ability to predict and control reality: if maximizing my own ability is the best way to achieve that, then that is what I do. If maximizing my friend’s ability or my hostile annoying neighbor’s ability is the best way, then I do that. When do I want it? Well, my discount rate is zero.
That is the most informative 130 words I can write for improving the ability of someone who does not know me to predict the global effects of my actual behavior.
Since I am in a tiny, tiny minority in wanting this, I might choose to ally myself with people with significantly different preferences. And it is probably impossible in the long term to be allies or colleagues or coworkers with a group of people who all roughly share the same preferences without in a real sense adopting those preferences as my own.
But the preferences I just outlined are the criteria I’d use to decide whom to ally with. BTW, the single most informative criterion for predicting whom I might ally with is that the prospective ally’s intrinsic values have a low discount rate.
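As a brief aside for readers unfamiliar with the jargon: the “my discount rate is zero” remark above can be illustrated with a toy calculation. This is my own sketch with made-up numbers, not anything from the comment itself.

```python
# Illustration (made-up numbers): exponential discounting weights a reward
# arriving at time t by gamma**t. A discount rate of zero means gamma = 1,
# so a payoff in the far future counts exactly as much as one today.

def discounted_value(rewards, gamma):
    """Sum rewards, weighting the reward at step t by gamma**t."""
    return sum(r * gamma**t for t, r in enumerate(rewards))

rewards = [0, 0, 0, 10]                 # a payoff arriving three steps from now
print(discounted_value(rewards, 0.9))   # 7.29...: an impatient agent shrinks it
print(discounted_value(rewards, 1.0))   # 10.0: zero discount rate, full weight
```

An agent with gamma strictly below 1 cares less and less about ever-more-distant effects; the commenter’s stated gamma of 1 is what makes “permanent effects” dominate everything else.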
I understand that your stated goal system has effects on your external behavior.
Still, I was trying to understand your claim that “If… there really is no way for me or my friends to have a permanent effect on reality, then I have no preference for what happens” (emphasis mine). Imagine that you were somehow shown a magically 100% sound, 100% persuasive proof that you could not have any permanent effect on reality, and that the entire multiverse would eventually end. In this circumstance, I doubt very much that the concept “Hollerith’s aims” would cease to be predictively useful. Whether you ate breakfast, or sought to end your life, or took up a new trade, or whatever, I suspect that your actions would have a purposive structure unlike the random bouncing about of inanimate systems. If you maintain that you would have no “preferences” under these circumstances (despite a model of “Hollerith’s preferences” being useful to predict your behavior under these circumstances), this suggests you’re using the term “preferences” in an interesting way.
The reason I’m trying to pursue this line of inquiry is that I am not clear what “preference” does and should mean, as any of us discuss ethics and meta-ethics. No doubt you feel some desire to realize goals that are valued by goal system zero, and no doubt you act partially on that desire as well. No doubt you also feel and act partially on other desires or preferences that a particular aspect of you does not endorse. The thing I’m confused about is… well, I don’t know how to say what I’m confused about; I’m confused. But something like:
What goes on, in practice, when a person verbally endorses certain sense (1) and sense (2) preferences and disclaims other sense (1) or sense (2) preferences? What kind of a sense (4) system for manipulating oneself then gets formed -- is it distinguished from other cognitive subsystems by more than the xml tag? What kind of actual psychological consequences does the xml tag “Hollerith’s/Anna’s/whoever’s ‘real preferences’” tend to have?
Which parts am I?
Which part is Hollerith? Where, in practice, does your desire to realize the goals of goal system zero reside? What kind of a cognitive subsystem, I mean—what are the details?
I would rather have longer to think before making high-stakes decisions. If I could, I would rather defer various high-stakes decisions to “what I would want, if I knew more and had time to think it through”. But what kind of more reflective “me” am I (“I”?) trying to defer to, here? What kinds of volition-extrapolation fulfill what kinds of my (or others’) existing preferences? What kinds of volition-extrapolation would fulfill my existing preferences, if the “me” whose existing preferences I was trying to fulfill had time to think more, first?
My confusion is not specific to you, and maybe I shouldn’t have responded to you with it. But your example is particularly interesting in that the preferences you verbally endorse are particularly far from the ordinary, felt, behaviorally enacted preferences that we mostly start out with as humans. And given that distance, it is natural to ask, “Why, and in what sense, should we call these preferences ‘Hollerith’s preferences’/ ‘Hollerith’s ethics’/ ‘the right thing to do’ ”? Psychologically, is “right” just functioning as a floating xml tag of apparent justified-ness?
Imagine that you were somehow shown a magically 100% sound, 100% persuasive proof that you could not have permanent effect on reality, and that the entire multiverse would eventually end.
I agree with you, Anna, that in that case the concept of my aims does not cease to be predictively useful. (Consequently, I take back my “then I have no preferences”.) It is just that I have not devoted any serious brain time to what my aims might be if I knew for sure I cannot have a permanent effect. (Nor does it bother me that I am bad at predicting what I might do if I knew for sure I cannot have a permanent effect.)
Most of the people who say they are loyal to goal system zero seem to have only a superficial commitment to goal system zero. In contrast, Garcia clearly had a very strong deep commitment to goal system zero. Another way of saying what I said above: like Garcia’s, my commitment to goal system zero is strong and deep. But that is probably not helping you.
One of the ways I have approached CEV is to think of the superintelligence as implementing what would have happened if the superintelligence had not come into being—with certain modifications. An example of a modification you and I will agree is desirable: if Joe suffers brain damage the day before the superintelligence comes into being, the superintelligence arranges things the way that Joe would have arranged them if he had not suffered the brain damage. The superintelligence might learn that by, e.g., reading what Joe posted on the internet before his injury. In summary, one line of investigation that seems worthwhile to me is to get away from this slippery concept of preference or volition and think instead about what the superintelligence predicts would have happened if it did not act. Note that, e.g., the human sense of right and wrong is predicted by any competent agent to have huge effects on what will happen.
My adoption of goal system zero in 1992 helped me to resolve an emotional problem of mine. I severely doubt it would help your professional goals and concerns for me to describe that, though.
Would you go into why you only care about permanent effects? It seems highly bizarre to me (especially since, as Eliezer has pointed out, everything that happens is permanent insofar as it occupies volume in 4d spacetime).
A system of valuing things is a definition. I have defined a system and said, “Oh, by the way, this system has my loyalty.”
It is possible that the system is ill-defined, that is, that my definition contradicts itself, does not apply to the reality we find ourselves in, or differs in some significant way from what I think it means. But your appeal to general relativity does not show the ill-definedness of my system, because it is possible to pick the time dimension out of spacetime: the time dimension is treated quite specially in general relativity.
Eliezer’s response to my definition appeals not to general relativity but rather to Julian Barbour’s timeless physics and Eliezer’s refinements and additions to it, but his response does not establish the ill-definedness of my system any more than your argument does. If anyone wants the URLs of Eliezer’s comments (on Overcoming Bias) that respond to my definition, write me and say a few words about why it is important to you that I make this minor effort.
If Eliezer has a non-flimsy argument that my definition contradicts itself, does not apply to the reality we find ourselves in, or differs significantly from what I think it means, he has not shared it with me.
When I am being careful, I use Judea Pearl’s language of causality in my definition rather than the concept of time. The reason I used the concept of time in yesterday’s description is succinctness: “I am indifferent to impermanent effects” is shorter than “I care only about terminal effects where a terminal effect is defined as an effect that is not itself a cause” plus sufficient explanation of Judea Pearl’s framework to avoid the most common ways in which those words would be misunderstood.
So if I had to, I could use Judea Pearl’s language of causality to remove the reliance of my definition on the concept of time. But again, nothing you or Eliezer has written requires me to retreat from my use of the concept of time.
So there is my response to the parts of your comment that can be interpreted as implying that my system is ill-defined.
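The “effect that is not itself a cause” definition a few paragraphs up has a concrete graph-theoretic reading: if a causal model is drawn as a directed graph, terminal effects are the sink nodes. Here is a minimal sketch of that reading (my own illustration, with a hypothetical toy graph; it is not taken from the comments):

```python
# Minimal sketch: model a causal structure as a directed graph where an
# edge u -> v means "u is a cause of v". A terminal effect, in the sense
# defined above, is a node with no outgoing edges: an effect that is not
# itself a cause (a sink in the DAG).

def terminal_effects(causal_graph):
    """Return the nodes of a causal graph that cause nothing further."""
    effects = {v for targets in causal_graph.values() for v in targets}
    nodes = set(causal_graph) | effects
    return {n for n in nodes if not causal_graph.get(n)}

# Hypothetical toy graph, purely for illustration:
graph = {
    "action": ["experience", "artifact"],
    "experience": ["memory"],
    "memory": [],    # the memory fades and causes nothing further
    "artifact": [],  # a lasting trace, also a sink in this toy model
}

print(sorted(terminal_effects(graph)))  # ['artifact', 'memory']
```

Whether any real-world node is genuinely terminal is of course exactly the empirical question the commenters are arguing about; the code only pins down the definition.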
But what you were probably after when you asked, “Would you go into why you only care about permanent effects?” is why I am loyal to this system I have defined—or more to the point, why you should give it any of your loyalty. Well, I used to try to persuade people to become loyal to the system, but that had negative effects, including the effect of causing me to tend to hijack conversations on Overcoming Bias, so now I try only to explain and inform. I no longer try to promote or persuade.
My main advice to you, dclayh, is to chalk this up to the fact that the internet gives a voice to people whose values are very different from yours. For example, you will probably find the values implied by the Voluntary Human Extinction Movement or by anti-natalism just as unconventional as my values. Peace, dclayh.
I wasn’t trying to claim that your stated goal system has no effect on your observable behavior, only that it doesn’t determine that behavior completely. That is, I would be very surprised if, after you were shown a magically completely certain proof that it is impossible for you to have a permanent effect on reality, concepts like “Hollerith’s goals” completely ceased to be useful in predicting, say, whether you would eat breakfast.
Ming’s minions burst in and abduct you to the planet Ming. “So!” smiles Ming the Merciless in his merciless way, “My astronomers and physicists, who have had thousands of years to improve their sciences beyond your primitive level, assure me that all this will pass, yes, even I myself! One day it will be as if none had ever lived! Just rocks and dead stars, and insufficient complexity to ever again assemble creatures such as us, though it last a Graham’s number of years!”
“Tell me, knowing this—and I am as known for my honesty as for my evil, for see! I have not executed my scientists for telling me an unwelcome truth—are you truly indifferent as to whether I let you go, or hand you over to my torturers? Does this touch of the branding iron mean nothing?”
I feel like pointing out that Graham’s number is big enough that if the universe lasted that long, it would effectively visit every state it possibly can, unless the universe is fucking huge.
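For context on just how large that is: Graham’s number is built from Knuth’s up-arrow notation, where each extra arrow iterates the previous operation. A sketch of the standard recursive definition (only tiny arguments are computable; anything beyond them overflows any machine):

```python
# Knuth's up-arrow notation (standard recursive definition). Graham's
# number is g_64 in the sequence g_1 = 3 ↑↑↑↑ 3, g_{n+1} = 3 ↑^{g_n} 3,
# far beyond anything computable; only toy inputs run here.

def up_arrow(a, n, b):
    """Compute a ↑^n b for small inputs."""
    if n == 1:
        return a ** b          # one arrow is ordinary exponentiation
    if b == 0:
        return 1               # base case of the hyperoperation tower
    return up_arrow(a, n - 1, up_arrow(a, n, b - 1))

print(up_arrow(3, 1, 3))  # 27, i.e. 3^3
print(up_arrow(3, 2, 3))  # 7625597484987, i.e. 3^(3^3)
```

Already 3 ↑↑↑ 3 is a power tower of 3s about 7.6 trillion levels high, which is why the commenter’s “visits every state it possibly can” intuition is plausible for any fixed-size universe.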
I am not completely indifferent to being tortured, so in your hypothetical, Kennaway, I will try to get Ming to let me go because in your hypothetical I know I cannot have a permanent effect on reality.
But when faced with a choice between having a positive permanent effect on reality and avoiding being tortured I’ll always choose having the permanent effect if I can.
Almost everybody gives in under torture. Almost everyone will eventually tell an interrogator skilled in torture everything they know, e.g., the passphrase to the rebel mainframe. Since I have no reason to believe I am any different in that regard, there are limits to my ability to choose the way I said. But for most practical purposes, I can and will choose the way I said. In particular, I think I can calmly choose being tortured over losing my ability to have a permanent effect on reality: it is just that once the torture actually starts, I will probably lose my resolve.
I am worried, Kennaway, that our conversation about my way of valuing things will distract you from what I wrote below about the risk of post-traumatic stress disorder from a surgical procedure. Your scenario is less than ideal for exploring what intrinsic value people assign to internal experience: it would be better to present people with a choice between being killed painlessly and being killed after 24 hours of intense pain, and then to ask what benefit to their best friend or to humanity would induce them to choose the intense pain.
Basically the amount you’d expect: if medical technology allowed me to regrow the finger with full use of it, that would mitigate it considerably. With current technology, it gets me a lifetime of inconvenience.
At the level of loss of a fingernail, I think the answer is a no-brainer: (a) gets me considerably less total pain, with both options protracted over a period of time that makes time preference mostly inconsequential (24 hours, and maybe a couple weeks); getting a fingernail ripped off isn’t a bad enough experience to have any significant long-term impact on my mental state.
I have no frame of reference for computing the relative disutility (i.e. level of pain integrated over time) of these. Intuitively I feel like (b) is the better option, but not by much.
I choose a) because I see little reason to trust that his evil genius neuroscientists will do less damage to my brain than his torturer will do to my finger (I’ve had a toenail pulled out so I have a pretty good reason to think I will survive the torture largely intact, even though it will hurt like hell). To choose b) I’d need a good reason to trust his neuroscientists.
Ming enjoys setting these conundrums, and therefore cultivates not merely the reputation, but also the actuality, of being an Evil Emperor of his word. He will not even engage in lawyering or offer devil’s bargains. It’s no fun if his prisoners just shrug and say, “Why should I believe you? Do your worst, you will anyway.”
Kennaway’s reason for asking the questions is probably to get at how much people prefer to avoid negative internal experiences relative to negative effects on external reality, which parenthetically is the main theme of my blog on the ethics of superintelligence. If so, then he wants you to assume that you can trust Ming 100% to do what he says. He also wants you to assume that Ming’s evil geniuses can somehow compensate you for the fact that you could have done something else with the 24 hours during which you were experiencing the unimaginably intense pain, e.g., by using a (probably impossible in reality) time machine to roll back the clock by 24 hours.
I am not a fan of what Daniel Dennett calls Intuition Pumps—thought experiments in philosophy that ask people to imagine a scenario and then draw a conclusion when the scenario requires a leap of imagination that few people are capable of. The Chinese Room thought experiment is a classic example.
I don’t necessarily think the original question was driving at a particular answer but I’m just getting a little sick of this style of thinking on Less Wrong. I think it is sloppy and not very rational. I’d place any discussions involving Omega, most of the posed utilitarian moral dilemmas (specks vs. torture) and a number of other examples commonly discussed in the same category.
I should probably have composed a post explaining that rather than trying to make my point by making a dumb answer to the question though.
I am not completely surprised to learn that your not getting the point was intentional, Newport, because your comments are usually good.
Do you consider it a “leap of imagination that few are capable of” to ask people here to indicate how much they value internal experience compared to how much they value external reality?
Do you consider it a “leap of imagination that few are capable of” to ask people here to indicate how much they value internal experience compared to how much they value external reality?
No, but if that’s the question the original poster was interested in asking then I don’t see any value in posing it in the form of an elaborate thought experiment rather than just directly asking the question, or asking about a more plausible scenario that raises similar issues.
I have similar misgivings about Hardened Problems, but thought this one worth posing anyway. But here are two actual experiences I have had, that raise the same issue of how to assess experiences that leave no trace.
I was in hospital for surgery, to be carried out under general anaesthetic. Some time before the procedure was to happen, a nurse came with the pre-med tranquiliser, which seemed to have absolutely no effect. Eventually, the time came when my bed, with me on it, was wheeled out of the ward, down a corridor, into a lift, and then, bam! I woke up in the ward after it was all over. I was perfectly compos mentis right up to the time when my memories stopped. I don’t believe I passed out. More likely, this was retrograde amnesia for things I was fully aware of at the time.
Maybe I was conscious all the way through the operation? If you ever need surgery, maybe you will be fully conscious, but paralysed as the surgeons cut you open and rummage about in your interior, but you will forget all about it afterwards. Maybe the tales of people waking up during surgery are to be explained, not as a failure to render the patient unconscious, but a failure to erase the memory of it.
Am I scaring anyone?
On another occasion I was in hospital for an examination of a somewhat uncomfortable and invasive nature—I shall tastefully omit all detail—to be carried out under a sedative. Same thing: one moment, watching the doctor’s preparations and the machines that go ping, the next, waking up in the recovery room. But this time, I was told afterwards that I had been “somewhat uncooperative” during the procedure. So I know that I was awake, and having experienced on another occasion the same procedure without any memory loss, I have a pretty good idea of what I must have experienced but have no memory of.
Next time (there will be a next time), should I be apprehensive that I will experience pain and discomfort, or only that I may remember it?
The following conclusions come from a book on post-traumatic stress disorder (PTSD) called Waking the Tiger by Peter Levine, who treats PTSD for a living. I have a copy of this book, which I hereby offer to loan to Richard Kennaway if I do not have to pay to get it to him and to get it back from him.
Surgical procedures are in the opinion of Peter Levine a huge cause of PTSD.
According to Levine, PTSD is caused by subtle damage to the brain stem. Since in contrast episodic memory seems to have very little to do with the brain stem, the fact that one has no episodic memories of a surgical procedure does not mean that one was not traumatized by the procedure.
Since it is impossible in our society for doctors and nurses to ignore the fact that someone has died, you can sometimes rely on them not to kill you unnecessarily. But for anything as subtle as PTSD, with as much false information floating about as there is about PTSD, you can pretty much count on it that whenever they cause a case of PTSD, they will remain serenely unaware of that fact, and consequently they will not take even the simplest and most straightforward measures to avoid traumatizing a patient. This sentiment (that medical professionals regularly do harms they are unaware of) is not in Levine’s book AFAICR, but it is pretty common among rationalists who have extensive experience with the health-care system.
Most cases of traumatization caused by surgical procedures probably occur despite the use of general or local anesthesia.
In conclusion, if I had to undergo a surgical procedure, I’d gather more information of the type I have been sharing here. But if that were not possible, I would treat the possibility of being traumatized by a surgical procedure requiring general anesthetic as having a greater expected negative effect on my health, intelligence and creativity than losing a fingernail would have. (It is more likely than not to turn out less bad than losing a fingernail, but the worst possible consequences are significantly worse than the worst possible consequences of losing the fingernail. In other words, I would tend to choose the loss of a fingernail because the uncertainty, and consequently the probability of getting a really bad outcome, is much less.)
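The risk comparison in that last paragraph can be put in simple expected-value terms. The numbers below are entirely made up for illustration; only the shape of the argument (a certain moderate harm versus a usually-milder option with a far worse tail) comes from the comment:

```python
# Made-up numbers illustrating the reasoning above: the surgery option
# usually turns out less bad than the fingernail option, but its worst
# case is much worse, so an agent who weighs tail risk (not just the
# typical outcome) can still prefer the certain fingernail loss.
fingernail = [(1.0, -10)]              # certain, moderate harm
surgery    = [(0.9, -2), (0.1, -200)]  # usually mild, rarely severe (PTSD)

def expected(outcomes):
    """Expected value of a list of (probability, value) pairs."""
    return sum(p * v for p, v in outcomes)

print(expected(fingernail))           # -10.0
print(expected(surgery))              # -21.8
print(min(v for _, v in fingernail))  # -10: worst case
print(min(v for _, v in surgery))     # -200: far worse worst case
```

With these particular numbers the surgery is worse in expectation too, matching the commenter’s “greater expected negative effect”; but even if the means were equal, the asymmetric worst cases alone would support the same choice.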
I agree with you, Anna, that in that case the concept of my aims does not cease to be predictively useful. (Consequently, I take back my “then I have no preferences”.) It is just that I have not devoted any serious brain time to what my aims might be if I knew for sure I cannot have a permanent effect. (Nor does it bother me that I am bad at predicting what I might do if I knew for sure I cannot have a permanent effect.)
Most of the people who say they are loyal to goal system zero seem to have only a superficial commitment to goal system zero. In contrast, Garcia clearly had a very strong deep commitment to goal system zero. Another way of saying what I said above: like Garcia’s, my commitment to goal system zero is strong and deep. But that is probably not helping you.
One of the ways I have approached CEV is to think of the superintelligence as implementing what would have happened if the superintelligence had not come into being—with certain modifications. An example of a modification you and I will agree is desirable: if Joe suffers brain damage the day before the superintelligence comes into being, the superintelligence arranges things the way that Joe would have arranged them if he had not suffered the brain damage. The superintelligence might learn that by e.g. reading what Joe posted on the internet before his injury. In summary, one line of investigation that seems worthwhile to me is to get away from this slippery concept of preference or volition and think instead of what the superintelligence predicts would have happened if the superintelligence does not act. Note that e.g. the human sense of right and wrong is predicted by any competent agent to have huge effects on what will happen.
My adoption of goal system zero in 1992 helped me to resolve an emotional problem of mine. I severely doubt it would help your professional goals and concerns for me to describe that, though.
Would you go into why you only care about permanent effects? It seems highly bizarre to me (especially since, as Eliezer has pointed out, everything that happens is permanent insofar as it occupies volume in 4d spacetime).
A system of valuing things is a definition. I have defined a system and said, “Oh, by the way, this system has my loyalty.”
It is possible that the system is ill-defined, that is, that my definition contradicts itself, does not apply to the reality we find ourselves in, or differs in some significant way from what I think it means. But your appeal to general relativity does not show the ill-definedness of my system because it is possible to pick the time dimension out of spacetime: the time dimension is treated quite specially in general relativity.
Eliezer’s response to my definition appeals not to general relativity but rather to Julian Barbour’s timeless physics and Eliezer’s refinements and additions to it, but his response does not establish the ill-definedness of my system any more than your argument does. If anyone wants the URLs of Eliezer’s comments (on Overcoming Bias) that respond to my definition, write me and say a few words about why it is important to you that I make this minor effort.
If Eliezer has a non-flimsy argument that my definition contradicts itself, does not apply to the reality we find ourselves in, or differs significantly from what I think it means, he has not shared it with me.
When I am being careful, I use Judea Pearl’s language of causality in my definition rather than the concept of time. The reason I used the concept of time in yesterday’s description is succinctness: “I am indifferent to impermanent effects” is shorter than “I care only about terminal effects where a terminal effect is defined as an effect that is not itself a cause” plus sufficient explanation of Judea Pearl’s framework to avoid the most common ways in which those words would be misunderstood.
So if I had to, I could use Judea Pearl’s language of causality to remove the reliance of my definition on the concept of time. But again, nothing you or Eliezer has written requires me to retreat from my use of the concept of time.
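To make the causal-language version of the definition concrete, here is a minimal sketch of my own (not from Hollerith’s comments, and the graph is a made-up example): if causal influence is modeled as a directed graph, a “terminal effect”—an effect that is not itself a cause—is simply a node with no outgoing edges.

```python
def terminal_effects(edges):
    """Given (cause, effect) pairs, return the nodes that appear as
    effects but never as causes, i.e. the 'terminal effects' in the
    sense of 'an effect that is not itself a cause'."""
    causes = {c for c, _ in edges}
    effects = {e for _, e in edges}
    return effects - causes

# Hypothetical causal graph: an action leaves a memory, which shapes
# later behavior, and also leaves a scar.
edges = [("action", "memory"), ("memory", "behavior"), ("action", "scar")]
assert terminal_effects(edges) == {"behavior", "scar"}
```

Under this toy reading, “I am indifferent to impermanent effects” means only the sink nodes of the causal graph carry intrinsic value; intermediate nodes matter solely through what they cause.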
So there is my response to the parts of your comment that can be interpreted as implying that my system is ill-defined.
But what you were probably after when you asked, “Would you go into why you only care about permanent effects?” is why I am loyal to this system I have defined—or more to the point why you should give it any of your loyalty. Well, I used to try to persuade people to become loyal to the system, but that had negative effects, including the effect of causing me to tend to hijack conversations on Overcoming Bias, so now I try only to explain and inform. I no longer try to promote or persuade.
My main advice to you, dclayh, is to chalk this up to the fact that the internet gives a voice to people whose values are very different from yours. For example, you will probably find the values implied by the Voluntary Human Extinction Movement or by anti-natalism just as unconventional as my values. Peace, dclayh.
I wasn’t trying to claim that your stated goal system had no effect on your observable behavior, only that it doesn’t have a complete effect. That is, that I would be very surprised if, after you were shown a magically completely certain proof that it is impossible for you to have permanent effect on reality, concepts like “Hollerith’s goals” completely ceased to be useful in predicting, say, whether you would eat breakfast.
*jangling chord*
Ming’s minions burst in and abduct you to the planet Ming. “So!” smiles Ming the Merciless in his merciless way, “My astronomers and physicists, who have had thousands of years to improve their sciences beyond your primitive level, assure me that all this will pass, yes, even I myself! One day it will be as if none had ever lived! Just rocks and dead stars, and insufficient complexity to ever again assemble creatures such as us, though it last a Graham’s number of years!”
“Tell me, knowing this—and I am as known for my honesty as for my evil, for see! I have not executed my scientists for telling me an unwelcome truth—are you truly indifferent as to whether I let you go, or hand you over to my torturers? Does this touch of the branding iron mean nothing?”
*sizzle*
I feel like pointing out that Graham’s number is big enough that if the universe lasted that long, it would effectively visit every state it possibly can, unless the universe is fucking huge.
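For scale, here is a rough back-of-envelope sketch behind that claim (my addition, and only heuristic: the entropy figure is the usual order-of-magnitude holographic estimate for the observable universe):

```latex
% Assuming an entropy bound of order S ~ 10^{122} (in units of k_B),
% the number of distinguishable states is at most
N_{\text{states}} \;\sim\; e^{S} \;\approx\; 10^{10^{122}},
% so a naive recurrence time is "only" an exponential tower of height three,
T_{\text{recur}} \;\sim\; 10^{10^{122}} \ \text{years}.
% Graham's number exceeds any exponential tower of writable height;
% already its first layer, g_1 = 3\uparrow\uparrow\uparrow\uparrow 3, does.
```

So unless the universe really is vastly larger than the observable part, a Graham’s-number lifespan leaves time for every accessible state to recur.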
I am not completely indifferent to being tortured, so in your hypothetical, Kennaway, I will try to get Ming to let me go because in your hypothetical I know I cannot have a permanent effect on reality.
But when faced with a choice between having a positive permanent effect on reality and avoiding being tortured I’ll always choose having the permanent effect if I can.
Almost everybody gives in under torture. Almost everyone will eventually tell an interrogator skilled in torture everything they know, e.g., the passphrase to the rebel mainframe. Since I have no reason to believe I am any different in that regard, there are limits to my ability to choose the way I said. But for most practical purposes, I can and will choose the way I said. In particular, I think I can calmly choose being tortured over losing my ability to have a permanent effect on reality: it is just that once the torture actually starts, I will probably lose my resolve.
I am worried, Kennaway, that our conversation about my way of valuing things will distract you from what I wrote below about the risk of post-traumatic stress disorder from a surgical procedure. Your scenario is less than ideal for exploring what intrinsic value people assign to internal experience: it is better to present people with a choice between being killed painlessly and being killed after 24 hours of intense pain, and then to ask what benefit to their best friend or to humanity would induce them to choose the intense pain.
I’d go with (a) for values of (a) smaller than loss of a finger.
how much does the level of medical technology in the society in which you live matter?
Basically the amount you’d expect: if medical technology allowed me to regrow the finger with full use of it, that would mitigate it considerably. With current technology, it gets me a lifetime of inconvenience.
At the level of loss of a fingernail, I think the answer is a no-brainer: (a) gets me considerably less total pain, with both options protracted over a period of time that makes time preference mostly inconsequential (24 hours, and maybe a couple weeks); getting a fingernail ripped off isn’t a bad enough experience to have any significant long-term impact on my mental state.
I have no frame of reference for computing the relative disutility (i.e. level of pain integrated over time) of these. Intuitively I feel like (b) is the better option, but not by much.
I choose a) because I see little reason to trust that his evil genius neuroscientists will do less damage to my brain than his torturer will do to my finger (I’ve had a toenail pulled out so I have a pretty good reason to think I will survive the torture largely intact, even though it will hurt like hell). To choose b) I’d need a good reason to trust his neuroscientists.
Ming enjoys setting these conundrums, and therefore cultivates not merely the reputation, but also the actuality, of being an Evil Emperor of his word. He will not even engage in lawyering or offer devil’s bargains. It’s no fun if his prisoners just shrug and say, “Why should I believe you? Do your worst, you will anyway.”
Kennaway’s reason for asking the questions is probably to get at how much people prefer to avoid negative internal experiences relative to negative effects on external reality, which parenthetically is the main theme of my blog on the ethics of superintelligence. If so, then he wants you to assume that you can trust Ming 100% to do what he says—and he also wants you to assume that Ming’s evil geniuses can somehow compensate you for the fact that you could have done something else with the 24 hours during which you were experiencing the unimaginably intense pain, e.g., by using a (probably impossible in reality) time machine to roll back the clock by 24 hours.
Yes, I assumed that something like that was the reason for posing the question. My answer deliberately ‘missed the point’ for the kinds of reasons mentioned in Hardened Problems Make Brittle Models and No Universal Probability Space.
I am not a fan of what Daniel Dennett calls Intuition Pumps—thought experiments in philosophy that ask people to imagine a scenario and then draw a conclusion when the scenario requires a leap of imagination that few people are capable of. The Chinese Room thought experiment is a classic example.
I don’t necessarily think the original question was driving at a particular answer but I’m just getting a little sick of this style of thinking on Less Wrong. I think it is sloppy and not very rational. I’d place any discussions involving Omega, most of the posed utilitarian moral dilemmas (specks vs. torture) and a number of other examples commonly discussed in the same category.
I should probably have composed a post explaining that rather than trying to make my point by making a dumb answer to the question though.
I am not completely surprised to learn that your not getting the point was intentional, Newport, because your comments are usually good.
Do you consider it a “leap of imagination that few are capable of” to ask people here to indicate how much they value internal experience compared to how much they value external reality?
No, but if that’s the question the original poster was interested in asking then I don’t see any value in posing it in the form of an elaborate thought experiment rather than just directly asking the question, or asking about a more plausible scenario that raises similar issues.
I have similar misgivings about Hardened Problems, but thought this one worth posing anyway. But here are two actual experiences I have had, that raise the same issue of how to assess experiences that leave no trace.
I was in hospital for surgery, to be carried out under general anaesthetic. Some time before the procedure was to happen, a nurse came with the pre-med tranquiliser, which seemed to have absolutely no effect. Eventually, the time came when my bed, with me on it, was wheeled out of the ward, down a corridor, into a lift, and—bam!—I woke up in the ward after it was all over. I was perfectly compos mentis right up to the time when my memories stopped. I don’t believe I passed out. More likely, this was retrograde amnesia for things I was fully aware of at the time.
Maybe I was conscious all the way through the operation? If you ever need surgery, maybe you will be fully conscious but paralysed as the surgeons cut you open and rummage about in your interior, and you will forget all about it afterwards. Maybe the tales of people waking up during surgery are to be explained, not as a failure to render the patient unconscious, but as a failure to erase the memory of it.
Am I scaring anyone?
On another occasion I was in hospital for an examination of a somewhat uncomfortable and invasive nature—I shall tastefully omit all detail—to be carried out under a sedative. Same thing: one moment, watching the doctor’s preparations and the machines that go ping, the next, waking up in the recovery room. But this time, I was told afterwards that I had been “somewhat uncooperative” during the procedure. So I know that I was awake, and having experienced on another occasion the same procedure without any memory loss, I have a pretty good idea of what I must have experienced but have no memory of.
Next time (there will be a next time), should I be apprehensive that I will experience pain and discomfort, or only that I may remember it?
The following conclusions come from a book on post-traumatic stress disorder (PTSD) called Waking the Tiger by Peter Levine, who treats PTSD for a living. I have a copy of this book, which I hereby offer to loan to Richard Kennaway if I do not have to pay to get it to him and to get it back from him.
Surgical procedures are in the opinion of Peter Levine a huge cause of PTSD.
According to Levine, PTSD is caused by subtle damage to the brain stem. Since in contrast episodic memory seems to have very little to do with the brain stem, the fact that one has no episodic memories of a surgical procedure does not mean that one was not traumatized by the procedure.
Since it is impossible in our society for doctors and nurses and such to ignore the fact that someone has died, you can sometimes rely on them not to kill you unnecessarily. But for anything as subtle as PTSD, with as much false information floating about as there is about PTSD, you can pretty much count on it that whenever they cause a case of PTSD, they will remain serenely unaware of that fact, and consequently they will not take even the simplest and most straightforward measures to avoid traumatizing a patient. This sentiment (that medical professionals regularly do harms they are unaware of) is not in Levine’s book AFAICR, but it is pretty common among rationalists who have extensive experience with the health-care system.
Most cases of traumatization caused by surgical procedures probably occur despite the use of general or local anesthesia.
In conclusion, if I had to undergo a surgical procedure, I’d gather more information of the type I have been sharing here, but if that were not possible, I would treat the possibility of being traumatized by a surgical procedure requiring the use of general anesthetic as having a greater expected negative effect on my health, intelligence and creativity than losing a fingernail would have. (It is more likely than not to turn out less bad than losing a fingernail, but the worst possible consequences are significantly worse than the worst possible consequences of losing the fingernail. In other words, I would tend to choose the loss of a fingernail because the uncertainty, and consequently the probability of a really bad outcome, is much smaller.)
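The reasoning above—a certain small harm versus a gamble that is usually milder but has a much worse tail—can be cashed out as a toy expected-value comparison. The numbers below are entirely my invention, chosen only to illustrate the shape of the argument:

```python
# Hypothetical (value, probability) outcomes, in arbitrary disutility units.
fingernail = [(-1.0, 1.0)]                 # certain small loss
surgery = [(-0.2, 0.9), (-20.0, 0.1)]      # usually mild, rarely severe

def expected(outcomes):
    """Probability-weighted sum of outcome values."""
    return sum(value * prob for value, prob in outcomes)

# With these made-up numbers the gamble has the worse expected value
# (about -2.18 vs -1.0), which is one way to cash out "greater expected
# negative effect" despite the gamble usually turning out less bad.
assert expected(surgery) < expected(fingernail)
```

Risk aversion sharpens the same conclusion: even at equal expected values, the fat left tail of the surgery option argues for the certain small loss.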
Contact Richard Hollerith.