I agree fully with your first two paragraphs. I would not change my answer regardless of how long the causally disconnected person lasts. Biting this bullet leads to some quite extreme conclusions, basically admitting that current human values cannot be consistently transferred to a future with uploads, self-modification and such. (Meaning, Eliezer’s whole research program is futile.) I am not happy about these conclusions, but they do not change my respect for human values, regardless of my opinion about their fundamental inconsistencies.
I believe even AlephNeil’s position is quite extreme among LWers, and mine is definitely fringe. So if someone here agrees with either of us, I am very interested in that information.
Couldn’t an AI prevent us from ever achieving uploads or self-modification? Wouldn’t this be a good thing for humanity if human values could not survive in a future with those things?
Yes, this is a possible end point of my line of reasoning: we either have to become luddites, or build an FAI that prevents us from uploading. These are both very repulsive conclusions for me. (And that is even before considering that I am not confident enough in my own judgement to justify such extreme solutions.) I, personally, would rather accept that much of my values will not survive.
My value system works okay right now, at least when I don’t have to solve trolley problems. In any given world with uploading and self-modification, my value system would necessarily fail. In such a world, my current self would not feel at home. My visit there would be a series of unbelievably nasty trolley problems, a big reductio ad absurdum of my values. Luckily, it is not me who has to feel at home there, but the inhabitants of that world. (*)
(*) Even the word “inhabitants” is misleading, because I don’t think personal identity has much of a role in a world where it is possible to merge minds. Not to mention the word “feel”, which, from the perspective of a substrate-independent self-modifying mind, refers to a particular suboptimal self-reflection mechanism. Which, to clear up a possible misunderstanding in advance, does not mean that such a substrate-independent mind cannot possibly treat positive feelings as a terminal value. But I am already quite off-topic here.
If there is something that you care about more than your values, they are not really your values.
I think we should just get on with FAI. If it realizes that uploads are okay according to our values, it will allow uploads; if uploads are bad, it will forbid them (maybe not forbid them entirely; there could easily be something even worse). This is one of the questions that can be left entirely until after we have FAI, because whatever it does will, by definition, be in accordance with our values.
You seem to conflate “I will care about X” with “X will occur”. This breaks down in, for example, any case where precommitment is useful.
That’s not how I interpreted his statement. Look at the original context.
He seems to be saying that, if his values lead to luddism, he would rather give up on his values than embrace luddism. I think that this is not the correct use of the word ‘values’.
You seem to rely on a hidden assumption here: that I am equally confident in all my values.
I don’t think my values are consistent. Having more powerful deductive reasoning and constant access to extreme corner cases would obviously change my value system. I also anticipate that my values would not all be changed equally: some of them would survive the encounter with extreme corner cases, and some would not. Right now I don’t have to constantly deal with perfect clones and merging minds, so I am fine with my values as they are. But even now, I have quite a good intuition about which of them would not survive the future shock. That’s why I can talk without contradiction about accepting the loss of those.
In CEV jargon: my expectation is that the extrapolation of my value system might not be recognizable to me as my value system. Wei_Dai voiced some related concerns with CEV here. It is worth looking at the first link in his comment.
Oh, I see. I appear to have initially missed the phrase ‘much of my values’.
I am wary of referring to my current inconsistent values, rather than their reflective equilibrium, as ‘my values’ because of the principle of explosion; but then I am unsure how to make sense of my current self having values at all.
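(For concreteness, this is the principle of explosion being appealed to: from a pair of contradictory premises, any conclusion at all follows. A two-line Lean sketch, with stand-in propositions that are not from this thread:)

```lean
-- Ex falso quodlibet: from the contradictory pair `Good` and `¬Good`,
-- any proposition `Bad` whatsoever follows.
example (Good Bad : Prop) (h : Good ∧ ¬Good) : Bad :=
  absurd h.left h.right
```

(This is why an inconsistent value system can be made to endorse and condemn the very same transition, a point that comes up again further down.)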
It seems our positions can be summed up like this: You are wary of referring to your current values rather than their reflective equilibrium as ‘your values’, because your current values are inconsistent. I am wary of referring to the reflective equilibrium rather than my current values as ‘my values’, because I expect the transition to reflective equilibrium to be a very aggressive operation. (One could say that I embrace my ignorance.)
My concern is that the reflective equilibrium is far from my current position in the dynamical system of values. Meanwhile, Marcello and Wei Dai are concerned that the dynamical system is chaotic and has multiple reflective equilibria.
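(A toy picture of those two worries, using an update rule invented purely for illustration and not anyone’s actual proposal: if reflection is modelled as repeatedly applying an update to a one-dimensional value, a rule with two stable fixed points makes the final equilibrium depend on the starting point.)

```python
# Toy illustration only: a made-up "reflection step" with two stable fixed
# points (0 and 1) separated by an unstable one (0.5). Where the process
# settles, i.e. which "reflective equilibrium" is reached, depends on where
# it starts.

def reflect_once(x: float, strength: float = 2.0) -> float:
    """One hypothetical reflection step: nudges x toward 0 or toward 1."""
    return x + strength * x * (1.0 - x) * (x - 0.5)

def reflective_equilibrium(x0: float, steps: int = 200) -> float:
    x = x0
    for _ in range(steps):
        x = reflect_once(x)
    return x

print(reflective_equilibrium(0.49))  # settles near 0.0
print(reflective_equilibrium(0.51))  # settles near 1.0
```

(Two starting points that differ only slightly end up at different equilibria; a sufficiently jagged update rule can also produce the chaotic behaviour mentioned above.)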
I don’t worry about the aggressiveness of the transition because, if my current values are inconsistent, they can be made to say that this transition is both good and bad. I share the concern about multiple reflective equilibria. What does it mean to judge something as an irrational cishuman if two reflective equilibria would disagree on what is desirable?
Upvoted purely for the tasty, tasty understatement here.
I should get that put on a button.
I like to think of my “true values” as (initially) unknown, and my moral intuitions as evidence of, and approximations to, those true values. I can then work on improving the error margins, confidence intervals, and so forth.
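(A minimal sketch of that picture, under assumptions invented for illustration: treat each moral intuition about a case as a noisy measurement of an unknown true value, and summarise the evidence with a point estimate and a rough confidence interval. The numbers and the noise model are placeholders, not a claim about how intuitions actually behave.)

```python
from statistics import mean, stdev

# Illustrative only: intuitions as noisy readings of a latent "true value".
def estimate_true_value(intuitions: list[float]) -> tuple[float, float, float]:
    """Return (estimate, lower, upper) of a rough 95% confidence interval."""
    m = mean(intuitions)
    half_width = 1.96 * stdev(intuitions) / len(intuitions) ** 0.5
    return m, m - half_width, m + half_width

# Hypothetical elicited judgements of how good some outcome is, on a -1..1 scale.
judgements = [0.6, 0.4, 0.7, 0.5, 0.3, 0.6]
print(estimate_true_value(judgements))
```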
So do I, but I worry that they are not uniquely defined by the evidence. I may eventually be moved to unique values by irrational arguments, but if those values are different from my current true values, then I will have lost something; and if I don’t have any true values, then my search for values will have been pointless, though my future self will be okay with that.