Everyone may have identical preferences, and the preferences may not be what we would call “altruism”, but they also have behavior. Do the paperclip maximizers assist their sibs or not?
To conclude “In general, prefer caring about sibs to not-caring”, we need to be working inside a scenario where assisting is a more-winning strategy. I believe the best evidence that we’re currently in such a scenario is the loose similarity of our present scenario to the EEA.
Paperclip maximizers will help anyone (including a sibling) who will, if and only if so assisted, go on, in their future endeavors at paperclip maximization, to more than recoup the paperclip-generation value of the resources expended by the assistance.
ETA: Or anyone whose chances of more than recouping said resource expenditure are good enough, or the foreseeable recoupage great enough, that the expected result of assistance is more paperclipful than the expected result of not helping.
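To make the rule concrete, here is a minimal sketch in Python; the cost, probability, and payoff numbers are mine, invented for illustration rather than implied by the scenario. The rule: assist iff the expected paperclip total given assistance beats the expected total given non-assistance.

```python
# Minimal sketch (invented numbers): assist iff the expected paperclip total
# given assistance beats the expected total given non-assistance.

def expected_clips_if_assist(cost, p_recoup, clips_if_recouped, clips_otherwise):
    # `cost` is the paperclip-generation value of the resources the assistance expends.
    return -cost + p_recoup * clips_if_recouped + (1 - p_recoup) * clips_otherwise

def expected_clips_if_not(clips_baseline):
    return clips_baseline

# Hypothetical figures: assistance costs 10 clips' worth of resources and gives the
# sib a 40% chance of producing 100 clips; otherwise (helped or not) it produces 5.
assist = expected_clips_if_assist(cost=10, p_recoup=0.4, clips_if_recouped=100, clips_otherwise=5)
no_assist = expected_clips_if_not(clips_baseline=5)
print(assist, no_assist, assist > no_assist)  # 33.0 5 True -> a clipper assists here
```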
Exactly. There is a difference between assistance and nonassistance, and the only way one can recommend assistance is if the SCENARIO is such that assistance leads to better results, whatever “better” means to you. For paperclip maximizers, that’s paperclips.
If assistance were unavailable, of zero use, or of actively negative use, then one would not endorse it over nonassistance. I’ve been trying to convince people that the injunction to prefer assisting one’s sibs over not assisting is scenario-dependent.
Perhaps it is important that if paperclip maximizer A is considering whether to help paperclip maximizer B, B will only want A to take paperclip-maximizing actions. Cognitively sophisticated paperclip maximizers want everybody to want to maximize paperclips over all else. There is no obvious way in which any action could be considered helpful-to-B unless that action also maximizes paperclips, except along axes that matter to B only inasmuch as they maximize paperclips. A real paperclip maximizer will, with no internal conflict whatever, sacrifice its own existence if that act will maximize paperclips. The two paperclip maximizers have identical goals that are completely external to their own experiences (although they will react to their experiences of paperclips, what they want are real paperclips, not paperclip experiences). Most real agents aren’t quite like that.
Perhaps an intuition pump is appropriate at this point, explicating what I mean by the verb “assist”.
Alfonse, the paperclip maximizer, decides that the best way to maximize paperclips is to conquer the world. In pursuit of the subgoal of conquering the world, Alfonse transforms itself into an army. After a fierce battle with some non-paperclip-ists, an instance of Alfonse considers whether to bind a different, badly injured Alfonse’s wound.
Binding the wound will cost some time and calories, resources which might be used in other ways. However, if the wound is bound, the other Alfonse may have an increased chance of fighting in another battle.
By “assistance” I mean “spending an instance-local resource in order that another instance obtains (or doesn’t lose) some resources local to the other instance”.
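In those terms, the wound-binding decision is just another instance of the expected-value comparison above. Here is a sketch with entirely invented figures for the cost of binding and the paperclip value of a recovered fighter:

```python
# Same comparison for the wound-binding case; every number here is invented.
BIND_COST = 2.0           # clips' worth of the helper Alfonse's time and calories
P_FIGHT_IF_BOUND = 0.6    # chance the injured Alfonse fights in another battle if bound
P_FIGHT_IF_NOT = 0.2      # chance it fights again anyway
CLIPS_PER_FIGHTER = 50.0  # expected paperclip value of having it in the next battle

expected_gain = (P_FIGHT_IF_BOUND - P_FIGHT_IF_NOT) * CLIPS_PER_FIGHTER  # 20.0 expected clips
print(expected_gain > BIND_COST)  # True under these numbers; a stingier scenario flips it
```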
The two instances should agree on the solution, whatever it is.
But EY’s statement is about terminal values, not injunctions.
Say that at time t=0, you don’t care about any other entities that exist at t=0, including close copy-siblings; and that you do care about all your copy-descendants; and that your implementation is such that if you’re copied at t=1, by default, at t=2 each of the copies will only care about itself. However, since you care about both of those copies while each, by default, cares only about itself, their utility functions differ from yours. As a general principle, your goals will be better fulfilled if other agents have them, so you want to modify yourself so that your copy-descendants will care about their copy-siblings.
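Here is a toy model I’m constructing to make that argument concrete (the payoffs are invented, not anything from your setup): the t=0 agent’s utility is the sum of its two copy-descendants’ resources, and assisting costs the helper 1 unit while giving the sibling 3.

```python
# Toy model (invented payoffs) of the self-modification argument: the t=0 agent
# values the sum of its copy-descendants' resources; each default copy values
# only its own. Assisting costs the helper 1 unit and gives the sib 3.

def copy_resources(a_assists: bool, b_assists: bool):
    a, b = 10.0, 10.0                 # each copy's starting resources
    if a_assists:
        a -= 1.0
        b += 3.0
    if b_assists:
        b -= 1.0
        a += 3.0
    return a, b

def parent_utility(a: float, b: float) -> float:
    return a + b                      # the t=0 agent cares about both copies

selfish = parent_utility(*copy_resources(False, False))    # 20.0: default copies never pay 1 to give 3
sib_caring = parent_utility(*copy_resources(True, True))   # 24.0: modified copies assist each other
print(sib_caring > selfish)  # True: the t=0 agent prefers to install sib-caring descendants
```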
I disagree with your first claim (the statement is too brief and ambiguous to say definitively what it “is about”), but I don’t want to argue it. Let’s leave that kind of interpretationism to the scholastic philosophers, who spend vast amounts of effort figuring out what various famous ancients “really meant”.
The principle “Your goals will be better fulfilled if other agents have them” is very interesting, and I’ll have to think about it.