pjeby comments on Post Your Utility Function

pjeby 10 Jun 2009 17:52 UTC
2 points

That said, I do seem to have preferences that concern other minds. These don’t seem reducible to experiences of inter-personal behavior… they seem largely rooted in the empathic impulse, the “mirror neurons”. Of course, on its face, this is still just built from subjective experience, right? It’s the experience of sympathetic response when modeling another mind. And there’s no question that this involves substituting my own experiences for theirs as part of the modeling process.

Right. And don’t forget the mind-projection machinery, that causes us to have, e.g. different inbuilt intuitions about things that are passively moved, move by themselves, or have faces that appear to express emotion. These are all inbuilt maps in humans.

But when I reflect on a simple inter-personal preference like “I’d love for my friend to experience this”, I can’t see how it really reduces to pure experience, except as mediated by my concept of invariant reality. I don’t have a full anticipation of their reaction, and it doesn’t seem to be my experience of modeling their interaction that I’m after either.

Most of us learn by experience that sharing positive experiences with others results in positive attention. That’s all that would be needed, but it’s also likely that humans have an evolved appetite to communicate and share positive experiences with their allies.

Another simple example: Do you think a preference for honest communication is at all plausible? Doesn’t it involve something beyond “I hope the environment doesn’t trick me”?

It just means you prefer one class of experiences to another, that you have come to associate with other experiences or actions coming before them, or co-incident with them.

The reason, btw, that I asked why it made a difference whether this is an absolute concept or a “mostly” concept, is that AFAICT, the idea that “some preferences are really about the territory” leads directly to “therefore, all of MY preferences are really about the territory”.

In contrast, thinking of all preferences as essentially delusional is a much better approach, especially if 99.999999999% of all human preferences are entirely about the map, even if we presume that there may be some enlightened Zen masters or Beisutsukai out there who’ve managed, against all odds, to win the epistemic lottery and have an actual “about the territory” preference.

Even if the probability of having such a preference were much higher, viewing it as still delusional with respect to “invariant reality” (as you call it) does not introduce any error. So the consequences of erring on the side of delusion are negligible, and there is a significant upside to being more able to notice when you’re looping, subgoal stomping, or just plain deluded.

That’s why it’s of little interest to me how many 9s there are on the end of that %, or whether in fact it’s 100%: the difference is inconsequential for any practical purpose involving human beings. (Of course, if you’re doing FAI, you probably want to do some deeper thinking than this, since you want the AI to be just as deluded as humans are, in one sense, but not as deluded in another.)
- orthonormal 10 Jun 2009 18:21 UTC
  1 point
  Parent
  The reason, btw, that I asked why it made a difference whether this is an absolute concept or a “mostly” concept, is that AFAICT, the idea that “some preferences are really about the territory” leads directly to “therefore, all of MY preferences are really about the territory”.
  
  For the love of Bayes, NO. The people here are generally perfectly comfortable with the realization that much of their altruism, etc. is sincere signaling rather than actual altruism. (Same for me, before you ask.) So it’s not necessary to tell ourselves the falsehood that all of our preferences are only masked desires for certain states of mind.
  
  As for your claim that the ratio of signaling to genuine preference is 1 minus epsilon, that’s a pretty strong claim, and it flies in the face of experience and certain well-supported causal models. For example, kin altruism is a widespread and powerful evolutionary adaptation; organisms with far less social signaling than humans are just hardwired to sacrifice at certain proportions for near relatives, because the genes that cause this flourish thereby. It is of course very useful for humans to signal even higher levels of care and devotion to our kin; but given two alleles such that
  - (X) makes a human want directly to help its kin to the right extent, plus a desire to signal to others and itself that it is a kin-helper, versus
  - (X’) makes a human only want to signal to others and itself that it is a kin-helper,
  the first allele beats the second easily, because the second will cause searches for the cheapest ways to signal kin-helping, which ends up helping less than the optimal level for promoting those genes.
  
  Thus we have a good deal of support for the hypothesis that our perceived preferences in some areas are a mix of signaling and genuine preferences, and not nearly 100% one or the other. Generally, those who make strong claims against such hypotheses should be expected to produce experimental evidence. Do you have any?
  - pjeby 11 Jun 2009 1:52 UTC
    2 points
    Parent
    
    The people here are generally perfectly comfortable with the realization that much of their altruism, etc. is sincere signaling rather than actual altruism.
    
    That’s nice, but not relevant, since I haven’t been talking about signaling.
    
    Given that, I’m not going to go through the rest of your comment point by point, as it’s all about signaling and kin selection stuff that doesn’t in any way contest the idea that “preference is about experiences, not the reality being experienced”.
    
    I don’t disagree with what you said, it’s just not in conflict with the main idea here. When I said that this is like Hanson’s “politics are not about policy”, I didn’t mean that it was therefore about signaling! (I said it was “not about” in the same way, not that it was about in the same way—i.e., that the mechanism of delusion was similar.)
    
    The way human preferences work certainly supports signaling functions, and may be systematically biased by signaling drives, but that’s not the same thing as saying that preferences equal signaling, or that preferences are “about” signaling.
    - orthonormal 12 Jun 2009 21:05 UTC
      1 point
      Parent
      Well, this discussion might not be useful to either of us at this point, but I’ll give it one last go. My reason for bringing in talk of signaling is that throughout this conversation, it seems like one of the claims you have been making is that
      
      The algorithm (more accurately, the collection of algorithms) that constitutes me makes its decisions based on a weighting of my current and extrapolated states of mind. To the extent that I perceive preferences about things that are distinct from my mental states (and especially when confronting thought-experiments in which my mental states will knowably diverge from the mental states I would ordinarily form given certain features of the world), I am deceiving myself.
      
      Now, I brought up signaling because I and many others already accept a form of (A), in which we’ve evolved to deceive others and ourselves about our real priorities because such signalers appear to others to be better potential friends, lovers, etc. It looks perfectly meaningful to me to declare such preferences “illusory”, since in point of fact we find rationalizations for choosing not what we signaled we prefer, but rather the least costly available signs of these ‘preferences’.
      
      However, kin altruism appears to be a clear case where not all action is signaling, where making decisions that are optimized to actually benefit my relatives confers an advantage in total fitness to my genes.
      
      While my awareness and my decisions exist on separate tracks, my decisions seem to come out as they would for a certain preference relation, one of whose attributes is a concern for my relatives’ welfare. Less concern, of course, than I consciously think I have for them; but roughly the right amount of concern for Hamilton’s Rule of kin selection.
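For readers unfamiliar with Hamilton's Rule as invoked above, here is a minimal sketch. The rule says a helping behavior is favored by kin selection when r × B > C, where r is genetic relatedness, B the fitness benefit to the recipient, and C the fitness cost to the helper. The function name and the example numbers are illustrative assumptions, not anything stated in the thread.

```python
def hamilton_favors_helping(relatedness: float, benefit: float, cost: float) -> bool:
    """Return True when r * B > C, i.e. when Hamilton's Rule favors helping.

    relatedness: coefficient of relatedness r (e.g. 0.5 for full siblings)
    benefit:     fitness benefit B to the relative being helped
    cost:        fitness cost C to the helper
    (All numbers passed in below are made up for illustration.)
    """
    return relatedness * benefit > cost


# Full sibling (r = 0.5): 0.5 * 3.0 = 1.5, which exceeds the cost of 1.0.
print(hamilton_favors_helping(0.5, 3.0, 1.0))

# First cousin (r = 0.125): 0.125 * 3.0 = 0.375, which does not exceed 1.0.
print(hamilton_favors_helping(0.125, 3.0, 1.0))
```

The point of the comparison in the comment is that observed decisions track this threshold more closely than they track the stronger concern a person consciously reports.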
      
      My understanding, then, is that I have both conscious and real preferences; the former are what I directly feel, but the latter determine parts of my action and are partially revealed by analysis of how I act. (One component of my real preferences is social, and even includes the preference to keep signaling my conscious preferences to myself and others when it doesn’t cost me too much; this at least gives my conscious preferences some role in my actions.) If my actions predictably come out in accordance with the choices of an actual preference relation, then the term “preference” has to be applied there if it’s applied anywhere.
      
      There’s still the key functional sense in which my anticipation of future world-states (and not just my anticipation of future mind-states) enters into my real preferences; I feel an emotional response now about the possibility of my sister dying and me never knowing, because that is the form that evaluation of that imagined world takes. Furthermore, the reason I feel that emotional response in that situation is because it confers an advantage to have one’s real preferences more finely tuned to “model of the future world” than “model of the future mind”, because that leads to decisions that actually help when I need to help.
      
      This is what I mean by having my real preferences sometimes care about the state of the future world (as modeled by my present mind) rather than just my future experience (ditto). Do you disagree on a functional level; and if so, in what situation do you predict a person would feel or act differently than I’d predict? If our disagreement is just about what sort of language is helpful or misleading when talking about the mind, then I’d be relieved.
      - pjeby 12 Jun 2009 21:19 UTC
        1 point
        Parent
        The confusion that you have here is that kin altruism is only “about” your relatives from the outside of you. Within the map that you have, you have no such thing as “kin altruism”, any more than a thermostat’s map contains “temperature regulation”. You have features that execute to produce kin altruism, as a thermostat’s features produce temperature regulation. However, just as a thermostat simply tries to make its sensor match its setting, so too do your preferences simply try to keep your “sensors” within a desired range.
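The thermostat analogy can be made concrete with a small sketch. The class name, the tolerance band, and the temperatures below are hypothetical choices made only for illustration. The controller's sole input is its sensor reading; the room's actual temperature never appears in its logic.

```python
from dataclasses import dataclass


@dataclass
class Thermostat:
    """A toy controller that only compares its sensor reading to its setting."""

    setpoint: float          # desired reading, in degrees
    tolerance: float = 0.5   # hypothetical dead band around the setpoint

    def act(self, sensor_reading: float) -> str:
        # The decision depends on the sensor reading alone; nothing here
        # refers to "the room" or to "temperature regulation" as a goal.
        if sensor_reading < self.setpoint - self.tolerance:
            return "heat"
        if sensor_reading > self.setpoint + self.tolerance:
            return "cool"
        return "idle"


thermostat = Thermostat(setpoint=20.0)

# A correct sensor in a cold room and a faulty sensor in a warm room
# both report 18.0, so the controller does the same thing in each case.
print(thermostat.act(18.0))
print(thermostat.act(20.2))
```

Seen from outside, the device regulates temperature; seen from inside, it only keeps one number within a range, which is the distinction the comment draws.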
        
        This is true regardless of the evolutionary, signaling, functional, or other assumed “purposes” of your preferences, because the reality in which those other concepts exist, is not contained within the system those preferences operate in. It is a self-applied mind projection fallacy to think otherwise, for reasons that have been done utterly to death in my interactions with Vladimir Nesov in this thread. If you follow that logic, you’ll see how preferences, aboutness, and “natural categories” can be completely reduced to illusions of the mind projection fallacy upon close examination.
        orthonormal 12 Jun 2009 21:46 UTC
        1 point
        Parent
        Well, if this is just a disagreement over whether our typical uses of the word “about” are justified, then I’m satisfied with letting go of this thread; is that the case, or do you think there is a disagreement on our expectations for specific human thoughts and actions?
        
        I suggest, by the way, that your novel backwards application of the Mind Projection Fallacy needs its own name so as not to get it confused with the usual one. (Eliezer’s MPF denotes the problem with exporting our mental/intentional concepts outside the sphere of human beings; you seem to be asserting that we imported the notion of preferences from the external world in the first place.)
        pjeby 13 Jun 2009 0:53 UTC
        2 points
        Parent
        
        you seem to be asserting that we imported the notion of preferences from the external world in the first place
        
        No. I’m saying that the common ideas of “preference” and “about” are mind projection fallacies, in the original sense of the phrase (which Eliezer did not coin, btw, but which he does use correctly). Preference-ness and about-ness are qualities (like “sexiness”) that are attributed as intrinsic properties of the world, but to be properly specified must include the one doing the attribution.
        
        IOW, for your preferences to be “about” the world, there must be someone who is making this attribution of aboutness, as the aboutness itself does not exist in the territory, any more than “sexiness” exists in the territory.
        
        However, you cannot make this attribution, because the thing you think of as “the territory” is really only your model of the territory.
        
        Well, if this is just a disagreement over whether our typical uses of the word “about” are justified, then I’m satisfied with letting go of this thread; is that the case, or do you think there is a disagreement on our expectations for specific human thoughts and actions?
        
        This can be viewed as purely a Russellian argument about language levels, but the practical point I originally intended to make was that humans cannot actually make preferences about the actual territory because the only thing we can evaluate are our own experiences—which can be suspect. Inbuilt drives and biases are one source of experiences being suspect, but our own labeling of experiences is also suspect—labels are not only subject to random linkage, but are prone to spreading to related topics in time, space, or subject matter.
        
        It is thus grossly delusional as a practical matter to assume that your preferences have anything to do with actual reality, as opposed to your emotionally-colored, recall-biased associations with imagined subsets of half-remembered experiences of events that occurred under entirely different conditions. (Plus, many preferences subtly lead to the recreation of circumstances that thwart the preference’s fulfillment—which calls into question precisely what “reality” that preference is about.)
        
        Perhaps we could call our default thinking about such matters (i.e. preferences being about reality) “naive preferential realism”, by analogy to “naive moral realism”, as it is essentially the same error, applied to one’s own preferences rather than some absolute definition of good or evil.
        orthonormal 13 Jun 2009 18:19 UTC
        2 points
        Parent
        This is pretty much what I meant by a semantic argument. If, as I’ve argued, my real preferences (as defined above) care about the projected future world (part of my map) and not just the projected future map (a sub-part of that map), then I see no difficulty with describing this by “I have preferences about the future territory”, as long as I remain aware that all the evaluation is happening within my map.
        
        It is perhaps analogous to moral language in that when I talk about right and wrong, I keep in mind that these are patterns within my brain (analogous to those in other human brains) extrapolated from emotive desires, rather than objectively perceived entities. But with that understanding, right and wrong are still worth thinking about and discussing with others (although I need to be quite careful with my use of the terms when talking with a naive moral realist), since these are patterns that actually move me to act in certain ways, and to introspect in certain ways on my action and on the coherence of the patterns themselves.
        
        In short, any theory of language levels or self-reference that ties you in Hofstadterian knots when discussing real, predictable human behavior (like the decision process for kin altruism) is problematic.
        
        That said, I’m done with this thread. Thanks for an entertainingly slippery discussion!
        
        ETA: To put it another way, learning about the Mind Projection Fallacy doesn’t mean you can never use the word “sexy” again; it just means that you should be aware of its context in the human mind, which will stop you from using it in certain novel but silly situations.