Don’t take it lightly, it’s a well-vetted and well-understood position, extensively discussed and agreed upon. You should take such claims as strong evidence that you may have missed something crucial, that you need to go back and reread the standard texts.
Don’t take it lightly, it’s a well-vetted and well-understood position, extensively discussed and agreed upon.
It’s extensively discussed and agreed upon that that is how we (for certain definitions of “we”) would like it to be, and it certainly has desirable properties for, say, building Friendly AI, or any AI that doesn’t wirehead. And it is certainly a property of the human brain that it orients its preferences towards what it believes is the outside world—again, this has good consequences for preventing wireheading.
But that doesn’t make it actually true, just useful.
It’s also pretty well established as a tenet of, e.g., General Semantics that the “outside world” is unknowable, since all we can ever consciously perceive is our map. The whole point of discussing biases is that our maps are systematically biased—and this includes our preferences, which are being applied to our biased views of the world, rather than the actual world.
I am being descriptive here, not prescriptive. When we say we prefer a certain set of things to actually be true, we can only mean that we want the world to not dispute a certain map, because otherwise we are making the supernaturalist error of assuming that a thing could be true independent of the components that make it so.
To put it another way, if I say, “I prefer that the wings of this plane not fall off”, I am speaking about the map, since “wings” do not exist in the territory.
IOW, our statements about reality are about the intersection of some portion of “observable” reality and our particular mapping (division and labeling) of it. And it cannot be otherwise, since to even talk about it, we have to carve up and label the “reality” we are discussing.
IOW, our statements about reality are about the intersection of some portion of “observable” reality and our particular mapping (division and labeling) of it. And it cannot be otherwise, since to even talk about it, we have to carve up and label the “reality” we are discussing.
It’s funny that you talk of wordplay a few comments back, as it seems that you’re the one making a technically-correct-but-not-practically-meaningful argument here.
If I may attempt to explore your position: Suppose someone claims a preference for “blue skies”. The wirehead version of this that you endorse is “I prefer experiences that include the perception I label ‘blue sky’”. The “anti-wirehead” version you seem to be arguing against is “I prefer actual states of the world where the sky is actually blue”.
You seem to be saying that since the preference is really about the experience of blue skies, it makes no sense to talk about the sky actually being blue. Chasing after external definitions involving photons and atmospheric scattering is beside the point, because the actual preference wasn’t formed in terms of them.
This becomes another example of the general rule that it’s impossible to form preferences directly about reality, because “reality” is just another label on our subjective map.
As far as specifics go, I think the point you make is sound: Most (all?) of our preferences can’t just be about the territory, because they’re phrased in terms of things that themselves don’t exist in the territory, but at best simply point at the slice of experience labeled “the territory”.
That said, I think this perspective grossly downplays the practical importance of that label. It has very distinct subjective features connecting in special ways to other important concepts. For the non-solipsists among us, perhaps the most important role it plays is establishing a connection between our subjective reality and someone else’s. We have reason to believe that it mediates experiences we label as “physical interactions” in a manner causally unaffected by our state of mind alone.
When I say “I prefer the galaxy not to be tiled by paperclips”, I understand that, technically, the only building blocks I have for that preference are labeled experiences and concepts that aren’t themselves the “stuff” of their referents. In fact, I freely admit that I’m not exactly sure what constitutes “the galaxy”, but the preference I just expressed actually contains a massive number of implicit references to other concepts that I consider causally connected to it via my “external reality” label. What’s more, most people I communicate with can easily access a seemingly similar set of connections to their “external reality” label, assuming they don’t talk themselves out of it.
The territory concept plays a similar role to that of an opaque reference in a programming language. Its state may not be invariant, but its identity is. I don’t have to know any true facts concerning its actual structure for it to be meaningful and useful. Just as photons aren’t explicitly required to subjectively perceive a blue sky, the ontological status of my territory concept doesn’t really change its meaning or importance, which is acquired through its intimate connection to massive amounts of raw experience.
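To make the analogy concrete, here is a minimal sketch in Python (the class and field names are mine, purely illustrative): an opaque handle can be passed around and relied on to remain the same referent even while its internal state changes, and the client never inspects that structure.

```python
class Territory:
    """An opaque handle: clients hold the reference, never the internals."""
    def __init__(self):
        self._hidden_state = {"weather": "clear"}   # structure the client never sees

    def update(self, **changes):
        self._hidden_state.update(changes)          # state varies over time


territory = Territory()
identity_before = id(territory)      # the reference's identity

territory.update(weather="storm")    # the "state" has changed...
identity_after = id(territory)

assert identity_before == identity_after   # ...but the identity has not

# A "preference" can be phrased against the stable handle without knowing
# any true facts about what is stored behind it.
```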
Claiming my preferences about the territory are really just about my map is true in the narrow technical sense that it’s impossible for me to refer directly to “reality”, but doing so completely glosses over the deep, implicit connections expressed by such preferences, most primarily the connection between myself and the things I label “other consciousnesses”. In contrast, the perception of these connections seems to come for free by “confusing” the invariant identity of my territory concept with the invariant “existence” of a real external world. The two notions are basically isomorphic, so where’s the value in the distinction?
Think of what difference there is between “referring directly” to the outside reality and “referring directly” to the brain. Not much, methinks. There is no homunculus whose hands are long enough to reach the brain, but not long enough to touch your nose.
You seem to be saying that since the preference is really about the experience of blue skies, it makes no sense to talk about the sky actually being blue.
That depends on whether you’re talking about “blue” in terms of human experience, or whether you’re talking about wavelengths of light. The former is clearly “map”, whereas discussing wavelengths of light at least might be considered “about” the territory in some sense.
However, if you are talking about preferences, I don’t think there’s any way for a preference to escape the map. We can define how the mapped preference relates to some mapped portion of the territory, but preferences (on human hardware at least) can only be “about” experience.
And that’s because all of our hypothetical preferences for states of the actual world are being modeled as experiences in order to compute a preference.
The entire context of this discussion (for me, anyway) has been about preferences. That doesn’t preclude the possibility of relating maps to territory, or maps to other people’s maps. I’m just saying that human beings have no way to model their preferences except by modeling experiences, in human-sensory terms.
The “anti-wirehead” version you seem to be arguing against is “I prefer actual states of the world where the sky is actually blue”.
But when this statement is uttered by a human, it is almost invariably a lie. Because inside, the person’s model is an experience of “blue sky”. The preference is about experiences, not the state of the world.
Even if you phrase this as, “I prefer the sky to be actually blue, even if I don’t know it”, it is still a lie, because now you are modeling an experience of the sky being blue, plus an experience of you not knowing it.
The two notions are basically isomorphic, so where’s the value in the distinction?
Well, it makes clear some of the limits of certain endeavors that are often discussed here. It dissolves confusions about the best ways to make people happy, and whether a world should be considered “real” or “virtual”, and whether it’s somehow “bad” to be virtual.
But the most important practical benefit is that it helps in understanding why sometimes the best thing for a person may be to update their preferences to match the constraints of reality, rather than trying in vain to make reality fit their preferences.
Consider the “blue sky” preference. The experience of a blue sky is likely associated with, and significantly colored by things like the person’s favorite color, the warmth of the sun or the cool breeze of that perfect day one summer when they were in love. For another person, it might be associated with the blinding heat of the desert and a sensation of thirst… and these two people can then end up arguing endlessly about whether a blue sky is obviously good or bad.
And both are utterly deluded to think that their preferences have anything to do with reality.
I am not saying that they disagree about how many angstroms the light of a blue sky is, or that the sky doesn’t really exist or anything like that. I’m saying their preference is (and can only be) about their maps, because the mere fact of a blue sky has no inherent “preferability”, without reference to some other purpose.
Even if we try to say in the abstract that it is good because it’s correlated with things about our planet that make life possible, we can only have that preference because we’re modeling an experience of “life” that we label “good”. (And perhaps a depressed teenager would disagree, using a differently-labeled experience of “life”!)
This does not mean preferences are invalid or meaningless. It doesn’t mean that we should only change our preferences and ignore reality. However, to the extent that our preferences produce negative experiences, it is saner to remove the negative portion of the preference.
Luckily, human beings are not limited to, or required to have, bidirectional preferences. Feeling pain at the absence of something is not required in order to experience pleasure at its presence, in other words. (Or vice versa.)
Awareness of this fact, combined with an awareness that it is really the experience we prefer (and mainly, the somatic markers we have attached to the experience) makes it plain that the logical thing to do is to remove the negative label, and leave any positive labels in place.
However, if we think that our internal labeling of experience has something to do with “reality”, then we are likely to engage in the confusion of thinking that removing a negative label of badness will somehow create or prolong badness in the territory.
And for that matter, we may be under the mistaken impression that changing the reality out there will change our experience of it… and that is often not the case. As the saying goes, if you want to make a human suffer, you can either not give them what they want, or else you can give it to them! Humans tend to create subgoals that get promoted to “what I want” status, without reference to what the original desired experience was.
For example, looking for blue skies...
When what you really want is to fall in love again.
Wow, −5! People here don’t seem to appreciate this sort of challenge to their conceptual framework.
I’m just saying that human beings have no way to model their preferences except by modeling experiences, in human-sensory terms.
I agree, but I wonder if I failed to communicate the distinction I was attempting to make. The human-sensory experience of being embedded in a concrete, indifferent reality is (drugs, fantasies, and dreams aside) basically constant. It’s a fundamental thread underlying our entire history of experience.
It’s this indifference to our mental state that makes it special. A preference expressed in terms of “reality” has subjective properties that it would otherwise lack. Maybe I want the sky to be blue so that other people will possess a similar experience of it that we can share. “Blueness” may still be a red herring, but my preference now demands some kind of invariant between minds that seemingly cannot be mediated except through a shared external reality. You might argue that I really just prefer shared experiences, but this ignores the implied consistency between such experiences and all other experiences involving the external reality, something I claim to value above and beyond any particular experience.
Even if you phrase this as, “I prefer the sky to be actually blue, even if I don’t know it”, it is still a lie, because now you are modeling an experience of the sky being blue, plus an experience of you not knowing it.
This is where the massive implicit context enters the scene. “Even if I don’t know it” is modeled after experience only in the degenerate sense that it’s modeled after experience of indifferent causality. A translation might look like “I prefer to experience a reality with the sorts of consequences I would predict from the sky being blue, even if I don’t consciously perceive blue skies”. That’s still an oversimplification, but it’s definitely more complex than just invoking a generic memory of “not having known something” and applying it to blue skies.
The two notions are basically isomorphic, so where’s the value in the distinction?
Well, it makes clear some of the limits of certain endeavors that are often discussed here. It dissolves confusions about the best ways to make people happy, and whether a world should be considered “real” or “virtual”, and whether it’s somehow “bad” to be virtual.
I don’t see how any of that is true. I can easily think of different concrete realizations of “real” and “virtual” that would interact differently with my experience of reality, thus provoking different labellings of “good” and “bad”. If your point is merely that “real” is technically underspecified, then I agree. But I don’t see how you can draw inferences from this underspecification.
For another person, it might be associated with the blinding heat of the desert and a sensation of thirst… and these two people can then end up arguing endlessly about whether a blue sky is obviously good or bad.
And both are utterly deluded to think that their preferences have anything to do with reality.
I’m going to have to turn your own argument against you here. To the extent that you have a concept of reality that is remotely consistent with your everyday experience, I claim that “in reality, blue skies are bad because they provoke suffering” is a preference stated in terms of an extremely similar reality-concept, plus a suffering-concept blended together from first-hand experience and compassion (itself also formed in terms of reality-as-connected-to-other-minds). For you to say it has “nothing to do with reality” is pure semantic hogwash. What definition of “reality” can you possibly be using to make this statement, except the one formed by your lifetime’s-worth of experience with indifferent causality? You seem to be denying the use of the term to relate your concept of reality to mine, despite their apparent similarity.
However, to the extent that our preferences produce negative experiences, it is saner to remove the negative portion of the preference.
This doesn’t make sense to me. Whether or not an experience is “negative” is a function of our preferences. If a preference “produces” negative experiences, then either they’re still better than the alternative (in which case it’s a reasonable preference, and it’s probably worthwhile to change your perception of the experience) or they’re not (in which case it’s not a true preference, just delusion).
Luckily, human beings are not limited to, or required to have, bidirectional preferences. Feeling pain at the absence of something is not required in order to experience pleasure at its presence, in other words. (Or vice versa.)
That’s a property of pain and pleasure, not preference. I may well decide not to feel pain due to preference X being thwarted, but I still prefer X, and I still prefer pleasure to the absence of pleasure.
Awareness of this fact, combined with an awareness that it is really the experience we prefer (and mainly, the somatic markers we have attached to the experience) makes it plain that the logical thing to do is to remove the negative label, and leave any positive labels in place.
This is where I think your oversimplification of “experience vs reality” produces invalid conclusions. Those labels don’t just apply to one experience or another, they apply to a massively complicated network of experience that I can’t even begin to hold in my mind at once. Given that, your logic doesn’t follow at all, because I really don’t know what I’m relabeling.
This relates to a general reservation I have with cavalier attitudes toward mind-hacks: I know full well that my preferences are complex, difficult to understand, and grossly underspecified in any conscious realization, so it’s not at all obvious to me that optimizing a simple preference concerning one particular scenario doesn’t carry loads of unintended consequences for the rest of them. I’ve had direct experience with my subconsciously directed behavior “making decisions for me” that I had conscious reasons to optimize against, only later to find out that my conscious understanding of the situation was flawed and incomplete. I think that ignoring the intuitive implications of an external reality leads to similar contradictions.
You seem to mostly be arguing against a strawman; as I said, I’m not saying reality doesn’t exist or that it’s not relevant to our experiences. What I’m saying is that the preferences are composed of map, and while there are connections between that map and external reality, we are essentially deluded to think our preferences refer to actual reality, and this delusion leads us to believe that changing external reality will change our internal experience, when more often the reverse is true. (That is, changing our internal experience is more likely to result in our taking actions that will actually change external reality.)
Note, however, that:
Whether or not an experience is “negative” is a function of our preferences.
Here you seem to be arguing my point. The experience is a function of preferences; the preferences are a product of, and point to, other experiences, in a self-sustaining loop that sometimes might as well not be connected to outside reality at all, for all that it has anything to do with what’s actually going on.
Lucky people live in a perpetual perception of good things happening, unlucky people the opposite, even when the same events are happening to both.
How can we say, then, that either person’s perceptions are “about” reality, if they are essentially unconditional? Clearly, something else is going on.
If we disagree at this point, I’d have to say it can only be because we disagree on what “about” means. When I say preferences are not “about” reality, it is in the same sense that Robin Hanson is always saying that politics is not “about” policy, etc.
Clearly, preferences are “about” reality in the same sense that politics are “about” policy. That is, reality is the subject of a preference, in the same way that a policy might be the subject of a political dispute.
However, in both cases, the point of the ostensible activity is not where it appears to be. In order for politics to function, people must sincerely believe that it is “about” policy, in precisely the same way as we must sincerely believe our preferences are “about” reality, in order to make them function—and for similar reasons.
But in neither case does either the sincerity or the necessity of the delusion change the fact that it’s nonetheless a delusion.
If we disagree at this point, I’d have to say it can only be because we disagree on what “about” means.
I don’t think I disagree with any of the above, except to dispute its full universality (which I’m not sure you’re even arguing). To attempt to rephrase your point: Our interactions with reality create experiences filtered through our particular way of characterizing such interactions. It’s these necessarily subjective characterizations (among other things), rather than the substance of the interaction itself, which generate our preferences. When reflecting on our preferences, we’re likely to look right past the interpretive layer we’ve introduced and attribute them to the external stimulus that produced the response, rather than the response itself.
Robin’s “X is not about Y” has the flavor of general, but not universal, rules. Would you extend your analogy to include this property?
Robin’s “X is not about Y” has the flavor of general, but not universal, rules. Would you extend your analogy to include this property?
Here’s an interesting question for you: why is it important that you consider this non-universal? What value does it provide you for me to concede an exception, or what difference will it make in your thinking if I say “yes” or “no”? I am most curious.
(Meanwhile, I agree with your summation as an accurate, if incomplete restatement of the bulk of my point.)
Because I’m trying to make sense of your position, but I don’t think I can with such a strict conclusion. I don’t see any fundamental reason why someone couldn’t form preferences more or less directly mediated by reality, it just seems that in practice, we don’t.
If you’re asking why I’m bringing up universality, it seemed clear that your claims about preferences were universal in scope until you brought up “X is not about Y”. “Must logically be” and “tends to be in practice” are pretty different types of statement.
I mean, you said some things that sound like answers, but they’re not answers to the questions I asked. Here they are again:
Why is it important that you consider this non-universal?
and
What value does it provide you for me to concede an exception, or what difference will it make in your thinking if I say “yes” or “no”?
Your almost-answer was that you don’t think you can “make sense” of my position with a strict conclusion. Why is that? What would it mean for there to be a strict conclusion? How, specifically, would that be a problem?
Why is it important that you consider this non-universal?
I didn’t answer this because it’s predicated on an assumption that has no origin in the conversation. I never claimed that it was “important” for me to consider this non-universal. As per being “liberal in what I accept” in the realm of communication, I tried to answer the nearest meaningful question I thought you might actually be asking. I thought the phrase “If you’re asking why I’m bringing up universality” made my confusion sufficiently clear.
If you really do mean to ask me why I think it’s important that I believe in some property of preference formation, then either I’ve said something fairly obvious to that end that I’m not remembering (or finding), or you’re asserting your own inferences as the basis of a question, instead of its substance. I try to give people the benefit of the doubt that I’ve misunderstood them in such cases, rather than just assume they’re speaking manipulatively.
What value does it provide you for me to concede an exception
No particular value in mind. I suppose the greatest value would be in you solidly refuting such exceptions in a way that made sense to me, as that would be a more surprising (therefore more informative) outcome. If you concede the exception, I don’t gain any additional insight, so that’s of fairly neutral value.
what difference will it make in your thinking if I say “yes” or “no”?
Not really sure yet, especially in the “no” case (since in that case you may have reasons I haven’t yet thought of or understood). I suppose in the “yes” case I’d have greater confidence that I knew what you were talking about if I encountered similar concepts in your comments elsewhere. This discussion has had some difference on my thinking: I don’t think I understood the thrust of your point when I originally complained that your distinction lacked relevance.
Your almost-answer was that you don’t think you can “make sense” of my position with a strict conclusion. Why is that? What would it mean for there to be a strict conclusion?
By strict conclusion, I mean “preferences are modeled strictly in terms of the map: it is logically impossible to hold a preference expressed in terms of something other than that which is expressed in the map”. This seems very nearly true, but vulnerable to counterexamples when taken as a general principle or logical result of some other general principle. I’ll elaborate if you’d like, but I thought I’d first clarify whether you meant it that way. If you didn’t, theoretical or speculative counter-examples aren’t particularly relevant.
By strict conclusion, I mean “preferences are modeled strictly in terms of the map: it is logically impossible to hold a preference expressed in terms of something other than that which is expressed in the map”. This seems very nearly true, but vulnerable to counterexamples when taken as a general principle or logical result of some other general principle. I’ll elaborate if you’d like, but I thought I’d first clarify whether you meant it that way. If you didn’t, theoretical or speculative counter-examples aren’t particularly relevant.
I can imagine that, in principle, some other sort of mind than a human’s might be capable of being a counterexample, apart from, say, the trivial example of a thermostat, which shows a “preference” for reality being a certain way. An AI could presumably be built so that its preferences were based on properties of the world, rather than properties of its experience, or deduction from other properties based on experience. However, at some point that would need to be rooted in the goal system provided by its programmers… who presumably based it on their own preferences… ;-) (Nonetheless, if the AI didn’t have anything we’d label “experience”, then I’d have to agree that it has a preference about reality, rather than its experience of reality.)
I could also consider an argument that, say, hunger is about the state of one’s stomach, and that it therefore is “about” the territory, except that I’m not sure hunger qualifies as a preference, rather than an appetite or a drive. A person on a hunger strike or with anorexia still experiences hunger, yet prefers not to eat.
If you think you have other counterexamples, I’d like to hear them. I will be very surprised if they don’t involve some rather tortured reasoning and hypotheticals, though, or non-human minds. The only reason I even hedge my bets regarding humans is that (contrary to popular belief) I’m not under the mistaken impression that I have anything remotely approaching a complete theory of mind for human brains, versus a few crude maps that just happen to cover certain important chunks of “territory”. ;-)
I can imagine that, in principle, some other sort of mind than a human’s might be capable of being a counterexample, apart from, say, the trivial example of a thermostat, which shows a “preference” for reality being a certain way.
I don’t actually consider this a good counterexample. It can be trivially shown that the thermostat’s “preference” is not in terms of the “reality” of temperature: Just sabotage the sensor. The thermostat “prefers” its sensor reading to correspond to its set point. Wouldn’t you agree this is fairly analogous to plenty of human desires?
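To make the sabotage point concrete, here is a minimal sketch (hypothetical names, not any particular device): the only quantity the control loop ever consults is the sensor reading, so a rigged sensor changes the behaviour without the room’s actual temperature entering into the “preference” at all.

```python
# A toy thermostat loop: the whole "preference" is a comparison between the
# sensor reading and the set point; the room's actual temperature never
# appears anywhere in the decision.

SET_POINT = 20.0

def heater_should_run(sensor_reading: float) -> bool:
    return sensor_reading < SET_POINT

actual_temperature = 10.0            # the room really is cold
honest_reading = actual_temperature
sabotaged_reading = 25.0             # sensor rigged to report "warm"

print(heater_should_run(honest_reading))     # True: it heats
print(heater_should_run(sabotaged_reading))  # False: "satisfied", room still cold
```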
I could also consider an argument that, say, hunger is about the state of one’s stomach, and that it therefore is “about” the territory, except that I’m not sure hunger qualifies as a preference, rather than an appetite or a drive.
Agreed. The closest it seems you could come is to prefer satiation of said appetites, which is a subjective state.
If you think you have other counterexamples, I’d like to hear them. I will be very surprised if they don’t involve some rather tortured reasoning and hypotheticals, though, or non-human minds.
Actually, human minds are the primary source of my reservations. I don’t think my reasoning is particularly tortured, but it certainly seems incomplete. Like you, I really have no idea what a mind is.
That said, I do seem to have preferences that concern other minds. These don’t seem reducible to experiences of inter-personal behavior… they seem largely rooted in the empathic impulse, the “mirror neurons”. Of course, on its face, this is still just built from subjective experience, right? It’s the experience of sympathetic response when modeling another mind. And there’s no question that this involves substituting my own experiences for theirs as part of the modeling process.
But when I reflect on a simple inter-personal preference like “I’d love for my friend to experience this”, I can’t see how it really reduces to pure experience, except as mediated by my concept of invariant reality. I don’t have a full anticipation of their reaction, and it doesn’t seem to be my experience of modeling their interaction that I’m after either.
Feel free to come up with a better explanation, but I find it difficult to deconstruct my desire to reproduce internally significant experiences in an external environment in a way that dismisses the role of “hard” reality. I can guess at the pre-reflective biological origin of this sort of preference, just like we can point at the biological origin of domesticated turkeys, but, just as turkeys can’t function without humans, I don’t know how it would function without some reasonable concept of a reality that implements things intrinsically inaccessible and indifferent to my own experience.
I chose to instantiate this particular example, but the general rule seems to be: The very fabric of what “another mind” means to me involves the concept of an objective but shared reality. The very fabric of what “another’s experiences” means to me involves the notion of an external system giving rise to external subjective experiences that bear some relation to my own.
You could claim my reasoning is tortured in that it resembles Russell’s paradox: One could talk about the set of all subjective preferences explicitly involving objective phenomena (i.e., not containing themselves). But it seems to me that I can in a sense relate to a very restricted class of objective preferences, those constructed from the vocabulary of my experience, reflected back into the world, and reinstantiated in the form of another mind.
Another simple example: Do you think a preference for honest communication is at all plausible? Doesn’t it involve something beyond “I hope the environment doesn’t trick me”?
That said, I do seem to have preferences that concern other minds. These don’t seem reducible to experiences of inter-personal behavior… they seem largely rooted in the empathic impulse, the “mirror neurons”. Of course, on its face, this is still just built from subjective experience, right? It’s the experience of sympathetic response when modeling another mind. And there’s no question that this involves substituting my own experiences for theirs as part of the modeling process.
Right. And don’t forget the mind-projection machinery, that causes us to have, e.g. different inbuilt intuitions about things that are passively moved, move by themselves, or have faces that appear to express emotion. These are all inbuilt maps in humans.
But when I reflect on a simple inter-personal preference like “I’d love for my friend to experience this”, I can’t see how it really reduces to pure experience, except as mediated by my concept of invariant reality. I don’t have a full anticipation of their reaction, and it doesn’t seem to be my experience of modeling their interaction that I’m after either.
Most of us learn by experience that sharing positive experiences with others results in positive attention. That’s all that would be needed, but it’s also likely that humans have an evolved appetite to communicate and share positive experiences with their allies.
Another simple example: Do you think a preference for honest communication is at all plausible? Doesn’t it involve something beyond “I hope the environment doesn’t trick me”?
It just means you prefer one class of experiences to another, one that you have come to associate with other experiences or actions coming before it, or coincident with it.
The reason, btw, that I asked why it made a difference whether this is an absolute concept or a “mostly” concept, is that AFAICT, the idea that “some preferences are really about the territory” leads directly to “therefore, all of MY preferences are really about the territory”.
In contrast, thinking of all preferences as essentially delusional is a much better approach, especially if 99.999999999% of all human preferences are entirely about the map, even if we presume that maybe there are some enlightened Zen masters or Beisutsukai out there who’ve successfully managed, against all odds, to win the epistemic lottery and have an actual “about the territory” preference.
Even if the probability of having such a preference were much higher, viewing it as still delusional with respect to “invariant reality” (as you call it) does not introduce any error. So the consequences of erring on the side of delusion are negligible, and there is a significant upside to being more able to notice when you’re looping, subgoal stomping, or just plain deluded.
That’s why it’s of little interest to me how many 9’s there are on the end of that %, or whether in fact it’s 100%; the difference is inconsequential for any practical purpose involving human beings. (Of course, if you’re doing FAI, you probably want to do some deeper thinking than this, since you want the AI to be just as deluded as humans are, in one sense, but not as deluded in another.)
The reason, btw, that I asked why it made a difference whether this is an absolute concept or a “mostly” concept, is that AFAICT, the idea that “some preferences are really about the territory” leads directly to “therefore, all of MY preferences are really about the territory”.
For the love of Bayes, NO. The people here are generally perfectly comfortable with the realization that much of their altruism, etc. is sincere signaling rather than actual altruism. (Same for me, before you ask.) So it’s not necessary to tell ourselves the falsehood that all of our preferences are only masked desires for certain states of mind.
As for your claim that the ratio of signaling to genuine preference is 1 minus epsilon, that’s a pretty strong claim, and it flies in the face of experience and certain well-supported causal models. For example, kin altruism is a widespread and powerful evolutionary adaptation; organisms with far less social signaling than humans are just hardwired to sacrifice at certain proportions for near relatives, because the genes that cause this flourish thereby. It is of course very useful for humans to signal even higher levels of care and devotion to our kin; but given two alleles such that
(X) makes a human want directly to help its kin to the right extent, plus a desire to signal to others and itself that it is a kin-helper, versus
(X’) makes a human only want to signal to others and itself that it is a kin-helper,
the first allele beats the second easily, because the second will cause searches for the cheapest ways to signal kin-helping, which ends up helping less than the optimal level for promoting those genes.
Thus we have a good deal of support for the hypothesis that our perceived preferences in some areas are a mix of signaling and genuine preferences, and not nearly 100% one or the other. Generally, those who make strong claims against such hypotheses should be expected to produce experimental evidence. Do you have any?
The people here are generally perfectly comfortable with the realization that much of their altruism, etc. is sincere signaling rather than actual altruism.
That’s nice, but not relevant, since I haven’t been talking about signaling.
Given that, I’m not going to go through the rest of your comment point by point, as it’s all about signaling and kin selection stuff that doesn’t in any way contest the idea that “preference is about experiences, not the reality being experienced”.
I don’t disagree with what you said, it’s just not in conflict with the main idea here. When I said that this is like Hanson’s “politics are not about policy”, I didn’t mean that it was therefore about signaling! (I said it was “not about” in the same way, not that it was about in the same way—i.e., that the mechanism of delusion was similar.)
The way human preferences work certainly supports signaling functions, and may be systematically biased by signaling drives, but that’s not the same thing as saying that preferences equal signaling, or that preferences are “about” signaling.
Well, this discussion might not be useful to either of us at this point, but I’ll give it one last go. My reason for bringing in talk of signaling is that throughout this conversation, it seems like one of the claims you have been making is that
(A) The algorithm (more accurately, the collection of algorithms) that constitutes me makes its decisions based on a weighting of my current and extrapolated states of mind. To the extent that I perceive preferences about things that are distinct from my mental states (and especially when confronting thought-experiments in which my mental states will knowably diverge from the mental states I would ordinarily form given certain features of the world), I am deceiving myself.
Now, I brought up signaling because I and many others already accept a form of (A), in which we’ve evolved to deceive others and ourselves about our real priorities because such signalers appear to others to be better potential friends, lovers, etc. It looks perfectly meaningful to me to declare such preferences “illusory”, since in point of fact we find rationalizations for choosing not what we signaled we prefer, but rather the least costly available signs of these ‘preferences’.
However, kin altruism appears to be a clear case where not all action is signaling, where making decisions that are optimized to actually benefit my relatives confers an advantage in total fitness to my genes.
While my awareness and my decisions exist on separate tracks, my decisions seem to come out as they would for a certain preference relation, one of whose attributes is a concern for my relatives’ welfare. Less concern, of course, than I consciously think I have for them; but roughly the right amount of concern for Hamilton’s Rule of kin selection.
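(For reference, Hamilton’s Rule is the standard kin-selection condition: a gene for helping a relative is favored roughly when

rB > C,

where r is the coefficient of relatedness between helper and relative, B is the fitness benefit to the relative, and C is the fitness cost to the helper.)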
My understanding, then, is that I have both conscious and real preferences; the former are what I directly feel, but the latter determine parts of my action and are partially revealed by analysis of how I act. (One component of my real preferences is social, and even includes the preference to keep signaling my conscious preferences to myself and others when it doesn’t cost me too much; this at least gives my conscious preferences some role in my actions.) If my actions predictably come out in accordance with the choices of an actual preference relation, then the term “preference” has to be applied there if it’s applied anywhere.
There’s still the key functional sense in which my anticipation of future world-states (and not just my anticipation of future mind-states) enters into my real preferences; I feel an emotional response now about the possibility of my sister dying and me never knowing, because that is the form that evaluation of that imagined world takes. Furthermore, the reason I feel that emotional response in that situation is because it confers an advantage to have one’s real preferences more finely tuned to “model of the future world” than “model of the future mind”, because that leads to decisions that actually help when I need to help.
This is what I mean by having my real preferences sometimes care about the state of the future world (as modeled by my present mind) rather than just my future experience (ditto). Do you disagree on a functional level; and if so, in what situation do you predict a person would feel or act differently than I’d predict? If our disagreement is just about what sort of language is helpful or misleading when taking about the mind, then I’d be relieved.
The confusion that you have here is that kin altruism is only “about” your relatives from the outside of you. Within the map that you have, you have no such thing as “kin altruism”, any more than a thermostat’s map contains “temperature regulation”. You have features that execute to produce kin altruism, as a thermostat’s features produce temperature regulation. However, just as a thermostat simply tries to make its sensor match its setting, so too do your preferences simply try to keep your “sensors” within a desired range.
This is true regardless of the evolutionary, signaling, functional, or other assumed “purposes” of your preferences, because the reality in which those other concepts exist, is not contained within the system those preferences operate in. It is a self-applied mind projection fallacy to think otherwise, for reasons that have been done utterly to death in my interactions with Vladimir Nesov in this thread. If you follow that logic, you’ll see how preferences, aboutness, and “natural categories” can be completely reduced to illusions of the mind projection fallacy upon close examination.
Well, if this is just a disagreement over whether our typical uses of the word “about” are justified, then I’m satisfied with letting go of this thread; is that the case, or do you think there is a disagreement on our expectations for specific human thoughts and actions?
I suggest, by the way, that your novel backwards application of the Mind Projection Fallacy needs its own name so as not to get it confused with the usual one. (Eliezer’s MPF denotes the problem with exporting our mental/intentional concepts outside the sphere of human beings; you seem to be asserting that we imported the notion of preferences from the external world in the first place.)
you seem to be asserting that we imported the notion of preferences from the external world in the first place
No. I’m saying that the common ideas of “preference” and “about” are mind projection fallacies, in the original sense of the phrase (which Eliezer did not coin, btw, but which he does use correctly). Preference-ness and about-ness are qualities (like “sexiness”) that are attributed as intrinsic properties of the world, but to be properly specified must include the one doing the attribution.
IOW, for your preferences to be “about” the world, there must be someone who is making this attribution of aboutness, as the aboutness itself does not exist in the territory, any more than “sexiness” exists in the territory.
However, you cannot make this attribution, because the thing you think of as “the territory” is really only your model of the territory.
Well, if this is just a disagreement over whether our typical uses of the word “about” are justified, then I’m satisfied with letting go of this thread; is that the case, or do you think there is a disagreement on our expectations for specific human thoughts and actions?
This can be viewed as purely a Russellian argument about language levels, but the practical point I originally intended to make was that humans cannot actually form preferences about the actual territory, because the only thing we can evaluate is our own experiences—which can be suspect. Inbuilt drives and biases are one source of experiences being suspect, but our own labeling of experiences is also suspect—labels are not only subject to random linkage, but are prone to spreading to related topics in time, space, or subject matter.
It is thus grossly delusional as a practical matter to assume that your preferences have anything to do with actual reality, as opposed to your emotionally-colored, recall-biased associations with imagined subsets of half-remembered experiences of events that occurred under entirely different conditions. (Plus, many preferences subtly lead to the recreation of circumstances that thwart the preference’s fulfillment—which calls into question precisely what “reality” that preference is about.)
Perhaps we could call our default thinking about such matters (i.e. preferences being about reality) “naive preferential realism”, by analogy to “naive moral realism”, as it is essentially the same error, applied to one’s own preferences rather than some absolute definition of good or evil.
This is pretty much what I meant by a semantic argument. If, as I’ve argued, my real preferences (as defined above) care about the projected future world (part of my map) and not just the projected future map (a sub-part of that map), then I see no difficulty with describing this by “I have preferences about the future territory”, as long as I remain aware that all the evaluation is happening within my map.
It is perhaps analogous to moral language in that when I talk about right and wrong, I keep in mind that these are patterns within my brain (analogous to those in other human brains) extrapolated from emotive desires, rather than objectively perceived entities. But with that understanding, right and wrong are still worth thinking about and discussing with others (although I need to be quite careful with my use of the terms when talking with a naive moral realist), since these are patterns that actually move me to act in certain ways, and to introspect in certain ways on my action and on the coherence of the patterns themselves.
In short, any theory of language levels or self-reference that ties you in Hofstadterian knots when discussing real, predictable human behavior (like the decision process for kin altruism) is problematic.
That said, I’m done with this thread. Thanks for an entertainingly slippery discussion!
ETA: To put it another way, learning about the Mind Projection Fallacy doesn’t mean you can never use the word “sexy” again; it just means that you should be aware of its context in the human mind, which will stop you from using it in certain novel but silly situations.
Consider the difference between a thermostat connected to a heater and a human maintaining the same temperature by looking at a thermometer and switching the heater on and off. Obviously there is a lot more going on inside the human’s brain, but I still don’t understand how the thermostat has any particular kind of connection to reality that the human lacks. The same applies whether the thermostat was built by humans with preferences or somehow formed without human design.
edit: I’m not trying to antagonize you, but I genuinely can’t tell whether you are trying to communicate something that I’m not understanding, or you’ve just read The Secret one too many times.
Obviously there is a lot more going on inside the human’s brain, but I still don’t understand how the thermostat has any particular kind of connection to reality that the human lacks.
The thermostat lacks the ability to reflect on itself, as well as the mind-projection machinery that deludes human beings into thinking that their preferences are “about” the reality they influence and are influenced by.
edit: I’m not trying to antagonize you, but I genuinely can’t tell whether you are trying to communicate something that I’m not understanding, or you’ve just read The Secret one too many times.
You’re definitely rounding to a cliche. The Secret folks think that our preferences create the universe, which is just as delusional as thinking our preferences are about the universe.
the trivial example of a thermostat, which shows a “preference” for reality being a certain way
Doesn’t it rather have a preference for its sensors showing a certain reading? (This doesn’t lead to thermostat wireheading because the thermostat’s action won’t make the sensor alter its mechanism.)
Really, it’s only systems that can model a scenario where its sensors say X but the situation is actually Y, that could possibly have preferences that go beyond the future readings of its sensors. If you assert that a thermostat can have preferences about the territory but a human can’t, then you are twisting language to an unhelpful degree.
Whether your preferences refer to your state, or to the rest of the world is indeed a wirehead-related issue. The problem with the idea that they refer to your state is that that idea tends to cause wirehead behaviour—surgery on your own brain to produce the desired state. So—it seems desirable to construct agents that believe that there is a real world, and that their preferences relate to it.
Whether your preferences refer to your state, or to the rest of the world is indeed a wirehead-related issue. The problem with the idea that they refer to your state is that that idea tends to cause wirehead behaviour—surgery on your own brain to produce the desired state. So—it seems desirable to construct agents that believe that there is a real world, and that their preferences relate to it.
I agree—that’s probably why humans appear to be constructed that way. The problem comes in when you expect the system to also be able to accurately reflect on its preferences, as opposed to just executing them.
This does not preclude the possibility of creating systems that can; it’s just that they’re purely hypothetical.
To the greatest extent practical, I try to write here only about what I know about the practical effects of the hardware we actually run on today, if for no other reason than if I got into entirely-theoretical discussions I’d post WAY more than I already do. ;-)
Presumably, if you asked such an agent to reflect on its own purposes, it would claim that they related to the external world (unless its aim was to deceive you about its purposes for signalling reasons, of course).
For example, it might claim that its aim was to save the whales—rather than to feel good about saving the whales. It could do the latter by taking drugs or via hypnotherapy—and that is not how it actually acts.
Presumably, if you asked such an agent to reflect on its own purposes, it would claim that they related to the external world (unless its aim was to deceive you about its purposes for signalling reasons, of course).
Actually, if signaling was its true purpose, it would claim the same thing. And if it were hacked together by evolution to be convincing, it might even do so by genuinely believing that its reflections were accurate. ;-)
For example, it might claim that its aim was to save the whales—rather than to feel good about saving the whales. It could do the latter by taking drugs or via hypnotherapy—and that is not how it actually acts.
Indeed. But in the case of humans, note first that many people do in fact take drugs to feel good, and second, that we tend to dislike being deceived. When we try to imagine getting hypnotized into believing the whales are safe, we react as we would to being deceived, not as we would if we truly believed the whales were safe. It is this error in the map that gives us a degree of feed-forward consistency, in that it prevents us from certain classes of wireheading.
However, it’s also a source of other errors, because in the case of self-fulfilling beliefs, it leads to erroneous conclusions about our need for the belief. For example, if you think your fear of being fired is the only thing getting you to work at all, then you will be reluctant to give up that fear, even if it’s really the existence of the fear that is suppressing, say, the creativity or ambition that would replace the fear.
In each case, the error is the same: System 2 projection of the future implicitly relies on the current contents of System 1's map, and does not take into account how that map would be different in the projected future.
(This is why, by the way, The Work’s fourth question is “who would you be without that thought?” The question is a trick to force System 1 to do a projection using the presupposition that the belief is already gone.)
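A crude way to picture that error (a toy sketch with purely illustrative names, not a claim about how the brain actually computes this): the forecast of “the belief is gone” is run against an otherwise unchanged copy of the current map, so whatever the belief was suppressing never shows up in the projection.

```python
# Toy model of the projection error: "System 2" evaluates a future without
# the fear, but consults a map in which the fear is still doing the motivating.

current_map = {"fear_of_firing": True, "ambition": False}

def projected_motivation(future_map: dict) -> bool:
    # Naive forecast: only motivators already present in the map count.
    return future_map["fear_of_firing"] or future_map["ambition"]

# The usual projection: delete the fear, change nothing else.
naive_future = dict(current_map, fear_of_firing=False)
print(projected_motivation(naive_future))    # False: "without the fear I'd stop working"

# What "who would you be without that thought?" tries to force: a map in which
# the fear is gone and what it was suppressing (here, ambition) has returned.
revised_future = dict(naive_future, ambition=True)
print(projected_motivation(revised_future))  # True
```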
Don’t take it lightly, it’s a well-vetted and well-understood position, extensively discussed and agreed upon. You should take such claims as strong evidence that you may have missed something crucial, that you need to go back and reread the standard texts.
It’s extensively discussed and agreed upon, that that is how we (for certain definitions of “we”) would like it to be, and it certainly has desirable properties for say, building Friendly AI, or any AI that doesn’t wirehead. And it is certainly a property of the human brain that it orients its preferences towards what it believes is the outside world—again, it has good consequences for preventing wireheading.
But that doesn’t make it actually true, just useful.
It’s also pretty well established as a tenet of e.g., General Semantics, that the “outside world” is unknowable, since all we can ever consciously perceive is our map. The whole point of discussing biases is that our maps are systematically biased—and this includes our preferences, which are being applied to our biased views of the world, rather than the actual world.
I am being descriptive here, not prescriptive. When we say we prefer a certain set of things to actually be true, we can only mean that we want the world to not dispute a certain map, because otherwise we are making the supernaturalist error of assuming that a thing could be true independent of the components that make it so.
To put it another way, if I say, “I prefer that the wings of this plane not fall off”, I am speaking about the map, since “wings” do not exist in the territory.
IOW, our statements about reality are about the intersection of some portion of “observable” reality and our particular mapping (division and labeling) of it. And it cannot be otherwise, since to even talk about it, we have to carve up and label the “reality” we are discussing.
It’s funny that you talk of wordplay a few comments back, as it seems that you’re the one making a technically-correct-but-not-practically-meaningful argument here.
If I may attempt to explore your position: Suppose someone claims a preference for “blue skies”. The wirehead version of this that you endorse is “I prefer experiences that include the perception I label ‘blue sky’”. The “anti-wirehead” version you seem to be arguing against is “I prefer actual states of the world where the sky is actually blue”.
You seem to be saying that since the preference is really about the experience of blue skies, it makes no sense to talk about the sky actually being blue. Chasing after external definitions involving photons and atmospheric scattering is beside the point, because the actual preference wasn’t formed in terms of them.
This becomes another example of the general rule that it’s impossible to form preferences directly about reality, because “reality” is just another label on our subjective map.
As far as specifics go, I think the point you make is sound: Most (all?) of our preferences can’t just be about the territory, because they’re phrased in terms of things that themselves don’t exist in the territory, but at best simply point at the slice of experience labeled “the territory”.
That said, I think this perspective grossly downplays the practical importance of that label. It has very distinct subjective features connecting in special ways to other important concepts. For the non-solipsists among us, perhaps the most important role it plays is establishing a connection between our subjective reality and someone else’s. We have reason to believe that it mediates experiences we label as “physical interactions” in a manner causally unaffected by our state of mind alone.
When I say “I prefer the galaxy not to be tiled by paperclips”, I understand that, technically, the only building blocks I have for that preference are labeled experiences and concepts that aren’t themselves the “stuff” of their referents. In fact, I freely admit that I’m not exactly sure what constitutes “the galaxy”, but the preference I just expressed actually contains a massive number of implicit references to other concepts that I consider causally connected to it via my “external reality” label. What’s more, most people I communicate with can easily access a seemingly similar set of connections to their “external reality” label, assuming they don’t talk themselves out of it.
The territory concept plays a similar role to that of an opaque reference in a programming language. Its state may not be invariant, but its identity is. I don’t have to know any true facts concerning its actual structure for it to be meaningful and useful. Just as photons aren’t explicitly required to subjectively perceive a blue sky, the ontological status of my territory concept doesn’t really change its meaning or importance, which is acquired through its intimate connection to massive amounts of raw experience.
Claiming my preferences about the territory are really just about my map is true in the narrow technical sense that it’s impossible for me to refer directly to “reality”, but doing so completely glosses over the deep, implicit connections expressed by such preferences, above all the connection between myself and the things I label “other consciousnesses”. In contrast, the perception of these connections seems to come for free by “confusing” the invariant identity of my territory concept with the invariant “existence” of a real external world. The two notions are basically isomorphic, so where’s the value in the distinction?
Think of what difference there is between “referring directly” to the outside reality and “referring directly” to the brain. Not much, methinks. There is no homunculus whose hands are only so long to reach the brain, but not long enough to touch your nose.
Agreed, as the brain is a physical object. Referring “directly” to subjective experiences is a different story though.
That depends on whether you’re talking about “blue” in terms of human experience, or whether you’re talking about wavelengths of light. The former is clearly “map”, whereas discussing wavelengths of light at least might be considered “about” the territory in some sense.
However, if you are talking about preferences, I don’t think there’s any way for a preference to escape the map. We can define how the mapped preference relates to some mapped portion of the territory, but preferences (on human hardware at least) can only be “about” experience.
And that’s because all of our hypothetical preferences for states of the actual world are being modeled as experiences in order to compute a preference.
The entire context of this discussion (for me, anyway) has been about preferences. That doesn’t preclude the possibility of relating maps to territory, or maps to other people’s maps. I’m just saying that human beings have no way to model their preferences except by modeling experiences, in human-sensory terms.
But when this statement is uttered by a human, it is almost invariably a lie. Because inside, the person’s model is an experience of “blue sky”. The preference is about experiences, not the state of the world.
Even if you phrase this as, “I prefer the sky to be actually blue, even if I don’t know it”, it is still a lie, because now you are modeling an experience of the sky being blue, plus an experience of you not knowing it.
Well, it makes clear some of the limits of certain endeavors that are often discussed here. It dissolves confusions about the best ways to make people happy, and whether a world should be considered “real” or “virtual”, and whether it’s somehow “bad” to be virtual.
But the most important practical benefit is that it helps in understanding why sometimes the best thing for a person may be to update their preferences to match the constraints of reality, rather than trying in vain to make reality fit their preferences.
Consider the “blue sky” preference. The experience of a blue sky is likely associated with, and significantly colored by, things like the person’s favorite color, the warmth of the sun, or the cool breeze of that perfect day one summer when they were in love. For another person, it might be associated with the blinding heat of the desert and a sensation of thirst… and these two people can then end up arguing endlessly about whether a blue sky is obviously good or bad.
And both are utterly deluded to think that their preferences have anything to do with reality.
I am not saying that they disagree about how many angstroms the light of a blue sky is, or that the sky doesn’t really exist or anything like that. I’m saying their preference is (and can only be) about their maps, because the mere fact of a blue sky has no inherent “preferability”, without reference to some other purpose.
Even if we try to say in the abstract that it is good because it’s correlated with things about our planet that make life possible, we can only have that preference because we’re modeling an experience of “life” that we label “good”. (And perhaps a depressed teenager would disagree, using a differently-labeled experience of “life”!)
This does not mean preferences are invalid or meaningless. It doesn’t mean that we should only change our preferences and ignore reality. However, to the extent that our preferences produce negative experiences, it is saner to remove the negative portion of the preference.
Luckily, human beings are not limited to, or required to have, bidirectional preferences. Feeling pain at the absence of something is not required in order to experience pleasure at its presence, in other words. (Or vice versa.)
Awareness of this fact, combined with an awareness that it is really the experience we prefer (and mainly, the somatic markers we have attached to the experience) makes it plain that the logical thing to do is to remove the negative label, and leave any positive labels in place.
However, if we think that our internal labeling of experience has something to do with “reality”, then we are likely to engage in the confusion of thinking that removing a negative label of badness will somehow create or prolong badness in the territory.
And for that matter, we may be under the mistaken impression that changing the reality out there will change our experience of it… and that is often not the case. As the saying goes, if you want to make a human suffer, you can either not give them what they want, or else you can give it to them! Humans tend to create subgoals that get promoted to “what I want” status, without reference to what the original desired experience was.
For example, looking for blue skies...
When what you really want is to fall in love again.
Wow, −5! People here don’t seem to appreciate this sort of challenge to their conceptual framework.
I agree, but I wonder if I failed to communicate the distinction I was attempting to make. The human-sensory experience of being embedded in a concrete, indifferent reality is (drugs, fantasies, and dreams aside) basically constant. It’s a fundamental thread underlying our entire history of experience.
It’s this indifference to our mental state that makes it special. A preference expressed in terms of “reality” has subjective properties that it would otherwise lack. Maybe I want the sky to be blue so that other people will possess a similar experience of it that we can share. “Blueness” may still be a red herring, but my preference now demands some kind of invariant between minds that seemingly cannot be mediated except through a shared external reality. You might argue that I really just prefer shared experiences, but this ignores the implied consistency between such experiences and all other experiences involving the external reality, something I claim to value above and beyond any particular experience.
This is where the massive implicit context enters the scene. “Even if I don’t know it” is modeled after experience only in the degenerate sense that it’s modeled after experience of indifferent causality. A translation might look like “I prefer to experience a reality with the sorts of consequences I would predict from the sky being blue, even if I don’t consciously perceive blue skies”. That’s still an oversimplification, but it’s definitely more complex than just invoking a generic memory of “not having known something” and applying it to blue skies.
I don’t see how any of that is true. I can easily think of different concrete realizations of “real” and “virtual” that would interact differently with my experience of reality, thus provoking different labellings of “good” and “bad”. If your point is merely that “real” is technically underspecified, then I agree. But I don’t see how you can draw inferences from this underspecification.
I’m going to have to turn your own argument against you here. To the extent that you have a concept of reality that is remotely consistent with your everyday experience, I claim that “in reality, blue skies are bad because they provoke suffering” is a preference stated in terms of an extremely similar reality-concept, plus a suffering-concept blended together from first-hand experience and compassion (itself also formed in terms of reality-as-connected-to-other-minds). For you to say it has “nothing to do with reality” is pure semantic hogwash. What definition of “reality” can you possibly be using to make this statement, except the one formed by your lifetime’s-worth of experience with indifferent causality? You seem to be denying the use of the term to relate your concept of reality to mine, despite their apparent similarity.
This doesn’t make sense to me. Whether or not an experience is “negative” is a function of our preferences. If a preference “produces” negative experiences, then either they’re still better than the alternative (in which case it’s a reasonable preference, and it’s probably worthwhile to change your perception of the experience) or they’re not (in which case it’s not a true preference, just delusion).
That’s a property of pain and pleasure, not preference. I may well decide not to feel pain due to preference X being thwarted, but I still prefer X, and I still prefer pleasure to the absence of pleasure.
This is where I think your oversimplification of “experience vs reality” produces invalid conclusions. Those labels don’t just apply to one experience or another, they apply to a massively complicated network of experience that I can’t even begin to hold in my mind at once. Given that, your logic doesn’t follow at all, because I really don’t know what I’m relabeling.
This relates to a general reservation I have with cavalier attitudes toward mind-hacks: I know full well that my preferences are complex, difficult to understand, and grossly underspecified in any conscious realization, so it’s not at all obvious to me that optimizing a simple preference concerning one particular scenario doesn’t carry loads of unintended consequences for the rest of them. I’ve had direct experience with my subconsciously directed behavior “making decisions for me” that I had conscious reasons to optimize against, only later to find out that my conscious understanding of the situation was flawed and incomplete. I think that ignoring the intuitive implications of an external reality leads to similar contradictions.
You seem to mostly be arguing against a strawman; as I said, I’m not saying reality doesn’t exist or that it’s not relevant to our experiences. What I’m saying is that preferences are composed of map, and while there are connections between that map and external reality, we are essentially deluded to think our preferences refer to actual reality. That delusion leads us to believe that changing external reality will change our internal experience, when the reverse is more often true. (That is, changing our internal experience is more likely to result in our taking actions that actually change external reality.)
Note, however, that:
Here you seem to be arguing my point. The experience is a function of preferences, the preferences are a product of, and point to, other experiences, in a self-sustaining loop that sometimes might as well not be connected to outside reality at all, for all that it has anything to do with what’s actually going on.
Lucky people live in a perpetual perception of good things happening, unlucky people the opposite, even when the same events are happening to both.
How can we say, then, that either person’s perceptions are “about” reality, if they are essentially unconditional? Clearly, something else is going on.
If we disagree at this point, I’d have to say it can only be because we disagree on what “about” means. When I say preferences are not “about” reality, it is in the same sense that Robin Hanson is always saying that politics is not “about” policy, etc.
Clearly, preferences are “about” reality in the same sense that politics are “about” policy. That is, reality is the subject of a preference, in the same way that a policy might be the subject of a political dispute.
However, in both cases, the point of the ostensible activity is not where it appears to be. In order for politics to function, people must sincerely believe that it is “about” policy, in precisely the same way as we must sincerely believe our preferences are “about” reality, in order to make them function—and for similar reasons.
But in neither case does either the sincerity or the necessity of the delusion change the fact that it’s nonetheless a delusion.
I don’t think I disagree with any of the above, except to dispute its full universality (which I’m not sure you’re even arguing). To attempt to rephrase your point: Our interactions with reality create experiences filtered through our particular way of characterizing such interactions. It’s these necessarily subjective characterizations (among other things), rather than the substance of the interaction itself, which generate our preferences. When reflecting on our preferences, we’re likely to look right past the interpretive layer we’ve introduced and attribute them to the external stimulus that produced the response, rather than the response itself.
Robin’s “X is not about Y” has the flavor of general, but not universal, rules. Would you extend your analogy to include this property?
Here’s an interesting question for you: why is it important that you consider this non-universal? What value does it provide you for me to concede an exception, or what difference will it make in your thinking if I say “yes” or “no”? I am most curious.
(Meanwhile, I agree with your summation as an accurate, if incomplete restatement of the bulk of my point.)
Because I’m trying to make sense of your position, but I don’t think I can with such a strict conclusion. I don’t see any fundamental reason why someone couldn’t form preferences more or less directly mediated by reality, it just seems that in practice, we don’t.
If you’re asking why I’m bringing up universality, it seemed clear that your claims about preferences were universal in scope until you brought up “X is not about Y”. “Must logically be” and “tends to be in practice” are pretty different types of statement.
You didn’t answer my questions.
I mean, you said some things that sound like answers, but they’re not answers to the questions I asked. Here they are again:
and
Your almost-answer was that you don’t think you can “make sense” of my position with a strict conclusion. Why is that? What would it mean for there to be a strict conclusion? How, specifically, would that be a problem?
I didn’t answer this because it’s predicated on an assumption that has no origin in the conversation. I never claimed that it was “important” for me to consider this non-universal. As per being “liberal in what I accept” in the realm of communication, I tried to answer the nearest meaningful question I thought you might actually be asking. I thought the phrase “If you’re asking why I’m bringing up universality” made my confusion sufficiently clear.
If you really do mean to ask me why I think it’s important that I believe in some property of preference formation, then either I’ve said something fairly obvious to that end that I’m not remembering (or finding), or you’re asserting your own inferences as the basis of a question, instead of its substance. I try to give people the benefit of the doubt that I’ve misunderstood them in such cases, rather than just assume they’re speaking manipulatively.
No particular value in mind. I suppose the greatest value would be in you solidly refuting such exceptions in a way that made sense to me, as that would be a more surprising (and therefore more informative) outcome. If you concede the exception, I don’t gain any additional insight, so that’s of fairly neutral value.
Not really sure yet, especially in the “no” case (since in that case you may have reasons I haven’t yet thought of or understood). I suppose in the “yes” case I’d have greater confidence that I knew what you were talking about if I encountered similar concepts in your comments elsewhere. This discussion has made some difference to my thinking: I don’t think I understood the thrust of your point when I originally complained that your distinction lacked relevance.
By strict conclusion, I mean “preferences are modeled strictly in terms of the map: it is logically impossible to hold a preference expressed in terms of anything other than what is expressed in the map”. This seems very nearly true, but vulnerable to counterexamples when taken as a general principle or as the logical result of some other general principle. I’ll elaborate if you’d like, but I thought I’d first confirm that you meant it that way. If you didn’t, theoretical or speculative counterexamples aren’t particularly relevant.
I can imagine that, in principle, some sort of mind other than a human’s might be capable of being a counterexample, apart from, say, the trivial example of a thermostat, which shows a “preference” for reality being a certain way. An AI could presumably be built so that its preferences were based on properties of the world, rather than on properties of its experience or deductions from other properties based on experience. However, at some point that would need to be rooted in the goal system provided by its programmers… who presumably based it on their own preferences… ;-) (Nonetheless, if the AI didn’t have anything we’d label “experience”, then I’d have to agree that it has a preference about reality, rather than about its experience of reality.)
I could also consider an argument that, say, hunger is about the state of one’s stomach, and that it therefore is “about” the territory, except that I’m not sure hunger qualifies as a preference, rather than an appetite or a drive. A person on a hunger strike or with anorexia still experiences hunger, yet prefers not to eat.
If you think you have other counterexamples, I’d like to hear them. I will be very surprised if they don’t involve some rather tortured reasoning and hypotheticals, though, or non-human minds. The only reason I even hedge my bets regarding humans is that (contrary to popular belief) I’m not under the mistaken impression that I have anything remotely approaching a complete theory of mind for human brains, versus a few crude maps that just happen to cover certain important chunks of “territory”. ;-)
I don’t actually consider this a good counterexample. It can be trivially shown that the thermostat’s “preference” is not in terms of the “reality” of temperature: just sabotage the sensor. The thermostat “prefers” its sensor reading to correspond to its set point. Wouldn’t you agree this is fairly analogous to plenty of human desires?
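To make the sabotage point concrete, here’s a minimal sketch (invented class and method names, not anyone’s actual code): the control loop only ever consults sensor.read(), so a spoofed sensor satisfies its “preference” regardless of what the room is actually doing.

```python
class Thermostat:
    def __init__(self, sensor, heater, set_point=20.0):
        self.sensor = sensor       # anything that answers read()
        self.heater = heater       # anything that answers on()/off()
        self.set_point = set_point

    def step(self):
        reading = self.sensor.read()   # the only "reality" the loop ever sees
        if reading < self.set_point:
            self.heater.on()
        else:
            self.heater.off()

class SpoofedSensor:
    def read(self):
        return 20.0                # always reports the set point

class LoggingHeater:
    def on(self):  print("heater on")
    def off(self): print("heater off")

# The room could be freezing; the sabotaged loop is perfectly content.
Thermostat(SpoofedSensor(), LoggingHeater()).step()   # prints "heater off"
```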
Agreed. The closest it seems you could come is to prefer satiation of said appetites, which is a subjective state.
Actually, human minds are the primary source of my reservations. I don’t think my reasoning is particularly tortured, but it certainly seems incomplete. Like you, I really have no idea what a mind is.
That said, I do seem to have preferences that concern other minds. These don’t seem reducible to experiences of inter-personal behavior… they seem largely rooted in the empathic impulse, the “mirror neurons”. Of course, on its face, this is still just built from subjective experience, right? It’s the experience of sympathetic response when modeling another mind. And there’s no question that this involves substituting my own experiences for theirs as part of the modeling process.
But when I reflect on a simple inter-personal preference like “I’d love for my friend to experience this”, I can’t see how it really reduces to pure experience, except as mediated by my concept of invariant reality. I don’t have a full anticipation of their reaction, and it doesn’t seem to be my experience of modeling their interaction that I’m after either.
Feel free to come up with a better explanation, but I find it difficult to deconstruct my desire to reproduce internally significant experiences in an external environment in a way that dismisses the role of “hard” reality. I can guess at the pre-reflective biological origin of this sort of preference, just like we can point at the biological origin of domesticated turkeys, but, just as turkeys can’t function without humans, I don’t know how it would function without some reasonable concept of a reality that implements things intrinsically inaccessible and indifferent to my own experience.
I chose to instantiate this particular example, but the general rule seems to be: The very fabric of what “another mind” means to me involves the concept of an objective but shared reality. The very fabric of what “another’s experiences” means to me involves the notion of an external system giving rise to external subjective experiences that bear some relation to my own.
You could claim my reasoning is tortured in that it resembles Russell’s paradox: One could talk about the set of all subjective preferences explicitly involving objective phenomena (i.e., not containing themselves). But it seems to me that I can in a sense relate to a very restricted class of objective preferences, those constructed from the vocabulary of my experience, reflected back into the world, and reinstantiated in the form of another mind.
Another simple example: Do you think a preference for honest communication is at all plausible? Doesn’t it involve something beyond “I hope the environment doesn’t trick me”?
Right. And don’t forget the mind-projection machinery that causes us to have, e.g., different inbuilt intuitions about things that are passively moved, move by themselves, or have faces that appear to express emotion. These are all inbuilt maps in humans.
Most of us learn by experience that sharing positive experiences with others results in positive attention. That’s all that would be needed, but it’s also likely that humans have an evolved appetite to communicate and share positive experiences with their allies.
It just means you prefer one class of experiences to another: experiences you have come to associate with other experiences or actions that come before them, or coincide with them.
The reason, btw, that I asked why it made a difference whether this is an absolute concept or a “mostly” concept, is that AFAICT, the idea that “some preferences are really about the territory” leads directly to “therefore, all of MY preferences are really about the territory”.
In contrast, thinking of all preferences as essentially delusional is a much better approach, especially if 99.999999999% of all human preferences are entirely about the map, presuming that maybe there are some enlightened Zen masters or Beisutsukai out there who’ve successfully managed, against all odds, to win the epistemic lottery and have an actual “about the territory” preference.
Even if the probability of having such a preference were much higher, viewing it as still delusional with respect to “invariant reality” (as you call it) does not introduce any error. So the consequences of erring on the side of delusion are negligible, and there is a significant upside to being more able to notice when you’re looping, subgoal stomping, or just plain deluded.
That’s why it’s of little interest to me how many 9’s there are on the end of that percentage, or whether in fact it’s 100%; the difference is inconsequential for any practical purpose involving human beings. (Of course, if you’re doing FAI, you probably want to do some deeper thinking than this, since you want the AI to be just as deluded as humans are, in one sense, but not as deluded in another.)
For the love of Bayes, NO. The people here are generally perfectly comfortable with the realization that much of their altruism, etc. is sincere signaling rather than actual altruism. (Same for me, before you ask.) So it’s not necessary to tell ourselves the falsehood that all of our preferences are only masked desires for certain states of mind.
As for your claim that the ratio of signaling to genuine preference is 1 minus epsilon, that’s a pretty strong claim, and it flies in the face of experience and certain well-supported causal models. For example, kin altruism is a widespread and powerful evolutionary adaptation; organisms with far less social signaling than humans are just hardwired to sacrifice at certain proportions for near relatives, because the genes that cause this flourish thereby. It is of course very useful for humans to signal even higher levels of care and devotion to our kin; but given two alleles such that
(X) makes a human want directly to help its kin to the right extent, plus a desire to signal to others and itself that it is a kin-helper, versus
(X’) makes a human only want to signal to others and itself that it is a kin-helper,
the first allele beats the second easily, because the second will cause searches for the cheapest ways to signal kin-helping, which ends up helping less than the optimal level for promoting those genes.
Thus we have a good deal of support for the hypothesis that our perceived preferences in some areas are a mix of signaling and genuine preferences, and not nearly 100% one or the other. Generally, those who make strong claims against such hypotheses should be expected to produce experimental evidence. Do you have any?
That’s nice, but not relevant, since I haven’t been talking about signaling.
Given that, I’m not going to go through the rest of your comment point by point, as it’s all about signaling and kin selection stuff that doesn’t in any way contest the idea that “preference is about experiences, not the reality being experienced”.
I don’t disagree with what you said, it’s just not in conflict with the main idea here. When I said that this is like Hanson’s “politics are not about policy”, I didn’t mean that it was therefore about signaling! (I said it was “not about” in the same way, not that it was about in the same way—i.e., that the mechanism of delusion was similar.)
The way human preferences work certainly supports signaling functions, and may be systematically biased by signaling drives, but that’s not the same thing as saying that preferences equal signaling, or that preferences are “about” signaling.
Well, this discussion might not be useful to either of us at this point, but I’ll give it one last go. My reason for bringing in talk of signaling is that throughout this conversation, it seems like one of the claims you have been making is the following (call it A):
The algorithm (more accurately, the collection of algorithms) that constitutes me makes its decisions based on a weighting of my current and extrapolated states of mind. To the extent that I perceive preferences about things that are distinct from my mental states (and especially when confronting thought-experiments in which my mental states will knowably diverge from the mental states I would ordinarily form given certain features of the world), I am deceiving myself.
Now, I brought up signaling because I and many others already accept a form of (A), in which we’ve evolved to deceive others and ourselves about our real priorities because such signalers appear to others to be better potential friends, lovers, etc. It looks perfectly meaningful to me to declare such preferences “illusory”, since in point of fact we find rationalizations for choosing not what we signaled we prefer, but rather the least costly available signs of these ‘preferences’.
However, kin altruism appears to be a clear case where not all action is signaling, where making decisions that are optimized to actually benefit my relatives confers an advantage in total fitness to my genes.
While my awareness and my decisions exist on separate tracks, my decisions seem to come out as they would for a certain preference relation, one of whose attributes is a concern for my relatives’ welfare. Less concern, of course, than I consciously think I have for them; but roughly the right amount of concern for Hamilton’s Rule of kin selection.
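(For reference, the standard form of Hamilton’s Rule: a kin-directed altruistic act is favored by selection when rB > C, where r is the coefficient of relatedness between actor and recipient, B is the fitness benefit to the recipient, and C is the fitness cost to the actor.)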
My understanding, then, is that I have both conscious and real preferences; the former are what I directly feel, but the latter determine parts of my action and are partially revealed by analysis of how I act. (One component of my real preferences is social, and even includes the preference to keep signaling my conscious preferences to myself and others when it doesn’t cost me too much; this at least gives my conscious preferences some role in my actions.) If my actions predictably come out in accordance with the choices of an actual preference relation, then the term “preference” has to be applied there if it’s applied anywhere.
There’s still the key functional sense in which my anticipation of future world-states (and not just my anticipation of future mind-states) enters into my real preferences; I feel an emotional response now about the possibility of my sister dying and me never knowing, because that is the form that evaluation of that imagined world takes. Furthermore, the reason I feel that emotional response in that situation is because it confers an advantage to have one’s real preferences more finely tuned to “model of the future world” than “model of the future mind”, because that leads to decisions that actually help when I need to help.
This is what I mean by having my real preferences sometimes care about the state of the future world (as modeled by my present mind) rather than just my future experience (ditto). Do you disagree on a functional level, and if so, in what situation do you predict a person would feel or act differently than I’d predict? If our disagreement is just about what sort of language is helpful or misleading when talking about the mind, then I’d be relieved.
The confusion that you have here is that kin altruism is only “about” your relatives from the outside of you. Within the map that you have, you have no such thing as “kin altruism”, any more than a thermostat’s map contains “temperature regulation”. You have features that execute to produce kin altruism, as a thermostat’s features produce temperature regulation. However, just as a thermostat simply tries to make its sensor match its setting, so too do your preferences simply try to keep your “sensors” within a desired range.
This is true regardless of the evolutionary, signaling, functional, or other assumed “purposes” of your preferences, because the reality in which those other concepts exist, is not contained within the system those preferences operate in. It is a self-applied mind projection fallacy to think otherwise, for reasons that have been done utterly to death in my interactions with Vladimir Nesov in this thread. If you follow that logic, you’ll see how preferences, aboutness, and “natural categories” can be completely reduced to illusions of the mind projection fallacy upon close examination.
Well, if this is just a disagreement over whether our typical uses of the word “about” are justified, then I’m satisfied with letting go of this thread; is that the case, or do you think there is a disagreement on our expectations for specific human thoughts and actions?
I suggest, by the way, that your novel backwards application of the Mind Projection Fallacy needs its own name so as not to get it confused with the usual one. (Eliezer’s MPF denotes the problem with exporting our mental/intentional concepts outside the sphere of human beings; you seem to be asserting that we imported the notion of preferences from the external world in the first place.)
No. I’m saying that the common ideas of “preference” and “about” are mind projection fallacies, in the original sense of the phrase (which Eliezer did not coin, btw, but which he does use correctly). Preference-ness and about-ness are qualities (like “sexiness”) that are attributed as intrinsic properties of the world, but to be properly specified must include the one doing the attribution.
IOW, for your preferences to be “about” the world, there must be someone who is making this attribution of aboutness, as the aboutness itself does not exist in the territory, any more than “sexiness” exists in the territory.
However, you cannot make this attribution, because the thing you think of as “the territory” is really only your model of the territory.
This can be viewed as purely a Russellian argument about language levels, but the practical point I originally intended to make was that humans cannot actually form preferences about the actual territory, because the only things we can evaluate are our own experiences—which can be suspect. Inbuilt drives and biases are one source of experiences being suspect, but our own labeling of experiences is also suspect—labels are not only subject to random linkage, but are prone to spreading to related topics in time, space, or subject matter.
It is thus grossly delusional as a practical matter to assume that your preferences have anything to do with actual reality, as opposed to your emotionally-colored, recall-biased associations with imagined subsets of half-remembered experiences of events that occurred under entirely different conditions. (Plus, many preferences subtly lead to the recreation of circumstances that thwart the preference’s fulfillment—which calls into question precisely what “reality” that preference is about.)
Perhaps we could call our default thinking about such matters (i.e. preferences being about reality) “naive preferential realism”, by analogy to “naive moral realism”, as it is essentially the same error, applied to one’s own preferences rather than some absolute definition of good or evil.
This is pretty much what I meant by a semantic argument. If, as I’ve argued, my real preferences (as defined above) care about the projected future world (part of my map) and not just the projected future map (a sub-part of that map), then I see no difficulty with describing this by “I have preferences about the future territory”, as long as I remain aware that all the evaluation is happening within my map.
It is perhaps analogous to moral language in that when I talk about right and wrong, I keep in mind that these are patterns within my brain (analogous to those in other human brains) extrapolated from emotive desires, rather than objectively perceived entities. But with that understanding, right and wrong are still worth thinking about and discussing with others (although I need to be quite careful with my use of the terms when talking with a naive moral realist), since these are patterns that actually move me to act in certain ways, and to introspect in certain ways on my action and on the coherence of the patterns themselves.
In short, any theory of language levels or self-reference that ties you in Hofstadterian knots when discussing real, predictable human behavior (like the decision process for kin altruism) is problematic.
That said, I’m done with this thread. Thanks for an entertainingly slippery discussion!
ETA: To put it another way, learning about the Mind Projection Fallacy doesn’t mean you can never use the word “sexy” again; it just means that you should be aware of its context in the human mind, which will stop you from using it in certain novel but silly situations.
Consider the difference between a thermostat connected to a heater and a human maintaining the same temperature by looking at a thermometer and switching the heater on and off. Obviously there is a lot more going on inside the human’s brain, but I still don’t understand how the thermostat has any particular kind of connection to reality that the human lacks. The same applies whether the thermostat was built by humans with preferences or somehow formed without human design.
edit: I’m not trying to antagonize you, but I genuinely can’t tell whether you are trying to communicate something that I’m not understanding, or you’ve just read The Secret one too many times.
The thermostat lacks the ability to reflect on itself, as well as the mind-projection machinery that deludes human beings into thinking that their preferences are “about” the reality they influence and are influenced by.
You’re definitely rounding to a cliche. The Secret folks think that our preferences create the universe, which is just as delusional as thinking our preferences are about the universe.
You don’t understand how something can be about something else, but declare it meaningless.
Doesn’t it rather have a preference for its sensors showing a certain reading? (This doesn’t lead to thermostat wireheading because the thermostat’s action won’t make the sensor alter its mechanism.)
Really, it’s only systems that can model a scenario where their sensors say X but the situation is actually Y that could possibly have preferences going beyond the future readings of their sensors. If you assert that a thermostat can have preferences about the territory but a human can’t, then you are twisting language to an unhelpful degree.
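One way to picture that distinction (a toy sketch with made-up function names, not anyone’s proposal): a sensor-only agent scores outcomes by its predicted readings, while a model-based agent scores them by its predicted world state, which is what lets it represent “my sensor says X but the situation is actually Y”.

```python
SET_POINT = 20.0

def sensor_only_penalty(predicted_reading):
    # Scores outcomes purely by what the sensor is expected to say.
    return abs(predicted_reading - SET_POINT)

def model_based_penalty(predicted_world_temp):
    # Scores outcomes by the agent's estimate of the world state itself.
    return abs(predicted_world_temp - SET_POINT)

# Scenario: the sensor is stuck reporting the set point, but the agent's
# world-model predicts the room will actually drop to 5 degrees.
print(sensor_only_penalty(20.0))   # 0.0  -> "nothing to fix"
print(model_based_penalty(5.0))    # 15.0 -> "something is wrong out there"
```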
Whether your preferences refer to your state, or to the rest of the world is indeed a wirehead-related issue. The problem with the idea that they refer to your state is that that idea tends to cause wirehead behaviour—surgery on your own brain to produce the desired state. So—it seems desirable to construct agents that believe that there is a real world, and that their preferences relate to it.
I agree—that’s probably why humans appear to be constructed that way. The problem comes in when you expect the system to also be able to accurately reflect its preferences, as opposed to just executing them.
This does not preclude the possibility of creating systems that can; it’s just that they’re purely hypothetical.
To the greatest extent practical, I try to write here only about what I know about the practical effects of the hardware we actually run on today, if for no other reason than if I got into entirely-theoretical discussions I’d post WAY more than I already do. ;-)
Presumably, if you asked such an agent to reflect on its own purposes, it would claim that they related to the external world (unless its aim was to deceive you about its purposes for signalling reasons, of course).
For example, it might claim that its aim was to save the whales—rather than to feel good about saving the whales. It could do the latter by taking drugs or via hypnotherapy—and that is not how it actually acts.
Actually, if signaling was its true purpose, it would claim the same thing. And if it were hacked together by evolution to be convincing, it might even do so by genuinely believing that its reflections were accurate. ;-)
Indeed. But in the case of humans, note first that many people do in fact take drugs to feel good, and second, that we tend to dislike being deceived. When we try to imagine getting hypnotized into believing the whales are safe, we react as we would to being deceived, not as we would if we truly believed the whales were safe. It is this error in the map that gives us a degree of feed-forward consistency, in that it prevents us from certain classes of wireheading.
However, it’s also a source of other errors, because in the case of self-fulfilling beliefs, it leads to erroneous conclusions about our need for the belief. For example, if you think your fear of being fired is the only thing getting you to work at all, then you will be reluctant to give up that fear, even if it’s really the existence of the fear that is suppressing, say, the creativity or ambition that would replace the fear.
In each case, the error is the same: System 2 projection of the future implicitly relies on the current contents of System 1's map, and does not take into account how that map would be different in the projected future.
(This is why, by the way, The Work’s fourth question is “who would you be without that thought?” The question is a trick to force System 1 to do a projection using the presupposition that the belief is already gone.)