Hi Jonatas,

I am having a hard time understanding the argument. In general terms, I take you to be arguing that some kind of additive impartial total hedonistic utilitarianism is true, and would be discovered by, and motivating to, any “generally intelligent” reasoner. Is that right?
My rough guess at your argument, knowing that I am having difficulty following your meaning, is something like this:
1. Pleasure and pain are intrinsically motivating for brains or algorithms organized around them; indeed, we use their role in motivation to pick them out in our language.
2. All of the ways in which people’s seemingly normative attitudes, preferences, intuitions and desires differ can be reduced to vehicles of pleasure or pain.
3. Intelligent reflection about one’s own desires then leads to egoistic hedonism.
4. Our concepts of personal identity are hard to reconcile with physicalism, so intelligent beings would be eliminativist about personal identity, and conclude that egoism is untenable as there is no “ego.”
5. In the absence of the “egoistic” component, one is left with non-egoistic hedonism, which turns into total additive utilitarianism as one tries to satisfy some kind of aggregate of all desires, all of which are desires about pleasure/pain (see #2).
However, the piece as it is now is a bit too elliptical for me to follow: you make various points quickly without explaining the arguments for them, which makes it hard for me to be sure what you mean, or the reasons for believing it. I felt most in need of further explanation on the following passages:
> The problem with choosing values is obvious: making errors. Human beings are biologically and constitutionally very similar, and given this, if they objectively and rightfully differ in correct values, it is only in aesthetic preferences, by an existing biological difference
There is a complex web of assumptions here, and it’s very hard for me to be clear what you mean, although I have some guesses.
> theoretically, any input (aesthetic preferences) could be associated with a certain output (good and bad feelings), or even no input at all, as in spontaneous satisfaction or wire-heading. In terms of output, good feelings and bad feelings always get positive and negative value, by definition.
What kind of value do you mean here? Impersonal ethical value? Impact on behavior? Different sorts of pleasurable and painful experience affect motivation and behavior differently, and motivation does not respond to pleasure or pain as such, but to some discounted transformation thereof. E.g. people will accept a pain 1 hour hence in exchange for a reward immediately when they would not take the reverse deal.
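To make the discounting point concrete, here is a minimal sketch of a discounted transformation that produces exactly this reversal; the hyperbolic form and all numbers are illustrative assumptions, not anything specified in the thread:

```python
# Toy model: an agent weighs experiences by a hyperbolic discount factor
# 1 / (1 + k * delay). With equal and opposite hedonic magnitudes, it
# accepts "reward now, pain in 1 hour" but rejects the reverse deal,
# even though the undiscounted totals of both deals are identical (zero).
# The hyperbolic form and the constants are illustrative assumptions.

def discounted(value, delay_hours, k=2.0):
    """Motivational weight of an experience `delay_hours` in the future."""
    return value / (1.0 + k * delay_hours)

reward, pain = 10.0, -10.0  # equal and opposite hedonic magnitudes

deal_a = discounted(reward, 0) + discounted(pain, 1)  # reward now, pain later
deal_b = discounted(pain, 0) + discounted(reward, 1)  # pain now, reward later

print(deal_a)  # 10 - 10/3 ≈ +6.67 -> accepted
print(deal_b)  # -10 + 10/3 ≈ -6.67 -> rejected
```

On this toy model the agent’s choices track the discounted quantity rather than pleasure and pain as such, which is the gap the question is pointing at.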
> Good and bad feelings are directly felt as positive and desirable, or negative and aversive, and this direct verification gives them the highest epistemological value. What is indirectly felt, such as the world around us, science, or physical theories
Does this apply to other directly felt moral intuitions, like anger or fairness? Later you say that our best theories show that personal identity is an illusion, despite our perception of continued existence over time, and so we would discard it. What distinguishes the two?
> Likewise, only conscious beings can be said to be ethically relevant in themselves, while what goes on in the hot magma at the core of the earth, or on a random rock on Pluto, is not. Consciousness creates a subject of experience, which is required for direct ethical value.
The flat assertion that only conscious experiences have value is opposed by the flat assertions of other philosophers that other things are of value. How is it exactly that increased working memory or speed of thought would change this?
> Good and bad feelings (or conscious experiences) are physical occurrences, and therefore objectively good and bad occurrences, and objective value.
How are good and bad feelings physical occurrences in a way that knowledge or health or equality or the existence of other outcomes that people desire are not?
> There is no logical basis for privileging a physical organism’s own viewpoint,
Earlier you privileged pleasure as a value because it is directly experienced. But an organism directly experiences, and is conditioned or reinforced by, its own pain or pleasure.
> Furthermore they would understand that the free variation of values, even in comparable causal chains of biologically similar organisms, comes from error
Error in what sense? If desires are mostly learned through reward and anticipations of reward, one can note when the resulting desires do not maximize some metric of personal pleasure or pain (e.g. the desire to be remembered after one dies, or for equality). But why identify with the usual tendency of reinforcement learning rather than the actual attitudes and desires one has?
One could make a similar argument about evolution, claiming that any activity which does not maximize reproductive fitness is a mistake, even if desired or pleasurable. Or if one was created by one’s parents to achieve some particular end, one could say that it is an error relative to that end to pursue some other goal.
So what is the standard of error, and why be moved by it, rather than others?
Hi Carl,

Thank you for a thoughtful comment. I am not used to writing didactically, so forgive my excessive conciseness.
You understood my argument well, in the 5 points, with the detail that I define value as good and bad feelings rather than pleasure, happiness, suffering and pain. The former definition allows for subjective variation and universality, while the latter utilitarian definition is too narrow and anthropocentric, and could be contested on these grounds.
> What kind of value do you mean here? Impersonal ethical value? Impact on behavior? Different sorts of pleasurable and painful experience affect motivation and behavior differently, and motivation does not respond to pleasure or pain as such, but to some discounted transformation thereof. E.g. people will accept a pain 1 hour hence in exchange for a reward immediately when they would not take the reverse deal.
I mean ethical value, but not necessarily impact on behavior or motivation. Indeed, people do accept trades between good and bad feelings, and they can be biased in terms of motivation.
> Does this apply to other directly felt moral intuitions, like anger or fairness? Later you say that our best theories show that personal identity is an illusion, despite our perception of continued existence over time, and so we would discard it. What distinguishes the two?
It does not apply in the same way to other moral intuitions, like anger or fairness. The latter are directly felt in some way, and in this sense they are real, but they also have a context related to the world that is indirectly felt and could be false. Anger, for instance, can be directly felt as a bad feeling, but its causation and subsequent behavioral motivation relate to the outside world, and are on another level of certainty (not as certain). Likewise, it could be said that whatever caused good or bad feelings (such as kissing a woman) is not universal and not as certain as the good feeling itself, which was caused by it in a person and directly verified by them. This person doesn’t know if he is inside a Matrix virtual world, and whether the woman was really a woman or just computer data, but he knows that the kiss led to directly felt good feelings. The distinction is that one relates to the outside world, and the other relates to the experience itself.
> How are good and bad feelings physical occurrences in a way that knowledge or health or equality or the existence of other outcomes that people desire are not?
Good question. The goodness and badness of feelings is directly felt as such, and is a datum of highest certainty about the world, while the goodness or badness of these other physical occurrences (which are indirectly felt) is not data, but inferences, which, though generally trustworthy, need to be justified eventually by being connected to intrinsic values.
> Earlier you privileged pleasure as a value because it is directly experienced. But an organism directly experiences, and is conditioned or reinforced by, its own pain or pleasure.
Indeed. However, in acting on the world, an organism has to assume a model of the world which it is going to trust as true, in order to act ethically. In this model of the world, in the world as it appears to us, the organism would consider the nature of personal identity and not privilege its own viewpoint. However, you are right that, strictly speaking, one’s own experiences are more certain than those of others. The difference in this certainty could be thought of as the difference between direct conscious feelings and physical theories. Let’s say that the former get ascribed a certainty of 100%, while the latter get 95%. The organism might then place 5% more value on its own experiences, not fundamentally, but based on the solipsistic hypothesis that other people are zombies, or that they don’t really exist.
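Read as expected value under uncertainty, this weighting might be sketched as follows; the 100%/95% figures are the ones given above, while the expected-value framing and the code are an added assumption, not a definitive formalization:

```python
# A sketch of the certainty weighting described above: one's own feelings
# are directly verified (taken as probability 1.00 of being real), while
# others' feelings are inferred through a world model trusted at 0.95,
# reflecting the solipsistic doubt mentioned. Numbers are illustrative.

p_own = 1.00     # certainty that one's own feelings are real (directly felt)
p_others = 0.95  # trust in the world model implying other minds exist

def expected_value(hedonic_value, p_real):
    """Weight an experience's value by the probability that it is real."""
    return hedonic_value * p_real

feeling = 10.0
print(expected_value(feeling, p_own))     # 10.0 when the feeling is one's own
print(expected_value(feeling, p_others))  # 9.5 when it is someone else's
```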
> Error in what sense? If desires are mostly learned through reward and anticipations of reward, one can note when the resulting desires do not maximize some metric of personal pleasure or pain (e.g. the desire to be remembered after one dies, or for equality). But why identify with the usual tendency of reinforcement learning rather than the actual attitudes and desires one has?
I meant, in that case, intrinsic values. But what you meant, for instance for equality, can be thought of as instrumental values. Instrumental values are taken as heuristics, or in decision theory as patterns of behavior, that usually lead to intrinsic values. Indeed, in order to achieve direct or intrinsic value, the best way tends to be following instrumental values, such as working, learning, increasing longevity… I argue that the validity of these can be examined by the extent to which they lead to direct value, that is, good and bad feelings, in a non-personal way.
OK, that is the interpretation I found less convincing. The bare axiomatic normative claim that all the desires and moral intuitions not concerned with pleasure as such are errors with respect to maximization of pleasure isn’t an argument for adopting that standard.
And given the admission that biological creatures can and do want things other than pleasure, have other moral intuitions and motivations, and the knowledge that we can and do make computer programs with preferences defined over some model of their environment that do not route through an equivalent of pleasure and pain, the connection from moral philosophy to empirical prediction is on shakier ground than the purely normative assertions.
> The goodness and badness of feelings is directly felt as such, and is a datum of highest certainty about the world, while the goodness or badness of these other physical occurrences (which are indirectly felt) is not data, but inferences, which, though generally trustworthy, need to be justified eventually by being connected to intrinsic values.
But why? You seem to be just giving an axiom without any further basis, that others don’t accept.
> In this model of the world, in the world as it appears to us, the organism would consider the nature of personal identity and not privilege its own viewpoint.
Once one is valuing things in a model of the world, why stop at your particular axiom? And people do have reactions of approval to their mental models of an equal society, or a diversity of goods, or perfectionism, which are directly experienced.
> But what you meant, for instance for equality, can be thought of as instrumental values.
You can say that pursuing something vaguely like X, which people feel is morally good or obligatory as such, is instrumental in the pursuit of Y. But that doesn’t change the pursuit of X, even in conflict with Y.
Carl, for the sake of readability, LessWrong implements markdown, and in particular the block quote feature. Place a “>” before each paragraph that is a quote.
> OK, that is the interpretation I found less convincing. The bare axiomatic normative claim that all the desires and moral intuitions not concerned with pleasure as such are errors with respect to maximization of pleasure isn’t an argument for adopting that standard.
The argument for adopting that standard was based on the epistemological primacy of the goodness and badness of good and bad feelings, while other hypothetical intrinsic values could be such only by much less certain inference. But I’d also argue that the nature of how the world is perceived necessitates conscious subjects, and reason that, in the lack of them, or in a universe eternally without consciousness, nothing could possibly matter ethically. Consciousness is therefore given special status, and good and bad relate to it.
> And given the admission that biological creatures can and do want things other than pleasure, have other moral intuitions and motivations, and the knowledge that we can and do make computer programs with preferences defined over some model of their environment that do not route through an equivalent of pleasure and pain, the connection from moral philosophy to empirical prediction is on shakier ground than the purely normative assertions.
Biological creatures indeed have other preferences, but I classify those in the error category, as Eliezer justifies in CEV. Their validity could be argued on a case-by-case basis, though. Machines could be made unconscious, or without the capacity for good and bad feelings; they would then need to infer the existence of these by observing living organisms and their culture (in which case their certainty would be similar to that of their world model), or possibly, if very intelligent, by deducing it from scratch (if this is even possible); otherwise they might be morally anti-realist. In the lack of real values, I suppose, they would have no logical reason to act one way or another, considering meta-ethics.
> Once one is valuing things in a model of the world, why stop at your particular axiom? And people do have reactions of approval to their mental models of an equal society, or a diversity of goods, or perfectionism, which are directly experienced.

> You can say that pursuing something vaguely like X, which people feel is morally good or obligatory as such, is instrumental in the pursuit of Y. But that doesn’t change the pursuit of X, even in conflict with Y.
I think that these values need to be justified somehow. I see them as instrumental values, for their tendency to lead to the direct values of good feelings, which take a special status by being directly verified as good. Decision theory and practical ethics are very complex, and sometimes one would take an instrumentally valuable action even to the detriment of a direct value, if the action is expected to give even more direct value in the future. For instance, one might spend a lot of time learning philosophical topics, even to the detriment of direct pleasure, if one sees this as likely to be important to the world, causing good feelings or preventing bad feelings in an unclear but potentially significant way.