Kaj_Sotala comments on Yet more “stupid” questions

Kaj_Sotala 28 Aug 2013 17:27 UTC
3 points
Every now and then, there are discussions or comments on LW where people talk about finding a “correct” morality, or where they argue that some particular morality is “mistaken”. (Two recent examples: [1] [2]) Now I would understand that in an FAI context, where we want to find such a specification for an AI that it won’t do something that all humans would find terrible, but that’s generally not the context of those discussions. Outside such a context, it sounds like people were presuming the existence of an objective morality, but I thought that folks on LW rejected that. What’s up with that?
- RolfAndreassen 28 Aug 2013 18:41 UTC
  6 points
  Parent
  Objective morality in one (admiitedly rather long) sentence: For any moral dilemma, there is some particular decision you would make after a thousand years of collecting information, thinking, upgrading your intelligence, and reaching reflective equilibrium with all other possible moral dilemmas; this decision is the same for all humans, and is what we refer to when we say that an action is ‘correct’.
  - Kaj_Sotala 28 Aug 2013 20:10 UTC
    8 points
    Parent
    I find that claim to be very implausible: to name just one objection to it, it seems to assume that morality is essentially “logical” and based on rational thought, whereas in practice moral beliefs seem to be much more strongly derived from what the people around us believe in. And in general, the hypothesis that all moral beliefs will eventually converge seems to be picking out a very narrow region in the space of possible outcomes, whereas “beliefs will diverge” contains a much broader space. Do you personally believe in that claim?
    - niceguyanon 29 Aug 2013 5:21 UTC
      1 point
      Parent
      I’m not sure what I was expecting, but I was a little surprised after seeing you say you object to objective morality. I probably don’t understand CEV well enough and I am pretty sure this is not the case, but it seems like there is so much similarity between CEV and some form of objective morality as described above. In other words, if you don’t think moral beliefs will eventually converge, given enough intelligence, reflection, and gathering data, etc, then how do you convince someone that FAI will make the “correct” decisions based on the extrapolated volition?
      - Kaj_Sotala 29 Aug 2013 6:05 UTC
        4 points
        Parent
        CEV in its current form is quite under-specified. I expect that there would exist many, many different ways of specifying it, each of which would produce a different CEV that would converge at a different solution.
        
        For example, Tarleton (2010) notes that CEV is really a family of algorithms which share the following features:
        
        Meta-algorithm: Most of the AGI’s goals will be obtained at run-time from human minds, rather than explicitly programmed in before run-time.
        Factually correct beliefs: The AGI will attempt to obtain correct answers to various factual questions, in order to modify preferences or desires that are based upon false factual beliefs.
        Singleton: Only one superintelligent AGI is to be constructed, and it is to take control of the world with whatever goal function is decided upon.
        Reflection: Individual or group preferences are reflected upon and revised.
        Preference aggregation: The set of preferences of a whole group are to be combined somehow.
        
        He comments:
        
        The set of factually correcting, singleton, reflective, aggregative meta-algorithms is larger than just the CEV algorithm. For example, there is no reason to suppose that factual correction, reflection, and aggregation, performed in any order, will give the same result; therefore, there are at least 6 variants depending upon ordering of these various processes, and many variants if we allow small increments of these processes to be interleaved. CEV also stipulates that the algorithm should extrapolate ordinary human-human social interactions concurrently with the processes of reflection, factual correction and preference aggregation; this requirement could be dropped.
        
        Although one of Eliezer’s desired characteristics for CEV was to ”avoid creating a motive for modern-day humans to fight over the initial dynamic”, a more rigorous definition of CEV will probably require making many design choices for which there will not be any objective answer, and which may be influenced by the designer’s values. The notion that our values should be extrapolated according to some specific criteria is by itself a value-laden proposal: it might be argued that it was enough to start off from our current-day values just as they are, and then incorporate additional extrapolation only if our current values said that we should do so. But doing so would not be a value-neutral decision either, but rather one supporting the values of those who think that there should be no extrapolation, rather than of those who think there should be.
        
        I don’t find any of these issues to be problems, though: as long as CEV found any of the solutions in the set-of-final-values-that-I-wouldn’t-consider-horrible, the fact that the solution isn’t unique isn’t much of an issue. Of course, it’s quite possible that CEV will hit on some solution in that set that I would judge to be inferior to many others also in that set, but so it goes.
    - RolfAndreassen 28 Aug 2013 22:19 UTC
      0 points
      Parent
      
      Do you personally believe in that claim?
      
      It seems there are two claims: One, that each human will be reflectively self-consistent given enough time; two, that the self-consistent solution will be the same for all humans. I’m highly confident of the first; for the second, let me qualify slightly:
      
      Not all human-like things are actually humans, eg psychopaths. Some of these may be fixable.
      Some finite tolerance is implied when I say “the same” solution will be arrived at.
      
      With those qualifications, yes, I believe the second claim with, say, 85% confidence.
      - Kaj_Sotala 29 Aug 2013 7:37 UTC
        3 points
        Parent
        I find the first claim plausible though not certain, but I would expect that if such individual convergence happens, it will lead to collective divergence not convergence.
        
        When we are young, our moral intuitions and beliefs are a hodge-podge of different things, derived from a wide variety of sources, probably reflecting something like a “consensus morality” that is the average of different moral positions in society. If/when we begin to reflect on these intuitions and beliefs, we will find that they are mutually contradictory. But one person’s modus ponens is another’s modus tollens: faced with the fact that a utilitarian intuition and a deontological intuition contradict each other, say, we might end up rejecting the utilitarian conclusion, rejecting the deontological conclusion, or trying to somehow reconcile them. Since logic by itself does not tell us which alternative we should choose, it becomes determined by extra-logical factors.
        
        Given that different people seem to arrive at different conclusions when presented with such contradictory cases, and given that their judgement seems to be at least weakly predicted by their existing overall leanings, I would guess that the choice of which intuition to embrace would depend on their current balance of other intutions. Thus, if you are already leaning utilitarian, the intuitions which are making you lean that way may combine together and cause you to reject the deontological intuition, and vice versa if you’re learning deontologist. This would mean that a person who initially started with an even mix of both intuitions would, by random drift, eventually end up in a position where one set of intuitions was dominant, after which there would be a self-reinforcing trajectory towards an area increasingly dominated by intuitions compatible with the ones currently dominant. (Though of course the process that determines which intuitions get accepted and which ones get rejected is nowhere as simple as just taking a “majority vote” of intuitions, and some intuitions may be felt so strongly that they are almost impossible to reject.) This would mean that as people carried out self-reflection, their position would end up increasingly idiosyncratic and distant from the consensus morality. This seems to be roughly compatible with what I have anecdotally observed in various people, though my sample size is relatively small.
        
        I feel that I have personally been undergoing this kind of a drift: I originally had the generic consensus morality that one adopts by spending their childhood in a Western country, after which I began reading LW, which worked to select and reinforce my existing set of utilitarian intuitions—but had I not already been utilitarian-leaning, the utilitarian emphasis on LW might have led me to reject those claims and seek out a (say) more deontological influence. But as time has gone by, I have become increasingly aware of the fact that some of my strongest intuitions lean towards negative utilitarianism, whereas LW is more akin to classical utilitarianism. Reflecting upon various intuitions has led me to gradually reject various intuitions that I previously took to support classical rather than negative utilitarianism, thus causing me to move away from the general LW consensus. And since this process has caused some of the intuitions that previously supported a classical utilitarian position to lose their appeal, I expect that moving back towards CU is less likely than continued movement towards NU.
      - cousin_it 29 Aug 2013 13:32 UTC
        2 points
        Parent
        Seconding Kaj_Sotala’s question. Is there a good argument why self-improvement doesn’t have diverging paths due to small differences in starting conditions?
        hairyfigment 30 Aug 2013 0:47 UTC
        0 points
        Parent
        Dunno. CEV actually contains the phrase, “and had grown up farther together,” which the above leaves out. But I feel a little puzzled about the exact phrasing, which does not make “were more the people we wished we were” conditional on this other part—I thought the main point was that people “alone in a padded cell,” as Eliezer puts it there, can “wish they were” all sorts of Unfriendly entities.
    - Eugine_Nier 30 Aug 2013 1:53 UTC
      −1 points
      Parent
      That argument seems like it would apply equally well to non-moral beliefs.
  - Armok_GoB 30 Aug 2013 23:23 UTC
    0 points
    Parent
    I assume the same but instead of “all humans” the weaker “the people participating in this conversation”.
  - buybuydandavis 29 Aug 2013 10:33 UTC
    0 points
    Parent
    I don’t think even that’s a sufficient definition.
    
    It’s that all observers (except psychos), no matter their own particular circumstances and characteristics, would assign approval/disapproval in exactly the same way.
    - drethelin 29 Aug 2013 17:55 UTC
      0 points
      Parent
      Psychopaths are quite capable of perceiving objective truths. In fact if there was an objective morality I expect it would work better for psychopaths than for anyone else.
      - buybuydandavis 29 Aug 2013 21:56 UTC
        1 point
        Parent
        I believe Rolf has excommunicated psychopaths (and Clippy) from the set of agents from whom “human morality” is calculated.
        
        First they purged the psychopaths...
        
        Me, I don’t think everyone else converges to the same conclusions. All non psychopaths just aren’t all made out of the same moral cookie cutter. It’s not that we have to “figure out” what is right, it’s that we have different values. If casual observation doesn’t convince you of this, Haidt’s quantified approach should.
  - Lumifer 28 Aug 2013 18:49 UTC
    0 points
    Parent
    That’s one but not the only one possible definition of objective morality.
- Shmi 28 Aug 2013 18:00 UTC
  4 points
  Parent
  At least some of the prominent regulars seem to believe in objective morality outside of any FAI context, I think (Alicorn? palladias?).
- Vladimir_Nesov 29 Aug 2013 14:13 UTC
  3 points
  Parent
  The connotations of “objective” (also discussed in the other replies in this thread) don’t seem relevant to the question about the meaning of “correct” morality. Suppose we are considering a process of producing an idealized preference that gives different results for different people, and also nondeterministically gives one of many possible results for each person. Even in this case, the question of expected ranking of consequences of alternative actions according to this idealization process applied to someone can be asked.
  
  Should this complicated question be asked? If the idealization process is such that you expect it to produce a better ranking of outcomes than you can when given only a little time, then it’s better to base actions on what the idealization process could tell you than on your own guess (e.g. desires). To the extent your own guess deviates from your expectation of the idealization process, basing your actions on your guess (desires) is an incorrect decision.
  
  A standard example of an idealization dynamic is what you would yourself decide given much more time and resources. If you anticipate that the results of this dynamic can nondeterministically produce widely contradictory answers, this too will be taken into account by the dynamic itself, as the abstract you-with-more-time starts to contemplate the question. The resulting meta-question of whether taking the diverging future decisions into account produces worse decisions can be attacked in the same manner, etc. If done right, such process can reliably give a better result than you-with-little-time can, because any problem with it that you could anticipate will be taken into account.
  
  A hypothetical idealization dynamic may not be helpful in actually making decisions, but its theoretical role is that it provides a possible specification of the “territory” that moral reasoning should explore, a criterion of correctness. It is a hard-to-use criterion of correctness, you might need to build a FAI to actually access it, but at least it’s meaningful, and it illustrates the way in which many ways of thinking about morality are confused.
  
  (As an analogy, we might posit the problem of drawing an accurate map of the surface of Pluto. My argument amounts to pointing out that Pluto can be actually located in the world, even if we don’t have much information about the details of its surface, and won’t be able to access it without building spacecraft. Given that there is actual territory to the question of the surface of Pluto, many intuition-backed assertions about it can already be said to be incorrect (as antiprediction against something unfounded), even if there is no concrete knowledge about what the correct assertions are. “Subjectivity” may be translated as different people caring about surfaces of different celestial bodies, but all of them can be incorrect in their respective detailed/confident claims, because none of them have actually observed the imagery from spacecraft.)
  - Wei Dai 30 Aug 2013 6:28 UTC
    0 points
    Parent
    
    A hypothetical idealization dynamic may not be helpful in actually making decisions, but its theoretical role is that it provides a possible specification of the “territory” that moral reasoning should explore, a criterion of correctness.
    
    I think that such a specification probably isn’t the correct specification of the territory that moral reasoning should explore. By analogy, it’s like specifying the territory for mathematical reasoning based on idealizing human mathematical reasoning, or specifying the territory for scientific reasoning based on idealizing human scientific reasoning. (As opposed to figuring out how to directly refer to some external reality.) It seems like a step that’s generally tempting to take when you’re able to informally reason (to some extent) about something but you don’t know how to specify the territory, but I would prefer to just say that we don’t know how to specify the territory yet. But...
    
    It is a hard-to-use criterion of correctness, you might need to build a FAI to actually access it, but at least it’s meaningful, and it illustrates the way in which many ways of thinking about morality are confused.
    
    Maybe I’m underestimating the utility of having a specification that’s “at least meaningful” even if it’s not necessarily correct. (I don’t mind “hard-to-use” so much.) Can you give some examples of how it illustrates the way in which many ways of thinking about morality are confused?
- Discredited 28 Aug 2013 20:16 UTC
  1 point
  Parent
  I came to the metaethics sequence an ethical subjectivist and walked away an ethical naturalist. I’ve mostly stopped using the words “objective” and “subjective”, because I’ve talked with subjectivists with whom I have few to no substantive disagreements. But I think you and I do have a disagreement! How exciting.
  
  I accept that there’s something like an ordering over universe configurations which is “ideal” in a sense I will expand on later, and that human desirability judgements are evidence about the structure of that ordering, and that arguments between humans (especially about the desirability of of outcomes or the praiseworthiness of actions) are often an investigation into the structure of that ordering, much as an epistemic argument between agents (especially about true states of physical systems or the truth value of mathematical propositions) investigates the structure of a common reality which influences the agents’ beliefs.
  
  A certain ordering over universe configurations also influences human preferences. It is not a causal influence, but a logical one. The connection between human minds and morality, the ideal ordering over universe configurations, is in the design of our brains. Our brains instantiate algorithms, especially emotional responses, that are logically correlated with the computation that compresses the ideal ordering over universe configurations.
  
  Actually, our brains are logically correlated with the computations that compress multiple different orderings over universe configurations, which is part of the reason we have moral disagreements. We’re not sure which valuation—which configuration-ordering that determines how our consequential behaviors change in response to different evidence—which valuation is our logical antecedent and which are merely correlates. This is also why constructed agents similar to humans, like the ones in Three Worlds Collide, could seem to have moral disagreements with humans. They, as roughly consequentialist agents, would also be logically influenced by an ordering over universe configurations, and because of similar evolutionary pressures might also developed emotion-type algorithms. The computations compressing the different orderings, morality versus “coherentized alien endorsement relation” would be logically correlated, would be partially conditionally compressed by knowing the value of simpler computations that were common between the two. Through these commonalities the two species could have moral disagreements. But there would be other aspects of the computations that compress their orderings, logical factors that would influence one species but not the other. These would naively appear as moral disagreements, but would simply be mistaken communication: exchanging evidence about different referents while thinking they were the same.
  
  But there are other sources of valuation-disagreement than being separate optimization processes. Some sources of moral disagreement between humans: We have only partial information about morality, just as we can be partially ignorant about the state of reality. For example, we might be unsure what long term effects to society would accompany the adoption of some practice like industrial manufacturing. Or even if someone in the pre-industrial era had perfect foresight, they might be unsure of how their expressed preferences toward that society would change with more exposure it. There are raw computational difficulties (unrelated to prediction of consequences) in figuring out which ordering best fits our morality-evidence, since the space of orderings over universe configurations is large. There are still more complicated issues with model selection because human preferences aren’t fully self endorsing.
  
  Anyway, I’ve been using the word “ideal” a lot as though multiple people share a single ideal, and it’s past time I explained why. Humans share a ton of neural machinery and have a spatially concentrated origins, both of which mean closer logical-causal influences to their roughly-consequential reasoning. We have so much in common that saying “Pah, nothing is right. It’s all just subjective preferences and we’re very different people and what’s right for you is different from what’s right for me” seems to me like irresponsible ignorance. We’ve got like friggin’ hundreds of identical functional regions in our brains. We can exploit that for fun and profit. We can use interpersonal communication and argumentation and living together and probably other things to figure out morality. I see no reason to be dismissive of others’ values that we don’t sympathize with simply because there’s no shiny morality-object that “objectively exists” and has a wire leading into all our brains or whatever. Blur those tiniest of differences and it’s a common ideal. And that commonality is important enough that “moral realism” is a badge worth carrying on my identity.
- Ishaan 26 Nov 2013 7:58 UTC
  0 points
  Parent
  People are often wrong about what their preferences are + most humans have roughly similar moral hardware. Not identical, but close enough to behave as if we all share a common moral instinct.
  
  When you make someone an argument and they change their mind on a moral issue, you haven’t changed their underlying preferences...you’ve simply given them insight as to what their true preferences are.
  
  For example, if a neurotypical human said that belief in God was the reason they don’t go around looting and stealing, they’d be wrong about themselves as a matter of simple fact.
  
  -as per the definition of preference that I think makes the most sense.
  
  -Alternatively, you might actually be re-programming their preferences...I think it’s fair to say that at least some preferences commonly called “moral” are largely culturally programmed.
- Armok_GoB 30 Aug 2013 23:21 UTC
  0 points
  Parent
  I just assumed it meant “My extrapolated volition” and also “your extrapolated volition” and also the implication those are identical.
- Wei Dai 29 Aug 2013 1:01 UTC
  0 points
  Parent
  I wrote a post to try to answer this question. I talk about “should” in the post, but it applies to “correct” as well.
- Eugine_Nier 30 Aug 2013 2:24 UTC
  −2 points
  Parent
  Here is a decent discussion of objective morality.
- Lumifer 28 Aug 2013 17:45 UTC
  −4 points
  Parent
  
  What’s up with that?
  
  The usual Typical Mind Fallacy which is really REALLY pervasive.