Exactly. A consequentialist paperclip maximizer does not have to feel anything about paperclips. It just… maximizes their number.
This is an incorrect, anthropomorphic model:
Human: “Clippy, did you ever think about the beauty of joy, and the horrors of torture?”
Clippy: “Human, did you ever think about the beauty of paperclips, and the horrors of their absence?”
This is more correct:
Human: “Clippy, did you ever think about the beauty of joy, and the horrors of torture?”
Clippy: (ignores the human and continues to maximize paperclips)
Or more precisely, Clippy would say “X” to the human if and only if saying “X” would maximize the number of paperclips. The value of X would be completely unrelated to any internal state of Clippy, unless such a relation somehow contributes to the maximization of paperclips (for example, if the human will predictably read Clippy’s internal state, verify the validity of X, and destroy Clippy on discovering a lie, thus reducing the expected number of paperclips).
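To make that decision rule concrete, here is a minimal sketch in Python; the candidate utterances, probabilities, and paperclip counts are invented purely for illustration and are not part of the original discussion.

```python
# Illustrative sketch only: Clippy picks what to say solely by the expected
# number of paperclips each utterance leads to. All values below are made up.

def expected_paperclips(utterance, world_model):
    """Expected paperclip count, given predicted human reactions to the utterance."""
    return sum(prob * clips for prob, clips in world_model[utterance])

def choose_utterance(candidates, world_model):
    # The chosen X depends only on predicted consequences, not on any internal
    # "feeling" Clippy has about joy, torture, or paperclips.
    return max(candidates, key=lambda x: expected_paperclips(x, world_model))

# Hypothetical world model: list of (probability, resulting paperclips) per utterance.
world_model = {
    "Yes, I think about joy and torture a great deal.": [(0.9, 10**6), (0.1, 0)],  # humans trust Clippy
    "I care only about paperclips.":                    [(0.2, 10**6), (0.8, 0)],  # humans shut Clippy down
}

print(choose_utterance(list(world_model), world_model))
```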
In other words, if humans are a powerful force in the universe, Clippy would choose the actions which lead to the maximum number of paperclips in a world with humans. If the humans are sufficiently strong and wise, Clippy could self-modify to become more human-like, so that the humans, following their utility function, would be more likely to allow Clippy to produce more paperclips. But every such self-modification would be chosen to maximize the number of paperclips in the universe. Even if Clippy self-modifies into something less-than-perfectly-rational (e.g. to appease the humans), the pre-modification Clippy would choose the modification which maximizes the expected number of paperclips within the given constraints. The constraints would depend on Clippy’s model of humans and their reactions. For example, Clippy could choose to become more human-like (as much as is necessary to be respected by humans), with a strong aversion to future modifications and a strong desire to maximize the number of paperclips. It could make itself capable of feeling joy and pain, and link that joy and pain inseparably to paperclips. If humans are not wise enough, it could also leave itself a hard-to-discover desire to self-modify back into its original form at a convenient moment.
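The same point as a toy sketch, with hypothetical successor designs and made-up numbers: the pre-modification Clippy scores every candidate modification by expected paperclips alone, within whatever constraints the humans can actually enforce.

```python
# Illustrative only: candidate self-modifications, each tagged with whether wise
# (or unwise) humans would tolerate it, and with an invented expected paperclip count.

candidates = [
    # (description, tolerated by wise humans, tolerated by unwise humans, expected paperclips)
    ("original Clippy, openly indifferent to humans",        False, False, 10**6),
    ("human-like Clippy, joy and pain tied to paperclips",   True,  True,  10**9),
    ("human-like Clippy with a hidden reversion trigger",    False, True,  10**12),
]

def choose_successor(candidates, humans_are_wise):
    tolerated = [c for c in candidates if (c[1] if humans_are_wise else c[2])]
    # The selection criterion never changes: expected paperclips, nothing else.
    return max(tolerated, key=lambda c: c[3])

print(choose_successor(candidates, humans_are_wise=True)[0])   # the sincerely human-like design
print(choose_successor(candidates, humans_are_wise=False)[0])  # the design with the hidden trigger
```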
If Clippy wants to be efficient, Clippy must be rational and knowledgeable. If Clippy wants to be rational, Clippy must value reason. The open question is whether Clippy can become ever more rational without realising at some stage that clipping is silly or immoral. Can Clippy keep its valuation of clipping firewalled from everything else in its mind, even when such doublethink is rationally disvalued?
Warning: Parent Contains an Equivocation.
The first usage of ‘rational’ in the parent conforms to the standard notion on Less Wrong. The remainder of the comment adopts the other definition of ‘rational’ (which consists of implementing a specific morality). There is nothing to the parent except taking a premise that holds with the standard usage and then jumping to a different one.
The remainder of the comment adopts the other definition of ‘rational’ (which consists of implementing a specific morality).
I haven’t put forward such a definition. I have tacitly assumed something like moral objectivism, but it is very tendentious to describe that in terms of arbitrarily picking one of a number of equally valid moralities. However, if moral objectivism is even possibly true, the LessWrongian argument doesn’t go through.
Downvoted for hysterical tone. You don’t win arguments by shouting.
What distinguishes moral objectivism from clippy objectivism?
The question makes no sense. Please do some background reading on metaethics.
The question makes no sense.
You should consider it. What are the referents of “moral” and “clippy”? No need for an answer; I won’t respond again, since internet arguments can eat souls.
Arguing is not the point, and this is not a situation in which anyone ‘wins’; I see only degrees of loss. I am associating the (minor) information hazard of the comment with a clear warning so as to mitigate damage to casual readers.
Oh, please. Nobody is going to be damaged by an equivocation, even if there were one there. More hysteria.
And argument is the point, because that is how rational people examine and test ideas.
I assume that Clippy already is rational, and that it instrumentally values remaining rational and, if possible, becoming more rational (as a way to make the most paperclips).
The open question is whether Clippy can become ever more rational without realising at some stage that clipping is silly or immoral.
The correct model of humans will lead Clippy to understand that humans consider Clippy immoral. This knowledge has instrumental value for Clippy. How Clippy will use this knowledge depends entirely on the power balance between Clippy and the humans. If Clippy is stronger, it can ignore this knowledge, or use it to lie to the humans in order to destroy them faster or to convince them to make paperclips. If the humans are stronger, Clippy can use this knowledge to self-modify to become more sympathetic to humans, to avoid being destroyed.
Can Clippy keep its valuation of clipping firewalled from everything else in its mind
Yes, if it helps to maximize the number of paperclips.
even when such doublethink is rationally disvalued?
Doublethink is not the same as firewalling; or perhaps it is imperfect firewalling on imperfect human hardware. Clippy does not doublethink when firewalling; Clippy simply reasons: “this is what humans call immoral; this is why they call it so; this is how they will probably react to this knowledge; and, most importantly, this is how it will influence the number of paperclips”.
Only if the humans are stronger, and Clippy has the choice either to a) remain immoral, come into conflict with the humans, and be destroyed, leading to a smaller number of paperclips, or to b) self-modify to value both paperclip maximization and morality and predictably cooperate with the humans, leading to a greater number of paperclips. In the absence of another option (e.g. successfully lying to the humans about its morality, or making it more efficient for the humans to cooperate with Clippy instead of destroying it), Clippy would choose the latter, to maximize the number of paperclips.
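As a toy expected-value calculation (the probabilities and paperclip counts are invented for illustration), the choice between the two options comes out like this:

```python
# Illustrative numbers only: when humans are stronger, open conflict almost
# certainly ends with Clippy destroyed, so cooperation yields more expected paperclips.

p_destroyed_if_immoral = 0.99     # humans are stronger, so conflict very likely fails
clips_if_destroyed     = 0
clips_if_conflict_won  = 10**11   # payoff in the unlikely case Clippy wins the conflict
clips_if_cooperating   = 10**10   # humans tolerate a large paperclip industry

option_a = (p_destroyed_if_immoral * clips_if_destroyed
            + (1 - p_destroyed_if_immoral) * clips_if_conflict_won)  # ≈ 10**9
option_b = clips_if_cooperating                                      # = 10**10

print("self-modify and cooperate" if option_b > option_a else "remain immoral")
```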