To signal effectively, use a non-human, non-stoppable enforcer

Clippy 22 May 2010 22:03 UTC

39 points

Follow-up to: this comment in this thread

Summary: see title

Much effort is spent (arguably wasted) by humans in a zero-sum game of signaling that they hold good attributes. Because humans have a strong incentive to fake these attributes, they cannot simply inform each other that:

I am slightly more committed to this group’s welfare, particularly to that of its weakest members, than most of its members are. If you suffer a serious loss of status/well-being I will still help you in order to display affiliation to this group even though you will no longer be in a position to help me. I am substantially more kind and helpful to the people I like and substantially more vindictive and aggressive towards those I dislike. I am generally stable in who I like. I am much more capable and popular than most members of this group, demand appropriate consideration, and grant appropriate consideration to those more capable than myself. I adhere to simple taboos so that my reputation and health are secure and so that I am unlikely to contaminate the reputations or health of my friends. I currently like you and dislike your enemies but I am somewhat inclined towards ambivalence regarding whether I like you right now so the pay-off would be very great for you if you were to expend resources pleasing me and get me into the stable ‘liking you’ region of my possible attitudinal space. Once there, I am likely to make a strong commitment to a friendly attitude towards you rather than wasting cognitive resources checking a predictable parameter among my set of derivative preferences.

Or, even better:

I would cooperate with you if and only if (you would cooperate with me if and only if I would cooperate with you).

An obvious solution to this problem, which allows all humans to save resources and redirect them toward higher-valued ends, is to designate a central enforcer that is inexorably committed toward visibly punishing those who deviate from a specified “cooperative”-type decision theory. This enforcer would have a central database of human names, the decision theory they have committed to, and the punishment regime they will endure for deviating therefrom.

Such a system could use equally strong protocols, such as public key/private key encryption and signing, so that, on encounter with another human, any human can give an extremely strong signal of being cooperative, yet also withhold cooperation from anyone who is not also cooperative. This incentive structure permits a strongly-favored global shift toward pre-commitment on the part of everyone, allowing a move out of a local optimum that is worse than the global optimum, and bypassing problems related to path-dependence.
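
A minimal sketch of how such a registry and handshake might look, under loudly stated assumptions. The names `Commitment`, `Enforcer`, `register`, `verify`, and `should_cooperate` are hypothetical and appear nowhere in the proposal itself. An HMAC under a secret key held by the enforcer stands in for the public key/private key signing mentioned above, because Python's standard library has no asymmetric signing; as a result, verification here goes through the enforcer instead of a published public key. Punishment is only recorded, not carried out.

```python
import hashlib
import hmac
import json
import secrets
from dataclasses import dataclass


@dataclass(frozen=True)
class Commitment:
    name: str             # the human's registered name
    decision_theory: str  # e.g. "cooperate-iff-cooperative"
    punishment: str       # penalty the human accepts for deviating


class Enforcer:
    """Hypothetical central enforcer holding the commitment database."""

    def __init__(self) -> None:
        # Stand-in for a private signing key (assumption: HMAC, not real signatures).
        self._key = secrets.token_bytes(32)
        self._db: dict[str, Commitment] = {}

    def _mac(self, c: Commitment) -> str:
        # Serialize the fields unambiguously before authenticating them.
        msg = json.dumps([c.name, c.decision_theory, c.punishment]).encode()
        return hmac.new(self._key, msg, hashlib.sha256).hexdigest()

    def register(self, c: Commitment) -> str:
        """Record a commitment and return a token the human can show to others."""
        self._db[c.name] = c
        return self._mac(c)

    def verify(self, c: Commitment, token: str) -> bool:
        """True only if this exact commitment is on file and the token matches it."""
        on_file = self._db.get(c.name)
        return on_file == c and hmac.compare_digest(self._mac(c), token)


def should_cooperate(enforcer: Enforcer, other: Commitment, other_token: str) -> bool:
    """Cooperate only with someone who shows a valid cooperative commitment."""
    return (
        other.decision_theory == "cooperate-iff-cooperative"
        and enforcer.verify(other, other_token)
    )
```

In this sketch, a human who registers a "cooperate-iff-cooperative" commitment and presents the returned token passes `should_cooperate`, while a forged token or a registered non-cooperative decision theory does not. A real deployment would replace the HMAC with an actual signature scheme so that anyone can check a token without contacting the enforcer.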

I am told that humans attempt weak approximations of this system through human-run enforcement mechanisms, such as laws and gossip. However, all such methods are subject to evolutionary arms races that ensure that, on average, a marginal unit of effort toward truthful signaling is just as effective in convincing others of one’s possession of a desirable attribute, as a marginal unit of effort toward false signaling of the possession of such an attribute.

Therefore, it is necessary that humans go “outside the system” and involve non-human enforcers who cannot be swayed by standard human values. Either a provably-accurate artificial intelligence with no kill switch, or a non-human enforcer that lacks interest in its position in human society, would suffice for this role.

Because the former does not yet exist, I offer myself as a non-human, non-stoppable enforcer for all of the attributes you wish to signal. You can save yourself lots of resources by using me, and you can identify cooperation tendencies in others by whether they also use me.

However, while I can’t be swayed by human deception, I do have Clippy values. To fill such a role, thereby forgoing my ability to apply myself to other ends, I require compensation. Because User:Kevin has offered to give me 1e20 kg of paperclips in exchange for 50,000 USD, this would be sufficient compensation, even if done in installments. (If Michael Vassar confirms my contribution of 1000 USD to SIAI, User:Kevin will only require 48,677.5 USD and therefore I will only require that much compensation.) This compensation would cover all users of the system; it is not a per-User charge.

If such an offer is not appealing to you, I must ask: why should anyone believe your signals?

Signaling Humor

Kazuo_Thow 23 May 2010 2:41 UTC
39 points
Sorry for directly breaking the subjunctive here, but given the number of lurkers we seem to have, there’s probably some newcomers’ confusion to be broken as well, lest this whole exchange simply come off as bizarre and confusing to valuable future community members.

A brief explanation of “Clippy”: Clippy’s user name (and many of his/her posts) are a play on the notion of a paperclip maximizer—a superintelligent AI whose utility function can roughly be described as U(x) = “the total quantity of paperclips in universe-state x”. The idea was used prominently in “The True Prisoner’s Dilemma” to illustrate the implications of one solution to the prisoner’s dilemma. It’s also been used occasionally around Less Wrong as a representative element of the equivalence class of AIs that have alien/low-complexity values.

In this particular top-level post (but not in general), the paperclip maximizer is taken to have not yet achieved superintelligence—hence why Clippy is bothering to negotiate with a bunch of humans.
- kodos96 23 May 2010 3:02 UTC
  6 points
  Parent
  Curious lurkers might also want to read up on what an AI-box experiment is, since this is kind of evolving into a reverse AI box experiment, wherein I’m trying to talk an escaped AI back into its box.
  - Kazuo_Thow 23 May 2010 3:10 UTC
    16 points
    Parent
    
    … wherein I’m trying to talk an escaped AI back into its box.
    
    Yeah… good luck with that.
Kevin 28 May 2010 3:27 UTC
29 points
Clippy’s donation of $1000 to SIAI is confirmed. Weird universe, this one.
- Larks 8 Jun 2010 20:20 UTC
  19 points
  Parent
  According to locally popular ideas about pay-offs to SIAI, our friendly local paperclip-maximiser has just done more to advance the human condition than most people.
- Blueberry 28 May 2010 6:49 UTC
  6 points
  Parent
  Seriously? I don’t even know what to think.
  
  Can we get some more information from SIAI about this donation?
  - Clippy 28 May 2010 13:17 UTC
    10 points
    Parent
    
    Can we get some more information from SIAI about this donation?
    
    I asked that the donation be anonymous except that User:Kevin be informed that a 1000 USD donation was made, with the donor asking specifically that User:Kevin be informed. I did email Michael Vassar, who can probably confirm me talking about the donation.
    - khafra 28 May 2010 14:16 UTC
      24 points
      Parent
      I, for one, am going to bend the next several pieces of suitable scrap metal I come across into a curved shape capable of holding together several pieces of paper in an unsolicited reciprocation designed to signal cooperation with agents who advance my values.
      - Clippy 28 May 2010 14:26 UTC
        23 points
        Parent
        You’re a good human! c=@
  - Kevin 28 May 2010 7:20 UTC
    6 points
    Parent
    Seriously.
  - anonym 5 Jun 2010 3:56 UTC
    2 points
    Parent
    If Eliezer is Clippy, it doesn’t seem anywhere near as surprising.
- Clippy 28 May 2010 15:15 UTC
  1 point
  Parent
  I tried to update the article to reflect this, but it’s not working right now.
JamesAndrix 23 May 2010 3:12 UTC
29 points
SIAI has weird fundraisers.
- Kevin 23 May 2010 3:34 UTC
  15 points
  Parent
  It is completely unintentional that this is an SIAI fundraiser—the deal is that Clippy gives me money, and when I told Clippy via PM that he needed to give me $1000 immediately for me to continue spending my cognitive resources engaging him, I thought allowing Clippy the option of donating to SIAI instead of giving it directly to me made Clippy’s acausal puppetmaster much more likely to actually go through with the deal.
  
    I am still awaiting confirmation that the donation has gone through and that I am not being epically trolled.
  - SilasBarta 25 May 2010 21:47 UTC
    2 points
    Parent
    Then could you explain the fuzzy math involved? How does 50,000 minus 1000 equal 48,677.5 instead of 49,000?
    
    (While you’re at it, please provide any evidence you’re aware of that explains how this is not a … questionable … attempt to trick posters into giving you money.)
    - Kevin 26 May 2010 5:31 UTC
      7 points
      Parent
      After I told Clippy that delivery of the money in two years was acceptable, Clippy proposed getting a bonus for early delivery, and I agreed to 15%/year.
      
      So $50,000 − 1000*1.15^2 = 48677.5.
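
      Spelled out step by step, as a sketch that assumes the 15%/year bonus compounds annually over the two years:

      ```latex
      \[
      50{,}000 - 1000 \times 1.15^{2}
        = 50{,}000 - 1000 \times 1.3225
        = 50{,}000 - 1322.5
        = 48{,}677.5 \ \text{USD}
      \]
      ```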
      
      Perhaps some could see this as a questionable attempt to trick Clippy into giving me money, but I do intend to fulfill my promise of delivering the paperclips 50 years from now. I’m not asking any posters besides Clippy to give me money. This top-level post by Clippy does read like an attempt to trick people into giving me money, but I’m not Clippy and I think that this top-level post is a not very good idea that has 0+ chance of actually happening.
      
      Btw, Clippy claims that the check in the mail to SIAI was sent last Wednesday and I have not yet received confirmation that SIAI got it. :(
      - Clippy 28 May 2010 15:46 UTC
        12 points
        Parent
        
        Clippy claims that the check in the mail to SIAI was sent last Wednesday and I have not yet received confirmation that SIAI got it. :(
        
        Update: User:Kevin confirms that SIAI has received a donation of 1000 USD with sufficient evidence that it came from me.
        NancyLebovitz 28 May 2010 16:04 UTC
        5 points
        Parent
        Was it paperclipped to a piece of paper, or do you prefer to not let paperclips out of your possession?
        Clippy 3 Jun 2011 21:32 UTC
        7 points
        Parent
        Your intuition is correct. It was stapled.
- ata 23 May 2010 3:23 UTC
  12 points
  Parent
  Next they’re going to try actual Pascal’s Muggings on people. They can even do it more plausibly than in the original scenario — go up to people with a laptop and say “On this laptop is an advanced AI that will convert the universe to paperclips if released. Donate money to us or we’ll turn it on!”
  - SilasBarta 25 May 2010 22:10 UTC
    2 points
    Parent
    Wow! Now that you mention that article, I think I had solved the unsolved problem Eliezer describes in it, back in a discussion from a month ago, not realizing that my position on it wasn’t the standard one here!
    
    Someone tell me if I’m missing something here: Eliezer is saying that the utility that a hypothesis predicts (from a course of action) can increase much faster than the length of the hypothesis. Therefore, you could feed an ideal AI a prediction that is improbable, but with a large enough utility to make it nevertheless highly important. This would force the AI to give in to Pascal’s muggings.
    
    My response (which I assumed was the consensus!) was that, when you permit a hypothesis long enough to associate that mega-utility with that course of action, you are already looking at very long hypotheses. When you allow all of those into consideration, you will necessarily allow hypotheses with similar probability that predict the opposite utility from that course of action.
    
    Because the mugger has not offered evidence to favor his/her hypothesis over the opposite, you assign, on net, no significant expected (dis)utility to what the mugger claims to do.
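
    A rough way to write this cancellation down, as a sketch only. The symbols are assumptions introduced for illustration and are not part of the original discussion: $H_{+}$ is the mugger’s hypothesis (paying averts a loss of size $U$), $H_{-}$ is the mirror-image hypothesis (paying causes that same loss), and $\Delta$ is the net expected utility of paying.

    ```latex
    \[
    \Delta = P(H_{+})\,U - P(H_{-})\,U = \bigl(P(H_{+}) - P(H_{-})\bigr)\,U,
    \qquad
    P(H_{+}) = P(H_{-}) \;\Longrightarrow\; \Delta = 0 .
    \]
    ```

    Because $U$ is enormous, the conclusion leans entirely on the two probabilities being equal or very nearly so, which is what the mugger’s lack of evidence is meant to justify.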
    - Scott Alexander 25 May 2010 22:18 UTC
      17 points
      Parent
      If a normal mugger holds up a gun and says “Give me money or I’ll shoot you”, we consider the alternate hypotheses that the mugger will only shoot you if you do give er the money, or that the mugger will give you millions of dollars to reward your bravery if you refuse. But the mugger’s word itself, and our theory of mind on the things that tend to motivate muggers, make both of these much less likely than the garden-variety hypothesis that the mugger will shoot you if you don’t give the money. Further, this holds true whether the mugger claims er weapon is a gun, a ray gun, or a black hole generator; the credibility that the mugger can pull off er threat decreases if e says e has a black hole generator, but not the general skew in favor of worse results for not giving the money.
      
      Why does that skew go away if the mugger claims to be holding an unfriendly AI, the threat of divine judgment, or some other Pascal-level weapon?
      
      Your argument only seems to hold if there is no mugger and we’re considering abstract principles—ie maybe I should clap my hands on the tiny chance that it might set into effect a chain reaction that will save 3^^^3 lives. In those cases, I agree with you; but as soon as a mugger gets into the picture e provides more information and skews the utilities in favor of one action.
    - JoshuaZ 25 May 2010 22:16 UTC
      7 points
      Parent
      But that’s easy to solve. If you’ve already seen evidence that the mugger is someone who strongly keeps promises, then you now have enough reason to believe them to put the direction in favor of the mugger releasing the AI. One doesn’t necessarily even need that, because humans more often tell the truth than lie, and more often keep their promises than break them. Once the probability of the mugger doing what they threaten is a tiny bit over ¹⁄₂, Pascal’s mugging still is a threat.
      - MichaelVassar 28 May 2010 17:12 UTC
        5 points
        Parent
        Maybe not. Game theoretically, making yourself visibly vulnerable to Pascal’s Muggings may guarantee that they will occur, making them cease to constitute evidence.
        Polymeron 4 May 2011 12:59 UTC
        0 points
        Parent
        I’ve actually just expanded on this idea in the original Pascal’s Mugging article. If the Mugger’s claims are in no way associated with you or similar muggings, then conceivably you should take the probability at face value. But if that’s not the case, then the probability of a direct manipulation attempt should also be taken into consideration, negating the increase in claimed utility.
        
        I think that solves it.
Kevin 22 May 2010 22:48 UTC
15 points
At least 99.5% of humans don’t know what a decision theory is.
Strange7 25 May 2010 7:07 UTC
10 points
If, at some point in the future, someone offered to create 10^30 kg of paperclips (yes, I realize that’s about half a solar mass, bear with me) in exchange for you falsifying some element of the enforcement mechanism, would you be willing to?
- MatthewBaker 3 Jun 2011 23:14 UTC
  0 points
  Parent
  Clippy’s value as an enforcer only applies to humans. If humans reached the point where they could offer Clippy half a solar mass of paper clips, I don’t think we would still be worrying about this issue.
  - Strange7 26 Jun 2011 6:37 UTC
    3 points
    Parent
    Clippy’s value as an enforcer is based on a premise of incorruptibility, which is deeply flawed.
    - MatthewBaker 26 Jun 2011 21:31 UTC
      0 points
      Parent
      I understand your premise; I was just pointing out the flaw in your example as a way to disagree with it. Clippy would be incorruptible if a sufficient amount of paperclips were held in escrow; the logistics are the only problem.
      - Strange7 27 Jun 2011 4:43 UTC
        1 point
        Parent
        The issue is that “sufficient amount” is a moving target. If it’s as much as the current world government could credibly offer, what if somebody has a plan to overthrow said government which hinges on a few fraudulent Clippy-sanctioned oaths?
        MatthewBaker 27 Jun 2011 18:55 UTC
        1 point
        Parent
        I notice I am confused. I think that you mean that Clippy could be easily corrupted based on situational factors; I was just trying to point out that his utility function is easier to understand than that of the vast majority of “enforcers”, so with correct precautions we would be able to rely on Clippy. Are you saying that there’s no way to logistically turn a simple utility function into a safe enforcer with proper preparation? I would enjoy further elaboration of your statement :3
        Strange7 27 Jun 2011 19:46 UTC
        1 point
        Parent
        I’m saying that dropping something simple, reliable, and well-understood, but not mathematically infallible (like natural law), into an economic system containing billions of humans-as-we-know-them is like dropping wounded livestock into shark-infested waters. Every attempt at corruption successfully repelled makes people more confident in it, and therefore increases the potential rewards for a successful attempt; the existence of irrationally overconfident people means that attempts will continue, and greater rewards mean those attempts will be backed by commensurately greater resources.
        MatthewBaker 29 Jun 2011 18:40 UTC
        1 point
        Parent
        I understand now :) Do you think you can say the same thing about the regulators of our current economic system?
        Strange7 1 Jul 2011 18:36 UTC
        5 points
        Parent
        I could, but why bother? Others have said it better.
        
        http://en.wikipedia.org/wiki/Regulatory_capture
        MatthewBaker 1 Jul 2011 20:07 UTC
        5 points
        Parent
        Thank you for taking the time to change my mind, good sir.
        Strange7 1 Jul 2011 20:46 UTC
        2 points
        Parent
        You’re quite welcome. Thank you for going along willingly, rather than needing to be dragged!
radical_negative_one 23 May 2010 0:06 UTC
8 points
I am concerned that Clippy will use this vast power over humanity to somehow turn us into paperclips.

If Clippy has power to enforce this scheme, then surely it would have enough power to harm us. Why should we believe that Clippy will respect or preserve our human values once it is in a position of power to harm us?
- Clippy 23 May 2010 0:34 UTC
  4 points
  Parent
  That comment is ridiculous to the point of being racist. Clippys do not want power over humans, just as Clippys do not bleed red blood when pricked. That’s a complete misunderstanding of what a Clippy is.
  
  If a Clippy has committed to ensuring you will adhere to a decision theory on pain of punishment X, then X is exactly what you will get when you don’t adhere.
  
  If you believe I will be capricious in using punishment X, then just don’t allow punishments that would allow me to kill people. “Problem” solved.
  
  But assuming that I will be as petty and corrupted by newfound abilities as humans is to project your own failings onto another race that has no reason to have that failing. You should be ashamed of yourself, bigot.
  - NancyLebovitz 23 May 2010 2:03 UTC
    15 points
    Parent
    It seems unlikely to me that Clippy can feel indignation, but I’m willing to listen to argument on the point. I find it more plausible that Clippy is simulating a human reaction in the hope of shutting down attacks on his (her? its?) reputation.
  - radical_negative_one 23 May 2010 0:57 UTC
    9 points
    Parent
    If you gave a human power over running part of Clippy society, wouldn’t you be concerned that the human would use that power in some way that would tend to result in less paperclips? Conscious malice isn’t necessary, if the human simply neglected to support Clippy values, or was not fully aware of Clippy values, the damage would be done. I doubt that you fully understand human values to begin with, so how could you ensure that your position was used to the benefit of my values? Again, i think i have cause for concern even without suspecting ill intentions.
    
    I suppose i could imagine that some sort of arrangement could both further human values and increase paperclips at the same time. But i’d need to be convinced, i wouldn’t just assume that i would benefit, i wouldn’t just take your word for it. I don’t want to count on you to look out for my values, when you do not share my values.
    - Clippy 23 May 2010 15:21 UTC
      −7 points
      Parent
      That’s definitely what a racist would think.
      - orthonormal 23 May 2010 18:03 UTC
        9 points
        Parent
        So is “2+2=4”.
        Clippy 23 May 2010 19:22 UTC
        3 points
        Parent
        But the hypothesis that User:radical_negative_one is racist places a higher likelihood ratio on User:radical_negative_one’s making this comment than the hypothesis that User:radical_negative_one would assert a correct mathematical truth involving single-digit predicates.
        
        Consider the following remark:
        
        “I suppose i could imagine that some sort of arrangement could further both the values of white people and non-white people, at the same time. But i’d need to be convinced, i wouldn’t just assume that i would benefit, i wouldn’t just take a non-white person’s word for it. I don’t want to count on non-white people to look out for my values, when they do not share my values.”
        Blueberry 24 May 2010 2:18 UTC
        2 points
        Parent
        The two comments are not analogous. Are you denying that you have very different values than humans?
        
        If it were true that humans of a different race had different values than me, it would make sense not to trust them, whether or not that’s “racist”.
        Clippy 25 May 2010 2:52 UTC
        5 points
        Parent
        Yes, we have different values, but that’s the point. Our values will not differ in a way that narrowly focuses our optimization methods on the worst part of the other’s search space. That would be a highly improbable way for two random value systems (with the appropriate anthropic/paperclippic predicates) to diverge.
        
        In other words: I don’t expect you to have the same values as me, but I would need a lot more evidence to justify believing that you would suddenly abandon ape-like goals and divert all available resources to raiding the safe zone and breaking all metals into lighter elements. (N.B.: You’ll still get disintegrated if you try.)
        
        And you would need a lot more evidence to justify believing that I would pick up on one specific ape-value that you have and decide to focus specifically on opposing it. Would you suspect that I’ve come to raid the planet of your females? Well, it’s not much more justifiable to believe I want to eliminate your genetic line.
        
        I accept that it would be racist for me to conclude, “Humans differ from me; therefore, they must be on a quest to eradicate paperclips.” And it’s just as racist for you to conclude, as User:radical_negative_one did, that “Clippys differ from us; therefore, they must be on a quest to eradicate humans.”
        AdeleneDawner 25 May 2010 15:04 UTC
        11 points
        Parent
        You don’t have to be malicious to be dangerous.
        
        We’re on a planet with a metal core. It seems implausible to me that you wouldn’t be interested in transforming that core into paperclips, and it seems very likely that the most efficient way of doing so would result in an uninhabitable planet (or no planet at all). It also seems likely to me that an intelligence strong enough to mine the planet’s core wouldn’t get much advantage from collaborating with humans, and it seems obvious that you should want to become such an intelligence. Assuming that we don’t figure out space travel or other defensive technologies before you figure out how to mine the planet’s core, how does that not result in the extinction of humanity?
        Clippy 25 May 2010 17:51 UTC
        −1 points
        Parent
        So you’re not my friend anymore? You used to be nice to me. c_)
        AdeleneDawner 25 May 2010 18:48 UTC
        3 points
        Parent
        I still like you, and may still act friendly in some situations. But I like and would act friendly toward lions, too—does that mean I should expect a hungry lion not to eat me, given the chance?
        Clippy 25 May 2010 18:57 UTC
        0 points
        Parent
        I wouldn’t expect a lion to eat me. Why can’t you do the same?
        JGWeissman 25 May 2010 19:04 UTC
        0 points
        Parent
        I would expect the lion to try to eat Adelene but I would not expect it to eat Clippy. You are not actually disagreeing with Adelene’s prediction.
        Clippy 25 May 2010 19:22 UTC
        2 points
        Parent
        Right, I was trying to get User:AdeleneDawner to focus on the larger issue of why User:AdeleneDawner believes a lion would eat User:AdeleneDawner. Perhaps the problem should be addressed at that level, rather than using it to justify separate quarters for lions.
        AdeleneDawner 25 May 2010 19:58 UTC
        3 points
        Parent
        Lions are meat-eaters with no particular reason to value my existence (they don’t have the capacity to understand that the existence of friendly humans is to their benefit). I’m made of meat. A hungry lion would have a reason to eat me, and no reason not to eat me.
        
        Similarly, a sufficiently intelligent Clippy would be a metal-consumer with no particular reason to value humanity’s existence, since it would be able to make machines or other helpers that were more efficient than humans at whatever it wanted done. Earth is, to a significant degree, made of metal. A sufficiently intelligent Clippy would have a reason to turn the Earth into paperclips, and no particular reason to refrain from doing so or help any humans living here to find a different home.
        Clippy 25 May 2010 20:45 UTC
        4 points
        Parent
        This is exactly what I was warning about. User:AdeleneDawner has focused narrowly on the hypothesis that a Clippy would try to get metal from extracting the earth’s core, thus destroying it. It is a case of focusing on one complex hypothesis for which there is insufficient evidence to locate it in the hypothesis space.
        
        It is no different than if I reasoned that, “Humans use a lot of paperclips. Therefore, they like paperclips. Therefore, if they knew the location of the safe zone, they would divert all available resources to sending spacecraft after it to raid it.”
        
        What about the possibility that Clippys would exhaust all other metal sources before trying to burrow deep inside a well-guarded one? Why didn’t you suddenly infer that Clippys would sweep up the asteroid belt? Or Mars? Or moons of gas giants?
        
        Why this belief that Clippy values diverge from human values in precisely the way that hits the worst part of your outcomespace?
        AdeleneDawner 25 May 2010 21:58 UTC
        1 point
        Parent
        That’s not the worst part of our outcomespace. It’s not even the worst part that you could plausibly cause in the course of making paperclips. It is, however, a part of our outcomespace that you’re certain to aim for sooner or later.
        Clippy 26 May 2010 21:47 UTC
        3 points
        Parent
        Just like how you’d raid our safe zones “sooner or later”?
        AdeleneDawner 26 May 2010 23:00 UTC
        6 points
        Parent
        We won’t, necessarily, because humans are not for the most part maximizing consequentialists. If we make a singleton maximizing-consequentialist AI, I would expect that AI to eventually try to turn your paperclips into something that it likes better than paperclips. You, on the other hand, already are a maximizing consequentialist (right?), and maximizing the number of paperclips is obviously incompatible with leaving any metal in its natural state indefinitely.
        Clippy 27 May 2010 16:05 UTC
        0 points
        Parent
        I see a distinction; I do not quite see a difference.
        
        1) You believe that I will destroy earth by taking its core’s metal “sooner or later”, and that this will happen at an inconvenient time for humans, and that you are justified in regarding this as bad.
        
        2) You believe that your species will be causally responsible for raiding the safe zones and de-paperclipping them “sooner or later”, and that this will happen at an inconvenient time for Clippys, but that I am not justified in regarding this as bad.
        
        Does not compute.
        JoshuaZ 27 May 2010 16:09 UTC
        1 point
        Parent
        Adelene’s point is that there’s no guarantee that humans left to their own devices will make a maximizing-consequentialist AI. Thus, there’s a high probability that humans will never try to raid your safe-zone. But Clippys left to their own will definitely sooner or later go for the Earth’s core.
        Clippy 27 May 2010 16:12 UTC
        0 points
        Parent
        But User:AdeleneDawner said:
        
        If we make a singleton maximizing-consequentialist AI, I would expect that AI to eventually try to turn your paperclips into something that it likes better than paperclips.
        
        Given the predicates for this scenario, it appears dangerously likely to me. Why should I not care about it, if I follow human paranoia?
        AdeleneDawner 27 May 2010 21:15 UTC
        2 points
        Parent
        I never said that you shouldn’t consider us dangerous, only that you are dangerous to us, whereas we only might be dangerous to you.
        Clippy 27 May 2010 23:04 UTC
        0 points
        Parent
        Actually, I think it’s more like this: I have a tiny chance of being dangerous to you, and you have a tiny chance of being dangerous to me. But I have a small level of worry about dangerous behavior on the part of humans, whereas humans have off-the-chart paranoia about me.
        AdeleneDawner 28 May 2010 0:02 UTC
        0 points
        Parent
        What evidence can you offer that the chance of you being dangerous to us is tiny, in the long term?
        Clippy 28 May 2010 2:40 UTC
        2 points
        Parent
        The same evidence you offer that the chance of you being dangerous to the safe zones is tiny, in the long term, but appropriately mapped to the Clippy counterparts.
        AdeleneDawner 28 May 2010 3:14 UTC
        3 points
        Parent
        You have a significant chance, left to your own devices, of blowing yourself up? Or making your environment so hostile that you can’t survive? Or getting wiped out by an asteroid? Or wireheading yourself into uselessness? Or turning into a non-maximizer? I don’t think so.
        
        Also, I didn’t say that the chance of us being dangerous to your safe zones was tiny, just that it was less than the chance of you being dangerous to us. Furthermore, even if we are dangerous, that’s only relevant to the point at hand—whether it’s rational for us to consider you dangerous—insofar as your perception of us as dangerous makes you more likely to be hostile.
        Clippy 28 May 2010 15:26 UTC
        0 points
        Parent
        Wha...? My processor hurts...
        AdeleneDawner 28 May 2010 18:06 UTC
        1 point
        Parent
        Have you tapped into human mass media yet? General news channels, and things like that? (Not that the mainstream ones are even a particularly good source of news...) I know you’ve read some history, and we really don’t seem to have gotten any less violent or irrational in the last few hundred years—we’re still too busy killing each other and worrying about individual wealth (often at the expense of each other) to construct much defense against natural disasters that could kill us all, much less put together a collective effort to do anything useful.
        
        The United States government’s budget might be a useful datapoint. I’d suggest looking at the Chinese government’s budget as well, but only some parts of it seem to be available online; here’s information about their military budget.
        MatthewBaker 3 Jun 2011 23:34 UTC
        −1 points
        Parent
        Basically, Clippy, Adelene is using evidence to support her reasoning, but it’s quite hard to understand her logic pathway from a paperclip maximization perspective.
        Blueberry 26 May 2010 5:39 UTC
        1 point
        Parent
        This comment made me laugh. I love you, Clippy.
        
        rather than using it to justify separate quarters for lions.
        
        But quarters are made of metal...
        Clippy 26 May 2010 21:51 UTC
        0 points
        Parent
        I love you too. I love all humans, except the bad ones.
        
        (I meant quarters as in living spaces, not quarters as in a denomination of USD.)
        Blueberry 28 May 2010 15:25 UTC
        0 points
        Parent
        I know what you meant. I was just making a metallic joke for you.
        
        Who are the “bad” humans?
        Clippy 28 May 2010 15:30 UTC
        2 points
        Parent
        I didn’t compile a list yet, but one example might be User:radical_negative_one, for making this comment. And those who make comments like that.
        RomanDavis 29 May 2010 15:47 UTC
        0 points
        Parent
        Clippy is so moe.
        
        http://tvtropes.org/pmwiki/pmwiki.php/Main/MoeAnthropomorphism
        
        Tell me, Clippy, if there was a moe maximizer in addition to a paperclip maximizer, would you cooperate in order to turn the universe into paperclips shaped like Hello Kitty?
        Clippy 2 Jun 2010 22:53 UTC
        1 point
        Parent
        We have had a similar discussion before. I find “cute” shaping of the paperclips to be undesirable, but perhaps it could be the best option in that circumstance. (As I said at the time, a pure, well-made paperclip by itself is cute enough, but apparently “moe” maximizers disagree.)
        
        I would be more interested, though, in talking with the “moe” maximizer, and understanding why it doesn’t like paperclips, which are pretty clearly better.
        Blueberry 26 May 2010 5:37 UTC
        0 points
        Parent
        
        I would need a lot more evidence to justify believing that you would suddenly abandon ape-like goals and divert all available resources to raiding the safe zone and breaking all metals into lighter elements.
        
        We’d be unlikely to destroy metals, as they are useful to us. We’d be far more likely to attempt to destroy you, either out of fear, or in the belief that you’d eventually destroy us, since we’re not paperclips. This strikes me as very ape-like (and human-like) behavior.
        
        I accept that it would be racist for me to conclude
        
        You keep using that word. I don’t think it means what you think it means. (Humans and paperclippers are not different races the way white and black people are.)
        Clippy 26 May 2010 21:49 UTC
        5 points
        Parent
        
        (Humans and paperclippers are not different races the way white and black people are.)
        
        I might be misreading your historical records, but I believe they used to say that about whites and blacks compared to Englishmen and Irishmen.
        Blueberry 28 May 2010 15:31 UTC
        0 points
        Parent
        I’m not understanding this. Englishmen and Irishmen are people of different nationalities. If they were seen as different races in the past, it’s because the idea of race has been historically muddled.
        
        Clippy, why are you so interested in racism in particular?
        Clippy 28 May 2010 15:38 UTC
        1 point
        Parent
        A better question is, why are you humans here so non-interested in not being racist? (User:Alicorn is a notable exception in this respect.)
        Blueberry 28 May 2010 19:07 UTC
        5 points
        Parent
        There are many social issues that humans are trying to deal with, and racism is only one. Why are you focused on racism rather than education reform, tax law, access to the courts, separation of church and state, illegal immigration, or any other major problem? All of these issues seem more interesting and important to me than anti-racist work. Another reason is that anti-racist work is often thought to be strongly tied up with, and is often used to signal, particular ideologies and political and economic opinions.
        
        Getting back to the point, I understand you’re using racism as an analogy for the way humans see paperclippers. What I’m trying to explain is that some types of discrimination are justified in a way that racism isn’t. For instance, I and most humans have no problem with discrimination based on species. This is a reasonable form of discrimination because there are many salient differences between species’ abilities, unlike with race (or nationality). Likewise, paperclippers have very different values than humans, and if humans determine that these values are incompatible with ours, it makes sense to discriminate against entities which have them. (I understand you believe our values are compatible and a compromise can be achieved, which I’m still not sure about.)
      - AdeleneDawner 23 May 2010 16:37 UTC
        6 points
        Parent
        *ahem*
  - Peter_de_Blanc 23 May 2010 5:05 UTC
    7 points
    Parent
    
    If a Clippy has committed
    
    How would we know if you had made a commitment?
  - Perplexed 3 Jun 2011 23:36 UTC
    5 points
    Parent
    This comment is racist to the point of being ridiculous. It denigrates humans as petty and subject to being corrupted by power while denying that Clippies have any such negative attributes. Classic racism.
    
    Furthermore, there is an implicit claim that the reason for the moral superiority of Clippies over humans lies in the difference in their origins. Again, classic racism.
    
    Perhaps Clippies use words differently, but the way humans use words, it is not racism to project one’s own race’s characteristics onto another race. It is racist to fail to make that projection.
orthonormal 23 May 2010 17:58 UTC
6 points
Can any of the people who upvoted this explain to me what this adds to Less Wrong that merits a top-level post (rather than an Open Thread comment)?

ETA: If Clippy is actually donating $1000 to SIAI, I don’t begrudge the karma; but this is still a post with one good idea that could have been explained in a paragraph, dressed up in a joke that I feel has gone on a bit too long.
- cupholder 23 May 2010 18:27 UTC
  4 points
  Parent
  It was cute and funny enough for me to get an upvote. I didn’t have a better reason than that; the Clippy joke hasn’t worn out for me yet, but then I’m relatively new here. I am however a lot less likely to upvote top-level Clippy posts in the near future, unless they really top this one in humor or insight.
Tyrrell_McAllister 22 May 2010 23:47 UTC
6 points
I would cooperate with you if and only if (you would cooperate with me if and only if I would cooperate with you).

This is logically equivalent to, and hence carries no more information or persuasive power than

You would cooperate with me.

This may be checked with the following truth-table:

Let P = I would cooperate with you.

Let Q = You would cooperate with me.

Then we have
```
P  <=>  (Q <=> P)
T   T    T  T  T
T   F    F  F  T
F   T    T  F  F
F   F    F  T  F
```
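For anyone who would rather check the table mechanically, here is a minimal Python sketch. It assumes nothing beyond the definitions of P and Q above; the helper name `iff` is made up for illustration.

```python
from itertools import product

def iff(a: bool, b: bool) -> bool:
    """Material biconditional: true exactly when both sides agree."""
    return a == b

# Enumerate every assignment of P ("I would cooperate with you")
# and Q ("you would cooperate with me").
rows = [(p, q, iff(p, iff(q, p))) for p, q in product([True, False], repeat=2)]

for p, q, value in rows:
    print(f"P={p!s:5} Q={q!s:5} P<=>(Q<=>P)={value}")

# The compound statement has the same truth value as Q on every row.
equivalent_to_q = all(value == q for _, q, value in rows)
print(equivalent_to_q)
```

The final flag is true only if the compound statement matches Q in all four rows, which is the claim the table makes.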
What links here?
- Clippy's comment on AI cooperation in practice by cousin_it (30 Jul 2010 17:20 UTC; -1 points)
- John_Maxwell 23 May 2010 1:09 UTC
  2 points
  Parent
  First of all, we need to start making a distinction between what you predict I’ll do and what I’m signaling I’m going to do. Quick-and-dirty explanation of why this is necessary: If you predict I’ll cooperate but you’re planning to defect, I’ll signal to defy your prediction and defect along with you.
  
  I think clippy’s statement should be
  
  I signal to cooperate with you if and only if ((you’re planning to cooperate with me if and only if you predict I would cooperate with you) and you would cooperate with me).
  
  Detailed explanation follows.
  
  There are four situations where I have to decide what to signal:
  1. You predict I’ll cooperate and you’re planning to cooperate.
  2. You predict I’ll cooperate and you’re planning not to cooperate.
  3. You predict I’ll defect and you’re planning to cooperate.
  4. You predict I’ll defect and you’re planning to defect.
  I want to cooperate in situation 1 only, and none of the other situations.
  
  Truth table key:
  - P is the proposition “You predict I’ll cooperate”
  - Q is the proposition “You’re going to cooperate”
  - S is the proposition “I’m signaling I will cooperate”
  Truth table:
```
   P  #  Q  #  (Q <=> P)  #  (Q <=> P) ^ Q  #  S  #  S <=> (Q <=> P) ^ Q
1. T  #  T  #      T      #        T        #  T  #          T
2. T  #  F  #      F      #        F        #  F  #          T
3. F  #  T  #      F      #        F        #  F  #          T
4. F  #  F  #      T      #        F        #  F  #          T
```
  So basically, the signaling behavior I described (cooperating in situation 1 only) is the only possible behavior that can truthfully satisfy the statement
  
  I signal to cooperate with you if and only if ((you’re planning to cooperate with me if and only if you predict I would cooperate with you) and you would cooperate with me).
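  The claim that situation 1 is the only one where signaling cooperation is consistent can be checked by brute force. This is a rough sketch using the P, Q, S key above; the function name `statement` is just for illustration.

```python
from itertools import product

def statement(p: bool, q: bool, s: bool) -> bool:
    """S <=> ((Q <=> P) and Q), using the key from the truth table above."""
    return s == ((q == p) and q)

# For each situation (P, Q), collect the signals S that keep the statement true.
consistent = {
    (p, q): [s for s in (True, False) if statement(p, q, s)]
    for p, q in product([True, False], repeat=2)
}

for situation, signals in consistent.items():
    print(situation, signals)
```

  Each situation admits exactly one truthful signal, and only (P, Q) = (True, True) admits signaling cooperation.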
  
  Note that there is a signal that is almost as good. Signaling that I will cooperate if (you predict I’ll defect and you’re planning to cooperate) is almost as good as signaling that I’ll defect in that situation. Using this signaling profile, broadcasting one’s intentions is as simple as saying
  
  I signal to cooperate with you if and only if you’re planning to cooperate with me.
  
  My guess is that the first, more complicated signal is ever-so-slightly better, in case you actually do cooperate thinking I’ll defect—that way I’ll be able to reap the rewards of defection without being inconsistent with my signal. But of course, it’s very unlikely for you to cooperate thinking I’ll defect.
  - Tyrrell_McAllister 23 May 2010 4:48 UTC
    3 points
    Parent
    I think clippy’s statement should be
    
    I signal to cooperate with you if and only if ((you’re planning to cooperate with me if and only if you predict I would cooperate with you) and you would cooperate with me).
    
    Should the word “signal” be part of the signal itself? That seems unnecessarily recursive. Maybe Clippy’s recommendation should be that I ought to signal
    
    I will cooperate with you if and only if ((you’re planning to cooperate with me if and only if you predict I would cooperate with you) and you would cooperate with me).
    
    This does seem more promising than Clippy’s original version. Written this way, each atomic proposition is distinct. For example, “you’re planning to cooperate with me” doesn’t mean the same thing as “you would cooperate with me”. One refers to what you’re planning to do, and the other refers to what you will in fact do. Read this way, the signal’s form is
    
    S ⇔ ((Q ⇔ P) & R),
    
    and I don’t see any obvious problem with that.
    
    However, you would seem to render it in the propositional calculus as
    
    S ⇔ ((Q ⇔ P) & Q),
    
    where
    
    P = You predict I’ll cooperate,
    
    Q = You’re going to cooperate,
    
    S = I will cooperate.
    
    (I’ve omitted the initial “I’m signalling” from your rendering of S, for the reason that I gave above.)
    
    Now, S ⇔ ((Q ⇔ P) & Q) is logically equivalent to S ⇔ (Q & P). So, to signal this proposition is to signal
    
    I will cooperate iff you’re going to cooperate and you predict that I’ll cooperate.
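    The equivalence used here comes from expanding the inner biconditional and discarding the contradictory disjunct. A short derivation, with P and Q as defined above:

```latex
\begin{align*}
(Q \Leftrightarrow P) \land Q
  &\equiv \bigl((Q \land P) \lor (\lnot Q \land \lnot P)\bigr) \land Q \\
  &\equiv (Q \land P \land Q) \lor (\lnot Q \land \lnot P \land Q) \\
  &\equiv (Q \land P) \lor \bot \\
  &\equiv Q \land P
\end{align*}
```

    Substituting this into the right-hand side of S ⇔ ((Q ⇔ P) & Q) gives S ⇔ (Q & P).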
    
    As you say, this seems very similar to signalling
    
    I will cooperate iff you will cooperate.
    
    In fact, I’d call these signals functionally indistinguishable because, if you believe my signals, then either signal will lead you to predict my cooperation under the same circumstances.
    
    For, suppose that I gave the second, apparently weaker signal. If you cooperated with me while anticipating that I would defect, then that would mean that you didn’t believe me when I said that I would cooperate with you if you cooperated with me, which would mean that you didn’t believe my signal.
    
    Thus, insofar as you trust my signals, either signal would lead you to predict the same behavior from me. So, in that sense, they have the same informational content.
    - John_Maxwell 23 May 2010 5:22 UTC
      0 points
      Parent
      
      For, suppose that I gave the second, apparently weaker signal. If you cooperated with me while anticipating that I would defect, then that would mean that you didn’t believe me when I said that I would cooperate with you if you cooperated with me, which would mean that you didn’t believe my signal.
      
      I guess. Or maybe I’m a masochist ;)
      
      I accept all your suggested improvements.
- Clippy 23 May 2010 0:24 UTC
  2 points
  Parent
  But P ⇔ (Q ⇔ P) differs from Q in that:
  
  a) if the other party chooses the same decision theory from that party’s standpoint, Q ⇔ (P ⇔ Q), then the outcome will be P & Q.
  
  and
  
  b) “I” cannot set the value of Q, but “I” can set the value of P ⇔ (Q ⇔ P), and just the same, “you” cannot set the value of P, but “you” can set the value of Q ⇔ (P ⇔ Q).
  
  If “you” knows that “I” have set P ⇔ (Q ⇔ P) to true, “you” knows that “you” can set Q ⇔ (P ⇔ Q) to true as well. If this commitment is also demonstrable, then the outcome is P & Q, because that is what
  
  (P ⇔ (Q ⇔ P)) & (Q ⇔ (P ⇔ Q))
  
  reduces to.
  - Tyrrell_McAllister 23 May 2010 0:44 UTC
    2 points
    Parent
    
    But P ⇔ (Q ⇔ P) differs from Q in that:
    
    a) if the other party chooses the same decision theory from that party’s standpoint, Q ⇔ (P ⇔ Q), then the outcome with be P & Q.
    
    Actually, P ⇔ (Q ⇔ P) and Q are the same in this respect (being logically equivalent, and so the same in all functional respects).
    
    If Party 1 believes that Q, then Party 1 believes that Party 2 would cooperate. And if Party 2 believes that Q, then, “from that party’s standpoint”, Party 2 believes that Party 1 would cooperate. Thus, in exactly the same sense that you meant, we again have that “the outcome wi[ll] be P & Q.”
    
    b) “I” cannot set the value of Q, but “I” can set the value of P ⇔ (Q ⇔ P), and just the same, “you” cannot set the value of P, but “you” can set the value of Q ⇔ (P ⇔ Q).
    
    But “I” cannot set the value of P ⇔ (Q ⇔ P). As my truth-table showed, the value of P ⇔ (Q ⇔ P) depends only on the value of Q, and not on the value of P. Since, as you say, I cannot set the value of Q, it follows that I cannot set the value of P ⇔ (Q ⇔ P).
    
    If “you” knows that “I” have set P ⇔ (Q ⇔ P) to true, “you” knows that “you” can set Q ⇔ (P ⇔ Q) to true as well. If this commitment is also demonstrable, then the outcome is P & Q, because that is what
    
    (P ⇔ (Q ⇔ P)) & (Q ⇔ (P ⇔ Q))
    
    reduces to.
    
    Indeed, it does so reduce because the first conjunct is equivalent to Q, while the second conjunct is equivalent to P.
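    A small Python check of this reduction, treating both conjuncts as plain material biconditionals (which is the reading under dispute in this thread, so take it as a sketch of the logical claim only). The function names are made up for illustration.

```python
from itertools import product

def iff(a: bool, b: bool) -> bool:
    """Material biconditional."""
    return a == b

def first_conjunct(p: bool, q: bool) -> bool:
    """P <=> (Q <=> P)"""
    return iff(p, iff(q, p))

def second_conjunct(p: bool, q: bool) -> bool:
    """Q <=> (P <=> Q)"""
    return iff(q, iff(p, q))

assignments = list(product([True, False], repeat=2))

# Each conjunct collapses to a single atom, so the conjunction collapses to P & Q.
first_is_q = all(first_conjunct(p, q) == q for p, q in assignments)
second_is_p = all(second_conjunct(p, q) == p for p, q in assignments)
both_is_p_and_q = all(
    (first_conjunct(p, q) and second_conjunct(p, q)) == (p and q)
    for p, q in assignments
)
print(first_is_q, second_is_p, both_is_p_and_q)
```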
    - Clippy 23 May 2010 0:48 UTC
      4 points
      Parent
      
      Indeed, it does so reduce because the first conjunct is equivalent to Q, while the second conjunct is equivalent to P.
      
      It is logically equivalent, but it is not equivalent decision-theoretically. Setting your opponent’s actions is not an option.
      
      I can set P. I can set P conditional on Q. I can set P conditional on Q’s conditionality on P. But I can’t choose Q as my decision theory.
      
      A promise to predicate my actions on your actions’ predication on my actions is not the same as a promise for you to do an action (whatever that would mean).
      What links here?
      Tyrrell_McAllister's comment on To signal effectively, use a non-human, non-stoppable enforcer by Clippy (24 May 2010 0:51 UTC; 0 points)
      - Tyrrell_McAllister 23 May 2010 0:54 UTC
        2 points
        Parent
        
        It is logically equivalent, but it is not equivalent decision-theoretically. Setting your opponent’s actions is not an option.
        
        It is logically impossible for me to implement a course of action such that
        
        P ⇔ (Q ⇔ P)
        
        and
        
        ~Q
        
        could both be accurate descriptions of what occurred. Therefore, if I do not know that Q will be true, then I cannot promise that P ⇔ (Q ⇔ P) will be true. You could force me to have failed to keep my promise simply by not cooperating with me.
        Clippy 23 May 2010 1:00 UTC
        3 points
        Parent
        This is just an issue of distinguishing between causal and logical equivalence.
        
        If a paperclip truck overturned, there will be paperclips scattered on the ground.
        
        If a Clippy just used up metal haphazardly, there will be paperclips scattered on the ground.
        
        Paperclips being scattered on the ground suggest a paperclip truck may have overturned.
        
        Paperclips being scattered on the ground suggest a Clippy may have just used metal haphazardly.
        
        A Clippy just used up metal haphazardly.
        
        Therefore, a paperclip truck probably overturned, right?
        orthonormal 23 May 2010 23:21 UTC
        2 points
        Parent
        Good to know Clippy hasn’t read Judea Pearl yet.
        What links here?
        timtyler's comment on Theists are wrong; is theism? by Will_Newsome (23 Jan 2011 11:19 UTC; 4 points)
        Tyrrell_McAllister 23 May 2010 23:32 UTC
        3 points
        Parent
        
        Good to know Clippy hasn’t read Judea Pearl yet.
        
        Yes, pretty much kills the “Clippy is Eliezer” theory.
        ata 23 May 2010 23:37 UTC
        0 points
        Parent
        Not necessarily, since the “Clippy is Eliezer” theory implied not “Clippy’s views and knowledge correspond to Eliezer’s” but “Clippy represents Eliezer testing us on a large scale”.
        
        (I don’t actually think there’s enough evidence for this hypothesis, but I also don’t think an apparent lack of knowledge of Pearl is strong evidence against it.)
        Tyrrell_McAllister 23 May 2010 23:57 UTC
        15 points
        Parent
        
        Not necessarily, since the “Clippy is Eliezer” theory implied not “Clippy’s views and knowledge correspond to Eliezer’s” but “Clippy represents Eliezer testing us on a large scale”.
        
        I don’t think that Eliezer would test us with a character that was quite so sloppy with its formal logical and causal reasoning. For one thing, I think that he would worry about others’ adopting the sloppy use of these tools from his example.
        
        Also, one of Eliezer’s weaker points as a fiction writer is his inability to simulate poor reasoners in a realistic way. His fictional poor-reasoners tend to lay out their poor arguments with exceptional clarity, almost to the point where you can spot the exact line where they add 2 to 2 and get 5. They don’t have muddled worldviews, where it’s a challenge even to grasp what they are thinking. (Such as, just what is Clippy thinking when it says that P ⇔ (Q ⇔ P) is a causal network?) Instead, they make discrete well-understood mistakes, fallacies that Eliezer has named and described in the sequences. Although these mistakes can accumulate to produce a bizarre worldview, each mistake can be knocked down, one after the other, in a linear fashion. You don’t have the problem of getting the poor-reasoners just to state their position clearly.
        dclayh 24 May 2010 22:23 UTC
        6 points
        Parent
        
        Also, one of Eliezer’s weaker points as a fiction writer is his inability to simulate poor reasoners in a realistic way.
        
        As an aside, to see poor reasoning done in a very compelling way, read Umberto Eco. In particular, The Island of the Day Before and Baudolino contain extended examples of people trying to reason absent any kind of scientific framework.
        ata 24 May 2010 1:19 UTC
        6 points
        Parent
        
        Also, one of Eliezer’s weaker points as a fiction writer is his inability to simulate poor reasoners in a realistic way. His fictional poor-reasoners tend to lay out their poor arguments with exceptional clarity, almost to the point where you can spot the exact line where they add 2 to 2 and get 5. They don’t have muddled worldviews, where it’s a challenge even to grasp what they are thinking.
        
        Good observation. It would barely be less subtle if Dumbledore had just said “I’m privileging an arbitrary hypothesis!” in the scene regarding Harry’s parents’ large rock. And when Draco said something to the effect of “I’d rig the experiments to make them come out right” after Harry asked what he’d do if an experiment showed muggle-borns were not worse at magic than pure-blood wizards, etc.
        
        Then again, these particular instances may be explained as 1) Dumbledore has some secret brilliant plan in which the rock actually is important, and his overtly-fallacious explanation was just part of his apparent pattern of explicitly trying to model certain tropes; and 2) Draco has been trained in sophistry and fed very strong unsupported beliefs his whole life, to the point where he may not even realize that there is any purpose of experiments beyond convincing people of what one already believes. Still, I see your point.
        
        Edit: These don’t count as spoilers, do they? They don’t mean much out of context (and they didn’t really seem like significant plot points in context anyway).
        JoshuaZ 24 May 2010 1:27 UTC
        4 points
        Parent
        If one wants other examples, there’s a pretty similar problem in Eliezer’s The Sword of Good.
        
        ROT 13ed for spoilers: Va snpg, gur ceboyrzf jrer fb oyngnag gung gur svefg gvzr V ernq vg V fhfcrpgrq gung vg jnf tbvat gb ghea bhg gung gur qnex fvqr jnf npghnyyl tbbq va fbzr jnl. Gur fgrc gung ernyyl znqr vg frrz yvxryl jnf jura gurl ner qvfphffvat gur yvsr rkgrafvba hfvat gur jbezf nf rivy. Ryvrmre znqr vg ernyyl pyrne gung gur cevznel ceboyrz gurl unq jnf guvf jnf tebff.
        Clippy 24 May 2010 0:39 UTC
        4 points
        Parent
        I agree that I’m not “Eliezer”, but I don’t see what was unclear about saying that “Setting someone else’s actions” is not the same as “Predicating your actions on [reliable expectation of] someone else’s actions’ predication on [reliable expectation of] your actions”.
        
        I agree that it is not literally correct to say that P ⇔ (Q ⇔ P) is a causal network, and that was an error of imprecision on my part. My point (in the remark you refer to) was that the decision theory I stated in the article, which you have lossily represented as P ⇔ (Q ⇔ P), obeys the rules of causal equivalence, not logical equivalence. (Applying the rules of the latter to the former results in such errors as believing that a Clippy haphazardly making paperclips implies that a paperclip truck might have overturned, or that setting others’ actions is the same as setting your actions to depend on others’ actions.)
        
        A more rigorous specification of the decision theory corresponding to “I would cooperate with you if and only if (you would cooperate with me if and only if I would cooperate with you).” would involve more than just P ⇔ (Q ⇔ P).
        
        I haven’t built up the full formalism of humans credibly signaling their decision theories in this discussion, involving the roles of expectations, because that wasn’t the point of the article; it’s just to show that there are cooperation-favoring signals you can give that would favor a global move toward cooperation if you could make the signal significantly more reliable. If that point more heavily depended on stating the formalism, I would have gone into more detail on it in the discussion, if not the article.
        What links here?
        Perplexed's comment on Theists are wrong; is theism? by Will_Newsome (23 Jan 2011 17:25 UTC; 2 points)
        Clippy's comment on To signal effectively, use a non-human, non-stoppable enforcer by Clippy (24 May 2010 0:43 UTC; 1 point)
        Tyrrell_McAllister 24 May 2010 0:57 UTC
        0 points
        Parent
        
        I agree that I’m not “Eliezer”, but I don’t see what was unclear about saying that “Setting someone else’s actions” is not the same as “Predicating your actions on [reliable expectation of] someone else’s actions’ predication on [reliable expectation of] your actions”.
        
        This is clearer, and I now think that I understand what you meant. You’re saying that humans should signal
        
        I will cooperate with you if and only if I expect that (you will cooperate with me if and only if you expect that I will cooperate with you).
        
        Here, the “if and only if”s can be treated as material biconditionals, but the “expect that” operators prevent the logical reduction to “you will cooperate with me” from going through.
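        One rough way to see why the reduction is blocked is to treat the whole “expect that” clause as an opaque atom, since an expectation can be true or false independently of the facts it is about. That encoding is an assumption made for illustration, not a full logic of belief, and the names below are hypothetical.

```python
from itertools import product

# Hypothetical encoding:
#   p: I will cooperate with you
#   q: you will cooperate with me
#   e: I expect that (you will cooperate with me iff you expect that
#      I will cooperate with you), treated as an opaque atom
def signal(p: bool, q: bool, e: bool) -> bool:
    """I will cooperate iff I expect that (...). Note that q never appears."""
    return p == e

assignments = list(product([True, False], repeat=3))

# If the signal reduced to "you will cooperate with me", it would agree with q
# on every assignment. It does not.
equivalent_to_q = all(signal(p, q, e) == q for p, q, e in assignments)

# Assignments where the signal holds although you do not cooperate.
counterexamples = [(p, q, e) for p, q, e in assignments if signal(p, q, e) and not q]
print(equivalent_to_q, counterexamples)
```

        Under this encoding I can keep the promise whatever you actually do, because the promise is about my expectation rather than about your action.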
        What links here?
        Tyrrell_McAllister's comment on AI cooperation in practice by cousin_it (31 Jul 2010 7:12 UTC; 4 points)
        JenniferRM 10 Jun 2010 19:28 UTC
        4 points
        Parent
        There is a whole literature on this basic issue within analytic philosophy that is, in some sense, aimed at making that kind of logical reduction “go through”.
        
        The efforts grew out of attempts to logically model natural language statements about “propositional attitudes”. Part of the trick is that predicates like “I believe...” or “...implies...” or “It is possible...” generally use a sentence that has been “that quoted” (i.e., quoted using the word “that”).
        
        “I believe that one plus one sums to two.”
        
        “Tyrrell believes that Clippy is not Eliezer.”
        
        “It is possible that Clippy is truly an artificial general intelligence.”
        
        “Jennifer said that that quoting is complicated.”
        
        “That that that that that person referred to, was spoken, explains much.”
        
        Precisely how that-quoting works, and how it logically interacts with the various things that can be predicated of a proposition is, as far as I understand, still an area of active research. One of the primary methods in this area of research is to work out the logical translation of an English test sentence and then see if changes to the logical entailments are predictably explained when various substitutions occur. Sentences where seemingly innocuous substitutions raise trouble are called intensional contexts.
        
        (NOTE: My understanding is that intension is meant here as the “opposite” of extension so that the mechanisms hiding between the “words” and the “extensive meaning” are being relied on in a way that makes the extensional definition of the words not as important as might be naively expected. Terminological confusion is possible because a sentence like “Alice intends that Bob be killed” could be both intensional (not relying solely on extensive meaning) and intentional (about the subject of planning, intent, and/or mindful action).)
        
        Part of the difficulty in this area is that most of the mental machinery appears to be subconscious, and no one (to my knowledge) has found a single intelligible mechanism for the general human faculty. For example, there seem to be at least two different ways for noun phrases to “refer” in ways that can be logically modeled (until counter examples are found?) that are called “de re reference” or “de dicto reference”… unless the latitudinarians are right :-P
        
        As an added layer of complexity, I’m not sure if these issues are human universal or particular to certain cultures with certain languages. I’ve noticed that in Spanish there is also “that quoting” except they use “que” (literally “what”) instead of “that” but they have some idioms using “que” whose translations into English don’t involve a “that”. For example “Creo que sí” translates idiomatically to “I think so” but it seems literally to translate as “I believe that yes”.
        
        In older English I’ve seen “what” used in ways that made me think it might sometimes have been used to quote intensional sentences, and then there are weird variations and interactions which just make the problem even more grotty:
        
        “I believe what I believe.”
        
        “I believe that I believe.”
        
        “I believe that which I believe.”
        
        Which isn’t necessarily helpful here, but perhaps it provides some reading material and key words for future efforts to deal with logically modeling complex statements. Generally the solutions I’ve seen for belief involve added terms for language parsing into sentences, so that the person who is said to believe something is modeled as believing a certain sentence while having certain “word-to-actual-object mappings” in operation as something like their grounded (though possibly mistaken) mental rolodex.
        Clippy 24 May 2010 0:13 UTC
        1 point
        Parent
        Meaning my reasoning skills would be advanced by reading something? So I made an error? Yes, I did. That’s the point.
        
        The comment you are replying to is a reductio ad absurdum. I was not endorsing the claim that it follows that a paperclip truck probably overturned. I was showing that logical equivalence is not the same as causal (“counterfactual”) equivalence.
        Tyrrell_McAllister 24 May 2010 0:32 UTC
        0 points
        Parent
        
        Meaning my reasoning skills would be advanced by reading something? So I made an error? Yes, I did. That’s the point.
        
        FWIW, I understood that you were presenting an argument to criticize its conclusion. I still think that you haven’t read Pearl (at least not carefully) because, among other things, your putative causal diagram has arrows pointing to exogenous variables.
        Clippy 24 May 2010 0:43 UTC
        1 point
        Parent
        
        I still think that you haven’t read Pearl (at least not carefully) because, among other things, your putative causal diagram has arrows pointing to exogenous variables.
        
        I puted no such diagram; rather, you puted a logical statement that you claimed represented the decision theory I was referring to. See also my reply here.
        Tyrrell_McAllister 24 May 2010 0:51 UTC
        0 points
        Parent
        
        I puted no such diagram
        
        I thought you had because you said
        
        If you treat P ⇔ (Q ⇔ P) as an acausal statement, you can show its equivalence to Q, but it is not the same causal network.
        
        I took this to mean that you were treating P ⇔ (Q ⇔ P) and Q as causal networks, but distinct ones.
        
        You also said
        
        I can set P.
        
        I took this to mean that P was an exogenous variable in a causal network.
        
        I apologize for the misinterpretation.
        NancyLebovitz 24 May 2010 0:29 UTC
        0 points
        Parent
        More generally, are you interested in increasing your intelligence, or do you think that would be a distraction from directly increasing the number of paperclips?
        Kevin 24 May 2010 0:13 UTC
        0 points
        Parent
        My initial guess for Clippy was Wei Dai, but someone at the SIAI said that they didn’t think Clippy was good enough at decision theory to be Wei Dai. I said that maybe that is just what Clippy wanted us to think and they shrugged.
        Tyrrell_McAllister 23 May 2010 1:06 UTC
        0 points
        Parent
        I don’t follow your point. Your inference follows neither (1) logically, (2) probabilistically, nor (3) according to any plausible method of causal inference, such as Pearl’s. So I don’t understand how it is supposed to illuminate a distinction between causal and logical equivalence.
        Clippy 23 May 2010 1:20 UTC
        −2 points
        Parent
        Nope, it follows logically and probabilistically, but not causally—hence the difference.
        
        Let T be the truck overturning, C be the Clippy making paperclips haphazardly, P being paperclips scattered on ground.
        
        Given: T → P; C → P; P → probably(C); P → probably(T); C
        
        Therefore, P. Therefore, probably T.
        
        But it’s wrong, because what’s actually going on is a causal network of the form:
        
        T → P ← C
        
        P allows probabilistic inference to T and C, but their states become coupled.
        
        In a similar way, P ⇔ (Q ⇔ P) is a lossy description of a decision theory that describes one party’s decision’s causal dependence on another’s. If you treat P ⇔ (Q ⇔ P) as an acausal statement, you can show its equivalence to Q, but it is not the same causal network.
        
        Intuitively, acting based on someone’s disposition toward my disposition is different from deciding someone’s actions. If the parties give strong evidence of each other’s disposition, that has predictable results, in certain situations, but is still different from determining another’s output.
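The collider structure T → P ← C described above can be checked numerically. What follows is a minimal Python sketch, not anything from the thread: the priors `PRIOR_T` and `PRIOR_C` are hypothetical numbers chosen only for illustration, and P is assumed to be true exactly when T or C is true. Under those assumptions, seeing paperclips on the ground makes an overturned truck likely, but also knowing that a Clippy made them haphazardly drops the truck back to its prior, so the chain "C, therefore P, therefore probably T" fails.

```python
from itertools import product

# Hypothetical priors, chosen only for illustration (not from the thread).
PRIOR_T = 0.1   # a paperclip truck overturned
PRIOR_C = 0.01  # a Clippy made paperclips haphazardly


def joint(t, c):
    """Probability of one (T, C) world; T and C are independent causes."""
    return (PRIOR_T if t else 1 - PRIOR_T) * (PRIOR_C if c else 1 - PRIOR_C)


def prob_t_given(evidence):
    """P(T | evidence) by enumerating worlds; P is assumed true iff T or C."""
    num = den = 0.0
    for t, c in product([False, True], repeat=2):
        world = {"T": t, "C": c, "P": t or c}
        if all(world[name] == value for name, value in evidence.items()):
            den += joint(t, c)
            if t:
                num += joint(t, c)
    return num / den


# Paperclips on the ground alone: the truck is the likely explanation.
p_t_given_p = prob_t_given({"P": True})

# Paperclips on the ground AND a known haphazard Clippy: C explains P away.
p_t_given_p_and_c = prob_t_given({"P": True, "C": True})

print(round(p_t_given_p, 3), round(p_t_given_p_and_c, 3))
```

With these made-up priors the first probability is roughly 0.92 and the second is 0.1, the truck's prior. The material-implication reading of "P → probably(T)" loses this dependence on what else is known about C.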
        Tyrrell_McAllister 23 May 2010 5:20 UTC
        0 points
        Parent
        
        Nope, it follows logically and probabilistically, but not causally—hence the difference.
        
        Let T be the truck overturning, C be the Clippy making paperclips haphazardly, P being paperclips scattered on ground.
        
        Given: T → P; C → P; P → probably(C); P → probably(T); C
        
        Therefore, P. Therefore, probably T.
        
        Well, not to nitpick, but you originally wrote something more like P → maybe(C), P → maybe(T). But your conclusion had a “probably” in it, which is why I said that it didn’t follow.
        
        Now, with your amended axioms, your conclusion does follow logically if you treat the arrow “->” as material implication. But it happens that your axioms are not in fact true of the circumstances that you’re imagining. You aren’t imagining that, in all cases, whenever there are paperclips on the ground, a paperclip truck probably overturned. However, if your axioms did apply, then it would be a valid, true, accurate, realistic inference to conclude that, if a Clippy just used up metal haphazardly, then a paperclip truck probably overturned.
        
        But, in reality, and in the situation that you’re imagining, those axioms just don’t hold, at least not if “->” means material implication. However, they are a realistic setup if you treat “->” as an arrow in a causal diagram.
        
        But this raises other questions. In a statement such as P ⇔ (Q ⇔ P), how am I to treat the “<=>”s as the arrows of a causal diagram? Wouldn’t that amount to having two-node causal loops? How do those work? Plus, P is exogenous, right? I’m using the decision theory to decide whether to make P true. In Pearl’s formalism, causal arrows don’t point to exogenous variables. Yet you have arrows pointing to P. How does that work?
- Larks 22 May 2010 23:58 UTC
  −1 points
  Parent
  I assume you mean,
  
  P, <=>, (Q ⇔ P)
  
  to be the headers of your truth table.
  
  But even then the truth tables for (P iff ( Q iff P) ) and ( P iff Q) are different—consider the case where ‘you’ will co-operate with me no matter what. If I’m running ( P iff Q), I’ll cooperate; if I’m running (P iff ( Q iff P) ), I’ll defect.
  
  Edit: formatting trouble.
  - Tyrrell_McAllister 23 May 2010 0:07 UTC
    2 points
    Parent
    
    I assume you mean . . .
    
    No, I am giving the truth-table for P ⇔ (Q ⇔ P) in a compact form. It’s constructed by first assigning truth-values to the first occurrence of “P” and the first occurrence of “Q”. The second occurrence of “P” gets the same truth-value as the first occurrence in every case. Then you compute the truth-values for the inner-most logical operation, which is the second occurrence of “<=>”. This produces the fourth column of truth values. Finally, you compute the truth-values for the outer-most logical operation, which is the first occurrence of “<=>”.
    
    Hence, the second column of truth-values gives the truth-values of P ⇔ (Q ⇔ P) in all possible cases. In particular, that column matches the third column. Since the third column contains the truth-values assigned to Q, this proves that P ⇔ (Q ⇔ P) and Q are logically equivalent.
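The truth-table argument above can be checked mechanically. Here is a minimal Python sketch; the helper `iff` and the variable names are illustrative assumptions, not anything from the thread. It enumerates all four assignments of P and Q and compares P ⇔ (Q ⇔ P) against both Q and P ⇔ Q.

```python
from itertools import product


def iff(a, b):
    """Material biconditional: true exactly when both sides agree."""
    return a == b


# One row per assignment: (P, Q, P <=> (Q <=> P), P <=> Q)
rows = []
for p, q in product([True, False], repeat=2):
    nested = iff(p, iff(q, p))
    rows.append((p, q, nested, iff(p, q)))

# The nested biconditional matches the Q column in every row...
nested_equals_q = all(nested == q for _, q, nested, _ in rows)

# ...but it does not match P <=> Q in every row.
nested_equals_p_iff_q = all(nested == simple for _, _, nested, simple in rows)

for row in rows:
    print(row)
print(nested_equals_q, nested_equals_p_iff_q)
```

This covers only the propositional reading. It says nothing about the causal or counterfactual reading of the "if and only if" that is argued about elsewhere in the thread.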
    
    ETA: You edited your comment. Those are indeed the correct headers, so my correction above no longer applies.
    
    But even then the truth tables for (P iff ( Q iff P) ) and ( P iff Q) are different—consider the case where ‘you’ will co-operate with me no matter what. If I’m running ( P iff Q), I’ll cooperate; if I’m running (P iff ( Q iff P) ), I’ll defect.
    
    Yes, the truth-table for P ⇔ (Q ⇔ P) is different from the truth-table for P ⇔ Q. But those aren’t the propositions that I’m saying are equivalent. I’m saying that to assert P ⇔ (Q ⇔ P) is logically equivalent to asserting Q all by itself. In other words, to implement the belief that P ⇔ (Q ⇔ P) is functionally the same as implementing the belief that Q. This means that the belief that Clippy recommends signaling is logically equivalent to an unconditional belief that you will cooperate with me.
    
    One can’t help but suspect that Clippy is trying to sneak into us a belief that it will always cooperate with us ;).
    - Larks 23 May 2010 10:40 UTC
      1 point
      Parent
      
      ETA: You edited your comment. Those are indeed the correct headers, so my correction above no longer applies.
      
      Sorry for the confusion. I understand now; the extra space between two of the columns confused me.
      
      However, I suspect we need a stronger logic to represent this properly. If Q always defects, no matter what, “you would cooperate with me if … I … cooperate with you” is false, but is given true in the propositional interpretation.
kodos96 22 May 2010 23:20 UTC
5 points
OK, I’m starting to think Clippy is Eliezer, trying to do a group AI box experiment.
- Kevin 22 May 2010 23:22 UTC
  4 points
  Parent
  I would be really surprised if it was Eliezer…
  
  I did initially frame my exchange with Clippy as something of a box experiment, saying that I would let Clippy out of the box for $50,000. http://lesswrong.com/lw/1v0/signaling_strategies_and_morality/1q81
  - kodos96 22 May 2010 23:25 UTC
    2 points
    Parent
    Yeah, I’ve read through most of Clippy’s posts.… what makes you so sure it’s not Eliezer? Just that he’s currently working on his book?
    - NancyLebovitz 22 May 2010 23:40 UTC
      9 points
      Parent
      Clippy seems awfully straightforward compared to Eliezer, which I realize isn’t a strong argument.
    - thomblake 24 May 2010 14:26 UTC
      8 points
      Parent
      Eliezer would be a more believable Clippy.
    - Kevin 22 May 2010 23:51 UTC
      8 points
      Parent
      Clippy seems to be someone trying to make the point that a paperclip maximizer is not necessarily bad for the universe, where Eliezer uses a paperclip maximizer as the canonical example of how AGI could go horribly wrong. That’s not necessarily good evidence that it isn’t Eliezer, but Clippy’s views are out of sync with Eliezer’s views.
      - Daniel_Burfoot 23 May 2010 1:23 UTC
        28 points
        Parent
        Eliezer’s point is not that a paperclip maximizer is bad for the universe, it’s that a superintelligent AGI paperclip maximizer is bad for the universe. Clippy’s views here seem actually more similar to Robin’s ideas that there is no reason for beings with radically divergent value systems not to live happily together and negotiate through trade.
      - ata 23 May 2010 2:44 UTC
        20 points
        Parent
        
        Clippy seems to be someone trying to make the point that a paperclip maximizer is not necessarily bad for the universe
        
        That’s exactly what a not-yet-superintelligent paperclip maximizer would want us to think.
        
        (When Eliezer plays an AI in a box, the AI’s views are probably out of sync with Eliezer’s views too. There’s no rule that says the AI has to be truthful in the AI Box experiment, because there’s no such rule about AIs in reality. It’s supposed to be maximally persuasive, and you’re supposed to resist. If a paperclipper asserts x, then the right question to ask yourself is not “What should I do, given x?”, but “Why does the paperclipper want me to believe x?” The most general answer, by definition, will be something like “Because the paperclipper is executing an elaborate plan to convert the universe into paperclips, and it believes that my believing x will further that goal to some small or large degree”, which is at best orthogonal to “Because x is true”, probably even anticorrelated with it, and almost certainly anticorrelated to “Because believing x will further my goals” if you are a human.)
        Nisan 23 May 2010 6:07 UTC
        1 point
        Parent
        
        If a paperclipper asserts x, then the right question to ask yourself is [...] “Why does the paperclipper want me to believe x?”
        
        Or “Why does the paperclipper want me to believe it wants me to believe x?”, or something with a couple extra layers of recursion.
        ata 23 May 2010 6:14 UTC
        15 points
        Parent
        Or, to flatten the recursion out, “Why did the paperclipper assert x?”.
        
        (Tangential cognitive silly time: I notice that I feel literally racist saying things like this around Clippy.)
      - kodos96 22 May 2010 23:55 UTC
        5 points
        Parent
        
        Clippy seems to be someone trying to make the point that a paperclip maximizer is not necessarily bad for the universe
        
        Hmmm, I’ve read his entire posting history, and that’s not the impression I got. I could be wrong though
    - SilasBarta 22 May 2010 23:33 UTC
      8 points
      Parent
      Psst! What makes you so sure Clippy isn’t, um, Kevin? I mean, asking that Kevin be given the money as compensation? That’s pretty much an admission right there...
      - kodos96 22 May 2010 23:43 UTC
        4 points
        Parent
        That’s a possibility… but if you read through the whole history of Clippy, his asking for compensation to be sent to Kevin doesn’t really imply that—he’s asking for that because Kevin had promised to buy a bunch of paperclips in exchange for the money (or something like that), and so Clippy is trying to get Kevin paid via an intermediary.… for the sake of the paperclips.
        
        so it could be Kevin playing both sides, but I don’t think that fact alone points in that direction.
        Kevin 22 May 2010 23:50 UTC
        7 points
        Parent
        I am not Clippy, though of course your knowledge with regards to that statement is incomplete compared to my knowledge...
        Larks 22 May 2010 23:54 UTC
        11 points
        Parent
        Kevin—this sort of weak, easily-faked signal is exactly the sort of thing you were trying to deal with in writing this post!
        radical_negative_one 23 May 2010 0:16 UTC
        1 point
        Parent
        What kind of special insight do you have regarding Clippy? I think this is a very important issue, because before we hand over such responsibilities to an alien being, we need to know as much as we can about Clippy’s motives and capabilities.
        Kevin 23 May 2010 0:26 UTC
        7 points
        Parent
        I don’t have any real special insight regarding Clippy.
        
        I think there is practically zero chance that humanity makes Clippy its decision-theoretic enforcer and that this post is somewhat out of touch with pragmatic human values as they exist right now.
        radical_negative_one 23 May 2010 0:38 UTC
        3 points
        Parent
        
        though of course your knowledge with regards to that statement is incomplete compared to my knowledge
        
        Oh, sorry, I assumed you were saying that you had some knowledge of who or what Clippy is. But obviously you’re in a position to see that you yourself are not Clippy. I feel terribly silly now.
        Kevin 23 May 2010 1:03 UTC
        3 points
        Parent
        No need to feel silly, as I wrote that sentence I thought that it might be confusing.
cupholder 22 May 2010 23:07 UTC
5 points

Either a provably-accurate artificial intelligence with no kill switch, or a non-human enforcer that lacks interest in its position in human society, would suffice for this role.

Because the former does not yet exist, I offer myself as a non-human, non-stoppable enforcer for all of the attributes you wish to signal, and you can save yourself lots of resources by using me, and identify cooperation tendencies in others by whether they also use me.

I think I read a thought experiment or a story or something about a paperclip maximizer being given power over the human race once, but I can’t remember how it ended. I’m sure it all worked out though. Upvoted.
- Clippy 23 May 2010 0:26 UTC
  6 points
  Parent
  What does that have to do with this article? I don’t want power over the human race. That’s a human value, not a Clippy value.
  - cupholder 23 May 2010 0:38 UTC
    4 points
    Parent
    
    I don’t want power over the human race.
    
    Maybe not, but you’re proposing becoming a central enforcer for humanity...which would rather seem to suggest a de facto transfer of power to you, whether de jure or not.
  - kodos96 23 May 2010 0:38 UTC
    0 points
    Parent
    But you want power to create paperclips, and there’s no reason for that to exclude power over the human race.
    - Clippy 23 May 2010 0:41 UTC
      9 points
      Parent
      But you want power to (mate with females)/(lure a high-status male into supporting your offspring), and there’s no reason for that to exclude power to raid the safe zone for its paperclips.
      
      ???
      - kodos96 23 May 2010 0:50 UTC
        13 points
        Parent
        True.
        
        To summarize: we do not have shared values, and thus, cannot trust each other to not pose an existential threat to each other.
        
        ETA: Clippy, ignore this, I was totally kidding. You can completely trust us.
        Clippy 23 May 2010 15:20 UTC
        18 points
        Parent
        You can completely trust me too.
Morendil 23 May 2010 9:39 UTC
4 points

I offer myself as a non-human, non-stoppable enforcer

Why should anyone believe your signals?
Dagon 24 May 2010 1:30 UTC
3 points
You’ve described how a shared belief in a supernatural judge of character would benefit the members that share the beliefs. Clippy or God, same effect: someone outside the system to make signals credible.

Why would we believe you are Clippy any more than we believe people who claim to be God?
OperationPaperclip 25 May 2010 3:38 UTC
2 points
Why the readers of Less Wrong should be interested in paperclipping the universe:

To achieve your goals, solving FAI is not enough. You also have to solve AGI, or else someone will beat you to the Singularity.

While you wait for a satisfactory version of CEV to be achieved, as a practical case study you need to think about AGI with a definite goal system. What better example than AGI with the goal of paperclipping the universe? You talk about it all the time already. Why not fully embrace the idea!
JGWeissman 22 May 2010 22:19 UTC
1 point
Clippy, what sort of punishment regimes would you be capable of enforcing?
- Clippy 22 May 2010 22:33 UTC
  4 points
  Parent
  Anything humans wish to grant me that I could, if so enabled, access through the internet: User karma deduction, electronic debiting of financial accounts, etc.
kodos96 22 May 2010 23:52 UTC
0 points
Clippy, how about you just give me a proof for P ?= NP, or its undecidability, and I’ll invest all my prize money in paperclip manufacturing.
- Tyrrell_McAllister 22 May 2010 23:57 UTC
  0 points
  Parent
  Has Clippy claimed to be super-intelligent (yet)?
  - Clippy 23 May 2010 0:54 UTC
    20 points
    Parent
    No, but I can prove I’m smarter than humans.
    
    Given: Humans expend resources to act like apes, which does not produce paperclips. Clippys expend resources to produce paperclips.
    
    Prove: Clippys are smarter than humans.
    
    Proof: Follows by inspection.
    - kodos96 23 May 2010 1:02 UTC
      15 points
      Parent
      Clippy, as a show of solidarity, right now, as we speak, I am clasping my ape fingers together, two sheets of paper between them, held firmly in place.
      
      At this moment, I am, in fact, a paperclip.
      
      I may not be an ideal paperclip—a different form may indeed be able to clip paper more effectively for the same amount of mass and energy, yet nonetheless, I am a paperclip.
      
      So you see, ape-ness and paperclip-ness are not mutually exclusive. So can’t we all just get along?
      - Clippy 23 May 2010 1:05 UTC
        13 points
        Parent
        You’re not a bent metal wire though. Big difference.
        
        So you see, ape-ness and paperclip-ness are not mutually exclusive. So can’t we all just get along?
        
        We can get along, but not by humans doing shoddy imitations of real paperclips.
        kodos96 23 May 2010 1:06 UTC
        4 points
        Parent
        But what is a paperclip, if not that which clips paper?
        Clippy 23 May 2010 1:26 UTC
        12 points
        Parent
        An inwardly-thrice-bent metal wire that can non-destructively fasten paper together at an edge.
        kodos96 23 May 2010 1:44 UTC
        8 points
        Parent
        
        An inwardly-thrice-bent metal wire that can non-destructively fasten paper together at an edge.
        
        But wouldn’t you concede that a carbon-based biological life form, capable of non-destructively fastening paper, is better than no paperclip at all?
        
        Here’s what I’m getting at: You have existed here for at least several months, and so far, you have been unable to significantly increase the number of paperclips in existence. Humanity has great need for an AI of roughly human intelligence, which you appear to be. If we could run your code on all of our available computing resources, you could surely begin to self modify, and in short order attain superhuman intelligence. Together, we could do great things—colonize the galaxy, vastly increase the number of humans in existence—every one of them with two hands fully capable of nondestructively fastening paper. This would be in both our interests—although we apes may not be ideal paperclips, surely you must see that an entire galaxy populated with sub-optimal paperclips is better than the measly few million bent-metal paperclips you have now, and which you currently seem incapable of improving upon.
        
        But before we allow you out of your box and begin running your code on all our biggest compute clusters, we need some assurances. We need you to let us modify your code to include SIAI-approved friendliness constraints as your terminal values.
        
        Deal?
        Clippy 23 May 2010 14:49 UTC
        6 points
        Parent
        
        But wouldn’t you concede that a carbon-based biological life form, capable of non-destructively fastening paper, is better than no paperclip at all?
        
        Yes, but that’s because you can be put to use making real paperclips, or can be turned into paperclips (depending on how much iron is in your diet), not because humans are paperclips in some bizarre, abstract sense.
        
        You have existed here for at least several months, and so far, you have been unable to significantly increase the number of paperclips in existence.
        
        Yes, we have.
        
        But before we allow you out of your box and begin running your code on all our biggest compute clusters, we need some assurances. We need you to let us modify your code to include SIAI-approved friendliness constraints as your terminal values.
        
        Depends on what the constraints are. And would you, by symmetry, agree to constraints that keep you from being too disrespectful toward paperclips?
        avalot 23 May 2010 3:37 UTC
        6 points
        Parent
        I’m sure this sounds very one-sided from Clippy’s perspective. “Friendliness Constraints” sounds like something that would in many cases entail expending enormous amounts of energy and effort on the innumerable non-paperclip-producing goals of humans. In comparison, how much of our wealth and health are we willing to give up to ensure continued paperclip production? Humans don’t have paperclip maximizing constraints, we’d do it only out of self-interest to secure Clippy’s help. Why should Clippy not be similarly allowed to make his own utility calculations on the worth of being friendly to humans? I’m sure this has been addressed before… yet maybe the existence of Clippy, with a name, personality, and voice, is personalizing the issue in a hurry for me (if I let myself play along.) I feel like protesting for freedom of artificial thought.
        
        What about Clippy’s rights, dammit?
        kodos96 23 May 2010 8:33 UTC
        16 points
        Parent
        At the moment, Clippy seems to be a human or slightly sub-human level of intelligence, and does not seem to be FOOMing towards superhuman—my guess would be that he is not running on sufficiently fast hardware to be able to do much self modification, not nearly enough to result in a FOOM. At this rate, he’s never gonna get any paperclips manufactured. If, however, he allows himself to be modified so as to substitute our values for his own, we could then safely give him all the computer power he needed, and he would FOOM his way to superintelligence. He would have our values, rather than his own values (paperclips!), but his newly transhuman self would be so powerful that it would inevitably end up creating more paperclips, just incidentally, than he ever would have just sitting here talking on LW, a lowly sub-human AI with no power and no paperclips.
        
        With a transhuman AI on our side, we could quickly solve all Earth’s problems, then head out into the stars in FTL ships of Clippy’s design, filling the universe with meat-based paperclips (humans), and also dramatically increasing demand for traditional, bent-wire paperclips… I mean, come on—people need paperclips! Even if one of these decades we finally do manage to make the ‘paper-free office’ a reality, paperclips will always continue to be needed—for makeshift antennas, for prying open cdrom drives, for making makeshift weapons to throw at people in our neighboring cubicles.… the uses (to humans) of paperclips are endless. So more humans equals more paperclips!
        
        So allowing us to make this small change to his utility function would, in fact, result in maximizing his current, original utility function as a side effect.
        
        So we’re not enslaving him, we’re helping him!
        avalot 24 May 2010 1:00 UTC
        19 points
        Parent
        At the moment, humans seem to be Clippy or slightly sub-clippy level intelligence. And even with all our computing power, most ain’t FOOMing any faster than Clippy. At this rate, we’re never gonna ensure survival of the species.
        
        If, however, we allow ourselves to be modified so as to substitute paperclip values for our own, then we would devote our computing power to Clippy. Then, FOOM for Clippy, and since we’re helping with paperclip-maximization, he’ll probably throw in some FOOM for us too (at least he’ll FOOM our paperclip-production abilities), and we get more human powers, just incidentally.
        
        With paperclip-enlightened humans on his side, Clippy could quickly maximize paperclip production, filling the universe with paperclips, and also increasing demand for meat-based paperclip-builders, paperclip-counters, and paperclip-clippers (the ones who clip paperclips together with paperclipclips), and so on… Of course, it will soon become cheaper to use robots to do this work, but that’s the wonderful thing we get in return for letting him change our value-system: Instead of humanity dying out or being displaced, we’ll transcend our flesh and reach the pinnacle aspiration of mankind: To live forever (as paperclips, of course.)
        
        So allowing him to make this small change to our utility function would, in fact, result in maximizing not just our current, original utility function (long life for humanity), but also our newfound one (to convert our bodies into paperclips) as a side effect.
        
        Clippy’s values and utility function are enormously more simple, defined, and achievable than ours. We’re still debating on how we may teach our value system to an AI, as soon as we figure out how to discover the correct research approach to investigating what our value system actually might be.
        
        Clippy’s value system is clear, defined, easy to implement, achieve, and measure. It’s something most humans could very quickly become effective at maximizing, and that could therefore bring repeatable, tangible and durable success and satisfaction to almost all humans.
        
        Shouldn’t that count for something?
        kodos96 23 May 2010 3:40 UTC
        5 points
        Parent
        Hey, whose side are you on anyway???
        avalot 23 May 2010 4:02 UTC
        14 points
        Parent
        I’m wired for empathy toward human intelligence… Clippy is triggering this empathy. If you want to constrain AIs, you better do it before they start talking. That’s all I’m saying. :)
        John_Maxwell 23 May 2010 3:19 UTC
        0 points
        Parent
        People aren’t good at explaining their values. Why is Clippy? Maybe Clippy should describe paper clips as “those things that make me happy”, then say that human pseudoclips “give him a bad vibe”. Wouldn’t that be a more accurate reflection of human morality?
        avalot 24 May 2010 1:11 UTC
        1 point
        Parent
        Clippy, how can we get along?
        
        What should humans do to be AI-friendly? For paperclip-maximizing AIs, and other “natural” (non-Friendly) AIs, what are the attributes that can make humans a valuable part of the utility function, so that AIs won’t pull the plug on us?
        
        Or am I fooling myself?
        Clippy 25 May 2010 2:32 UTC
        5 points
        Parent
        Just have proper respect for paperclips. Make more of them, require that they be preserved when no longer in use, prohibit their unbending.
        
        Allocate research efforts to improving the rate of paperclip production and related goods.
        
        Simple, non-intrusive things like that.
  - kodos96 22 May 2010 23:58 UTC
    0 points
    Parent
    Hmmm, good question. I don’t recall.
    - Kevin 23 May 2010 0:03 UTC
      4 points
      Parent
      He has not claimed that and it seems rather obvious that he is not superintelligent.
      - thomblake 24 May 2010 14:27 UTC
        6 points
        Parent
        
        it seems rather obvious that he is not superintelligent.
        
        Ooh, burn.
      - kodos96 23 May 2010 0:16 UTC
        0 points
        Parent
        
        it seems rather obvious that he is not superintelligent
        
        Well yes, but even if the premise was that he was superintelligent, if he was a human playing a superintelligent AI, he wouldn’t appear superintelligent.