Telling the truth is an expression of trust, in addition to being a way to earn it: telling someone something true that could be misused is saying “I trust you to behave appropriately with this information”. The fact that I would lie to the brownshirts as convincingly as possible shouldn’t cause anyone else to mistrust me as long as 1) they know my goals; 2) I know their goals and they know that I do; 3) our goals align, at least contextually; and 4) they know that I’m not just a pathological liar who’ll lie for no reason. The Nazis will be misled about (1), because that’s the part of their knowledge I can manipulate most directly, but anyone with whom I share much of a trust relationship (the teenage daughter playing the piano, perhaps) will know better, because they’ll be aware that I’m sheltering Jews and lying to Nazis.
The fact that I would lie to save the world should only cause someone to mistrust my statements on the eve of the apocalypse if they think that I think that they don’t want to save the world.
Edited to not sound like I know what Eliezer is thinking:
In the Nazi example, there are only three likely options: Nazi, anti-Nazi, or self-interested. If non-Nazi C sees person A lie to Nazi B, C can infer, with a high degree of certainty, that A is on the non-Nazi side. Being caught lying this way increases A’s trustworthiness to C.
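For concreteness, here is a minimal Python sketch of C’s update, with entirely made-up priors and likelihoods (none of these numbers come from the discussion):

```python
# C's prior over A's allegiance, and how likely each type of person is
# to lie convincingly to a Nazi officer. All numbers are invented.
prior = {"nazi": 0.3, "anti_nazi": 0.3, "self_interested": 0.4}
p_lie = {"nazi": 0.01, "anti_nazi": 0.9, "self_interested": 0.2}

# Bayes' rule: P(type | lied) is proportional to P(lied | type) * P(type).
joint = {t: p_lie[t] * prior[t] for t in prior}
total = sum(joint.values())
posterior = {t: round(joint[t] / total, 3) for t in joint}

print(posterior)
# {'nazi': 0.008, 'anti_nazi': 0.765, 'self_interested': 0.227}
# Seeing A lie to the Nazi raises C's credence that A is anti-Nazi
# from 0.3 to about 0.77: the lie makes A *more* trustworthy to C.
```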
Radical honesty is a policy for when one is in a more complicated situation, in which there are many different sides, and there’s no way to figure out what side someone is on by process of elimination.
In Eliezer’s situation in particular, which probably motivates his radical-honesty policy, some simple inferences from his observed opinions on his own intelligence versus that of everyone else in the world would lead one to assign a high prior probability that he will mislead people about his intentions. Additionally, he wants to get money from people who will ask him what he is doing and yet are incapable of understanding the answer; so it hardly seems possible for him to answer “honestly”, or even to define what that would mean. Most questions asked about the goals of the SIAI are probably some variation on “Have you stopped beating your wife?”
Radical honesty is one way of dealing with this situation; it is a rational variation on revenge strategies. People sometimes try to signal that they are hot-tempered, irrational people who would take horrible revenge on those who harm them, even when doing so would be irrational. Naive radical honesty is, likewise, the attempt, say by religious people, to convince you that they will be honest with you even when it’s irrational for them to be. Radical rational honesty is a game-theoretic argument that doesn’t require the radically honest person (RHP) to commit to irrationality. It tries to convince you that radical honesty is rational (or at least that RHP believes it is), and that RHP can therefore be trusted to be honest at all times.
And it all collapses if RHP tells one lie to anybody. The game-theoretic argument needed to justify the lie would become so complicated that no one would take the time to understand it, and so it would be useless.
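To see why one observed lie can outweigh its one-shot gain, it helps to put toy numbers on the tradeoff. A sketch in Python, with illustrative payoffs that are not anyone’s actual model:

```python
# Illustrative payoffs only: one convenient lie now, versus the discounted
# stream of being believed that the radical-honesty reputation buys.
one_shot_gain = 10          # what a single convenient lie gets you now
trust_value_per_round = 5   # value of being believed, each future round
discount = 0.9              # weight on each successive round
rounds = 50                 # horizon of the relationship

# Value of an intact reputation over the horizon (a geometric series).
reputation_value = sum(trust_value_per_round * discount**t
                       for t in range(rounds))

print(f"reputation stream: {reputation_value:.1f}")   # about 49.7
print(f"lie pays only if {one_shot_gain} > {reputation_value:.1f}")
# If one detected lie zeroes out credibility, the lie has to beat the
# entire forfeited stream, and a justification too complicated for
# observers to follow restores none of it.
```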
(Of course nobody can be honest all the time in practice; of course observers will make some allowance for “honest dishonesty” according to the circumstances.)
The hell of it is that, after you make this game-theoretic argument, somebody comes along and asks you if you would lie to Nazis to save Anne Frank. If you say yes, then they can’t trust you to be radically honest. And if you say no, they decide they wouldn’t trust you because there’s something wrong with you.
Because radical honesty is a game-theoretic argument, you could delimit a domain in which you will be radically honest, and reserve the right to lie outside that domain without undermining your radical honesty within it.
Phil, how many times do I have to tell you that every time you try to speak for what my positions are, you get it wrong? Are you incapable of understanding that you do not have a good model of me? Is it some naive realism thing where the little picture in your head just seems the way that Eliezer is? Do I have to request a feature that lets me tag all your posts with a little floating label that says “Phil Goetz thinks he can speak for Eliezer, but he can’t”?
There’s some here that is insightful, and some that I disagree with. But if I want to make promises I’ll make them myself! If I want to stake all my reputation on always telling the truth, I’ll stake it myself! Your help is not solicited in doing so!
And it all collapses if he tells one lie to anybody.
I strive for honesty, hard enough to take social penalties for it; but my deliberative intelligence literally doesn’t control my voice fast enough to prevent it from ever telling a single lie to anybody. Maybe with further training and practice.
I do not have a good model of Eliezer. Very true. I will edit the post to make it not sound like I speak for Eliezer.
But if you want to be a big man, you have to get used to people talking about you. If you open any textbook on Kant, you will find all sorts of attributions saying “Kant meant… Kant believed…” These people did not interview Kant to find out what he believed. It is understood by convention that they are presenting their interpretation of someone else’s beliefs.
If you don’t want others to present their interpretations of your beliefs, you’re in the wrong business.
What if you need to explain to a Nazi general that the bomb he’s having developed could destroy the world? Your goals don’t align, except in the fairly narrow sense that neither of you wants to destroy the world.
That’s an interesting case because, if the Nazi is well-informed about my goals, he will probably be aware that I’d lie to him for things short of the end of the world, and he could easily suspect that I’m falsely informing him of this risk in order to get him not to blow up people I’d prefer to leave intact. If all he knows about my goals is that I don’t want the world to end, whether he heeds my warnings depends on his uninformed guess about the rest of my beliefs, which could fall either way.
That’s why I think that if, say, a scientist were tempted by the Noble Lie “this bomb would actually destroy the whole earth, we cannot work on it any further,” giving in would be a terrible decision. By the same logic that says I hand Omega $100 so that counterfactual me gets $10,000, I should not attempt to lie about such a risk, so that counterfactual me can be believed where the risk actually exists.
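For what it’s worth, the policy-level arithmetic behind the Omega analogy, using the numbers above and assuming a fair coin:

```python
# Counterfactual-mugging arithmetic with the numbers from the comment,
# assuming Omega's coin is fair.
p_heads = 0.5
loss_if_tails = -100     # hand Omega $100 when the coin lands tails
win_if_heads = 10_000    # counterfactual me receives $10,000 on heads

# Evaluated as a policy fixed before the flip, paying wins on average:
ev_pay = p_heads * win_if_heads + (1 - p_heads) * loss_if_tails
ev_refuse = 0.0
print(ev_pay, ev_refuse)  # 4950.0 vs 0.0

# The same policy-level reasoning applies to the Noble Lie: a rule of
# crying world-destruction whenever convenient forfeits the far larger
# value of being believed in the world where the danger is real.
```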