Edited to not sound like I know what Eliezer is thinking:
In the Nazi example, there are only three likely options: Nazi, anti-Nazi, or self-interested. If non-Nazi C sees person A lie to Nazi B, C can assume with a high degree of certainty that person A is on the anti-Nazi side. Being caught lying this way increases A’s trustworthiness to C.
Radical honesty is a policy for when one is in a more complicated situation, in which there are many different sides, and there’s no way to figure out what side someone is on by process of elimination.
In Eliezer’s situation in particular, which probably motivates his radical honesty policy, some simple inferences from Eliezer’s observed opinions on his own intelligence versus the intelligence of everyone else in the world would lead one to assign a high prior probability to his misleading people about his intentions. Additionally, he wants to get money from people who will ask him what he is doing and yet are incapable of understanding the answer; so it hardly seems possible for him to answer “honestly”, or even to define what that would mean. Most questions asked about the goals of the SIAI are probably some variation of “Have you stopped beating your wife?”
Radical honesty is one way of dealing with this situation. It is a rational variation on revenge strategies. People sometimes try to signal that they are hot-tempered, irrational people who would take horrible revenge on those who harm them, even when doing so would be irrational. Radical honesty 0 is, likewise, the attempt, say by religious people, to convince you that they will be honest with you even when it’s irrational for them to do so. Radical rational honesty is a game-theoretic argument that doesn’t require the radically honest person (RHP) to commit to irrationality. It tries to convince you that radical honesty is rational (or at least that the RHP believes it is); therefore the RHP can be trusted to be honest at all times.
And it all collapses if the RHP tells one lie to anybody. The game-theoretic argument needed to justify the lie would become so complicated that no one would take the time to understand it, and so it would be useless.
(Of course nobody can be honest all the time in practice; of course observers will make some allowance for “honest dishonesty” according to the circumstances.)
The hell of it is that, after you make this game-theoretic argument, somebody comes along and asks you if you would lie to Nazis to save Anne Frank. If you say yes, then they can’t trust you to be radically honest. And if you say no, they decide they wouldn’t trust you because there’s something wrong with you.
Because radical honesty is a game-theoretic argument, you could delimit a domain in which you will be radically honest, and reserve the right to lie outside the domain without harming your radical honesty.
Phil, how many times do I have to tell you that every time you try to speak for what my positions are, you get it wrong? Are you incapable of understanding that you do not have a good model of me? Is it some naive realism thing where the little picture in your head just seems the way that Eliezer is? Do I have to request a feature that lets me tag all your posts with a little floating label that says “Phil Goetz thinks he can speak for Eliezer, but he can’t”?
There’s some here that is insightful, and some that I disagree with. But if I want to make promises I’ll make them myself! If I want to stake all my reputation on always telling the truth, I’ll stake it myself! Your help is not solicited in doing so!
And it all collapses if he tells one lie to anybody.
I strive for honesty, hard enough to take social penalties for it; but my deliberative intelligence literally doesn’t control my voice fast enough to prevent it from ever telling a single lie to anybody. Maybe with further training and practice.
I do not have a good model of Eliezer. Very true. I will edit the post to make it not sound like I speak for Eliezer.
But if you want to be a big man, you have to get used to people talking about you. If you open any textbook on Kant, you will find all sorts of attributions saying “Kant meant...”, “Kant believed...”. These people did not interview Kant to find out what he believed. It is understood by convention that they are presenting their interpretation of someone else’s beliefs.
If you don’t want others to present their interpretations of your beliefs, you’re in the wrong business.