(You said you didn’t want more back-and-forth in the comments, but this is just an attempt to answer your taboo request, not to prompt more discussion; no reply is expected.)
We say that clarity wins when contributing to accurate shared models—communicating “clearly”—is a dominant strategy: agents that tell the truth, the whole truth, and nothing but the truth do better (earn more money, leave more descendants, create more paperclips, &c.) than agents that lie, obfuscate, rationalize, play dumb, report dishonestly, filter evidence, &c.
Creating an environment where “clarity wins” (in this sense) looks like a very hard problem, but it’s not hard to see that some things don’t work. Jessica’s example of a judged debate, where points are only awarded for arguments that the opponent acknowledges, is an environment where agents who want to win the debate have an incentive to play dumb—or be dumb—never acknowledging when their opponent made a good argument (even if the opponent in fact made a good argument). In this scenario, being clear (or at least, clear to the “reasonable person”, if not your debate opponent) doesn’t help you win.
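To make the “dominant strategy” framing above concrete, here’s a minimal sketch in Python of the judged-debate environment just described. The specifics (three good arguments per side, payoff taken as score margin) are invented for illustration, not anything from the original discussion: because the judge only awards points for arguments the opponent acknowledges, refusing to acknowledge never costs a debater anything and sometimes helps, so “play dumb” comes out dominant and arguing clearly doesn’t.

```python
# Toy model of the judged debate: the judge awards a debater one point for each
# of their arguments that the *opponent* acknowledges. Payoff here is the score
# margin (my points minus yours), since that is what decides who "wins".
# The numbers (three good arguments per side) are arbitrary assumptions; the
# conclusion doesn't depend on them.

GOOD_ARGUMENTS_PER_SIDE = 3
STRATEGIES = ["acknowledge", "play_dumb"]

def points_awarded_to(debater_args, opponent_strategy):
    """Points a debater earns: only arguments the opponent acknowledges count."""
    return debater_args if opponent_strategy == "acknowledge" else 0

def margin(my_strategy, opponent_strategy):
    """My score minus the opponent's score."""
    my_points = points_awarded_to(GOOD_ARGUMENTS_PER_SIDE, opponent_strategy)
    their_points = points_awarded_to(GOOD_ARGUMENTS_PER_SIDE, my_strategy)
    return my_points - their_points

def is_dominant(candidate):
    """A strategy is dominant if it does at least as well as every alternative
    against every opponent strategy, and strictly better in at least one case."""
    at_least_as_good = all(
        margin(candidate, opp) >= margin(other, opp)
        for other in STRATEGIES
        for opp in STRATEGIES
    )
    strictly_better_somewhere = any(
        margin(candidate, opp) > margin(other, opp)
        for other in STRATEGIES if other != candidate
        for opp in STRATEGIES
    )
    return at_least_as_good and strictly_better_somewhere

for strategy in STRATEGIES:
    print(strategy, "dominant?", is_dominant(strategy))
# acknowledge dominant? False
# play_dumb dominant? True
```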
Appreciate it. That does help.

I think the main thing I want to avoid with the back-and-forth is feeling a sense of urgency to respond (esp. if I’m feeling frustrated about being misunderstood). Gonna try an experiment of “respond to comments here once per day.”

Will probably respond tomorrow.
Curious how that experiment ended. I think this type of rule is healthy in general (e.g. rate-limiting how often one checks and responds), and I’m doing my best to follow a similar one.
It certainly seemed better than rapid-fire commenting.
I don’t know whether it was better than not commenting at all – I spent this thread mostly feeling exasperated that after 20 hours of debate and doublecrux it seemed like the conversation hadn’t really progressed. (Or at least, I was still having to re-explain things that I felt I had covered over and over again)
I do think Zack’s final comment is getting at something fairly important, but it still felt like a significant topic shift to me, and seemed beyond the scope of the current discussion.
Responding in somewhat more depth: this was a helpful crystallization of what you’re going for here.
I’m not 100% sure I agree as stated – “tell the truth, the whole truth, and nothing but the truth” doesn’t have a term in the equation for time-cost.
(i.e. it’s not obvious to me that a good system incentivizes always telling the whole truth, because it’s time-intensive to do that. Figuring out how to communicate a good ratio of “true, useful information per unit of mutual time/effort” feels like it should be part of the puzzle to me. But I generally agree that it’s good to have a system wherein people are incentivized to share useful, honest information with each other, and do not perform better by withholding information with [conscious or otherwise] intent to deceive.)
((but I’m guessing your wording was just convenient shorthand rather than a disagreement with the above))
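For what it’s worth, one toy way to picture the time-cost point is to score a communication policy by useful true information conveyed per unit of everyone’s time, rather than by completeness alone. The policy names and numbers below are invented for illustration; this is a sketch of the ratio being gestured at, not a proposal from the thread.

```python
# Toy comparison of two communication policies. All numbers are invented
# assumptions for illustration: "info" is useful true information conveyed
# (arbitrary units), "hours" is the combined reader+writer time the policy costs.
policies = {
    "whole truth, exhaustively": {"info": 10.0, "hours": 8.0},
    "honest summary, key caveats": {"info": 8.0, "hours": 1.0},
}

def info_per_hour(policy):
    """Useful true information conveyed per unit of mutual time/effort."""
    return policy["info"] / policy["hours"]

for name, policy in policies.items():
    print(f"{name}: {info_per_hour(policy):.2f} useful-info units per hour")
# The exhaustive policy conveys more total information, but the summary wins
# on information per unit of mutual time, which is the ratio the comment
# above suggests a good incentive system should also care about.
```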
...
But on the main topic:
Jessica’s Judge example still feels like a non sequitur that doesn’t have much to do with what I was talking about. Telling the truth/whole-truth/nothing-but still only seems useful insofar as it generates clear understanding in other people. As I said, even in the Judge example, Carol has to understand Alice’s claims.
I don’t know what it’d mean to care about truth-telling, without having that caring be grounded out in other people understanding things. And “hypothetical reasonable person” doesn’t seem that useful a referent to me.
What matters is the actual people in the system you’re trying to communicate with. If they’re reasonable, great, the problem you’re trying to solve is easier. If they’re so motivatedly-unreasonable that they won’t listen at all, the problem may be hard enough that you should go to some other place where more reasonable people live and try there instead. (Or, if you’re Eliezer in 2009, maybe you recurse a bit and write the Sequences for 2 years so that you gain access to more reasonable people.)
(Part of the reason I’m currently very interested in Double Crux is that having it be the default frame seems much more resistant to motivated reasoning. People can fake/obfuscate their way through a doublecrux, but my current experience is that it’s much harder to do so convincingly than during debate.)
but I’m guessing your wording was just convenient shorthand rather than a disagreement with the above
Yes.
As I said, even in the Judge example, Carol has to understand Alice’s claims.
Yes, trivially; Jessica and I both agree with this.
Jessica’s Judge example still feels like a non sequitur that doesn’t have much to do with what I was talking about.
Indeed, it may not have been relevant to the specific thing you were trying to say. Be that as it may, I claim that the Judge example is relevant to one of the broader topics of conversation: specifically, “what norms and/or principles should Less Wrong aspire to.” The Less Wrong karma and curation systems are functionally a kind of Judge, insofar as ideas that get upvoted and curated “win” (get more attention, praise, general acceptance in the rationalist community, &c.).
If Alice’s tendency to lie, obfuscate, rationalize, play dumb, report dishonestly, filter evidence, &c. isn’t an immutable feature of her character, but depends on what the Judge’s behavior incentivizes (at least to some degree), then it really matters what kind of Judge you have.
We want Less Wrong specifically, and the rationalist community more generally, to be a place where clarity wins, guided by the beauty of our weapons. If we don’t have that—if we live in a world where lies and bullshit outcompete truth, not just in the broader Society, but even in the rationalist community—then we’re dead. (Because you can’t solve AI alignment with lies and bullshit.)
As a moderator and high-karma user of lesswrong.com, you, Raymond Arnold, are a Judge. Your strong-upvote is worth 10 karma; you have the power to Curate a post; you have the power to tell Alice to shape up or ship out. You are the incentives. This is a huge and important responsibility, your Honor—one that has the potential to influence 10¹⁴ lives per second. It’s true that truth-telling is only useful insofar as it generates understanding in other people. But that observation, in itself, doesn’t tell you how to exercise your huge and important responsibility.
If Jessica says, “Proponents of short AI timelines are lying, but not necessarily consciously lying; I mostly mean covert deception hidden from conscious attention,” and Alice says, “Huh? I can’t understand you if you’re going to use words in nonstandard ways,” then you have choices to make, and your choices have causal effects.
If you downvote Jessica because you think she’s drawing the category boundaries of “lying” too widely in a way that makes the word less useful, then that has causal effects: fewer people will read Jessica’s post; maybe Jessica will decide to change her rhetorical strategy, or maybe she’ll quit the site in disgust.

If you downvote Alice for pretending to be stupid when Jessica explicitly explained what she meant by the word “lying” in this context, then that has causal effects, too: maybe Alice will try harder to understand what Jessica meant, or maybe Alice will quit the site in disgust.
I can’t tell you how to wield your power, your Honor. (I mean, I can, but no one listens to me, because I don’t have power.) But I want you to notice that you have it.
If they’re so motivatedly-unreasonable that they won’t listen at all, the problem may be hard enough that you should go to some other place where more reasonable people live and try there instead. (Or, if you’re Eliezer in 2009, maybe you recurse a bit and write the Sequences for 2 years so that you gain access to more reasonable people.)
I agree that “retreat” and “exert an extraordinary level of interpretive labor” are two possible strategies for dealing with unreasonable people. (Personally, I’m a huge fan of the “exert arbitrarily large amounts of interpretive labor” strategy, even though Ben has (correctly) observed that it leaves me incredibly vulnerable to certain forms of trolling.)
The question is, are there any other strategies?
The reason “retreat” isn’t sufficient is that sometimes you might be competing with unreasonable people for resources (e.g., money, land, status, control of the “rationalist” and Less Wrong brand names, &c.). Is there some way to make the unreasonable people have to retreat, rather than the reasonable people?
I don’t have an answer to this. But it seems like an important thing to develop vocabulary for thinking about, even if that means playing in hard mode.
P.S. (to sister comment), I’m going to be traveling through the 25th and probably won’t check this website, in case that information helps us break out of this loop of saying “Let’s stop the implicitly-emotionally-charged back-and-forth in the comments here,” and then continuing to do so anyway. (I didn’t get anything done at my dayjob today, which is an indicator of me also suffering from the “Highly tense conversations are super stressful and expensive” problem.)
Put another way: my current sense is that the reasons truth-telling-is-good are basically “increased understanding”, “increased ability to coordinate”, and “increased ability to build things/impact reality” (where the latter two are largely caused by the first).
I’m not confident that list is exhaustive, and if you have other reasons in mind why truth-telling is good that you think I’m missing, I’m interested in hearing about them.
It sounds like you think I’m saying ‘clarity is about increasing understanding, and therefore we should naively optimize for understanding in a Goodhart-y way’, which isn’t what I mean to be saying.
In some sense that list is rather exhaustive because it includes “know anything” and “do anything” as goals that are helped, and that pretty much includes everything. But in that sense, the list is not useful. In the sense that the list is useful, it seems woefully incomplete. And it’s tricky to know what level to respond on. Most centrally, this seems like an example of the utilitarian failure mode of reducing the impact of a policy to the measured, proven direct impact of that policy, as a default (while still getting a result that is close to equal to ‘helps with everything, everywhere, that matters at all’).
“Increased ability to think” would be one potential fourth category, if one were looking to essentially accept the error of ‘only point to the measurable/observable directly caused effects.’ If truth is not being told because it’s not in one’s interest to do so, there is a strong incentive to destroy one’s own ability to think.
Part of me is screaming “do we really need a post explaining why it is good when people say that which is, when they believe that would be relevant or useful, and bad when they fail to do so, or say that which is not?”