Thoughts on integrity and accountability
[Epistemic Status: Early draft version of a post I hope to publish eventually. Strongly interested in feedback and critiques, since I feel quite fuzzy about a lot of this]
When I started studying rationality and philosophy, I had the perspective that people who were in positions of power and influence should primarily focus on how to make good decisions in general and that we should generally give power to people who have demonstrated a good track record of general rationality. I also thought of power as this mostly unconstrained resource, similar to having money in your bank account, and that we should make sure to primarily allocate power to the people who are good at thinking and making decisions.
That picture has changed a lot over the years. While I think there is still a lot of value in the idea of “philosopher kings”, I’ve made a variety of updates that significantly changed my relationship to allocating power in this way:
I have come to believe that people’s ability to come to correct opinions about important questions is in large part a result of whether their social and monetary incentives reward them when they have accurate models in a specific domain. This means a person can have extremely good opinions in one domain of reality, because they are subject to good incentives, while having highly inaccurate models in a large variety of other domains in which their incentives are not well optimized.
People’s rationality is much more defined by their ability to maneuver themselves into environments in which their external incentives align with their goals, than by their ability to have correct opinions while being subject to incentives they don’t endorse. This is a tractable intervention and so the best people will be able to have vastly more accurate beliefs than the average person, but it means that “having accurate beliefs in one domain” doesn’t straightforwardly generalize to “will have accurate beliefs in other domains”.
One is strongly predictive of the other, and that’s in part due to general thinking skills and broad cognitive ability. But another major piece of the puzzle is the person’s ability to build and seek out environments with good incentive structures.
Everyone is highly irrational in their beliefs about at least some aspects of reality, and positions of power in particular tend to come with strong incentives that are not well aligned with the truth. This means that highly competent people in positions of power often have less accurate beliefs than much less competent people who are not in positions of power.
Designing systems that hold people with power and influence accountable, in a way that aligns their interests both with forming accurate beliefs and with the interests of humanity at large, is a really important problem and a major determinant of the overall quality of a community’s decision-making. General rationality training helps, but for collective decision making the creation of accountability systems, the tracking of outcome metrics and the design of incentives are together at least as big a factor as the degree to which the individual members of the community are able to come to accurate beliefs on their own.
A lot of these updates have also shaped my thinking while working at CEA, LessWrong and the LTF-Fund over the past 4 years. I’ve been in various positions of power, and have interacted with many people who had lots of power over the EA and Rationality communities, and I’ve become a lot more convinced that there is a lot of low-hanging fruit and important experimentation to be done to ensure better levels of accountability and incentive-design for the institutions that guide our community.
I also generally have broadly libertarian intuitions, and a lot of my ideas about how to build functional organizations are based on the more startup-like approach that is favored here in Silicon Valley. Initially these intuitions seemed in conflict with the intuitions favoring more emphasis on accountability structures, with broken legal systems, ad-hoc legislation, dysfunctional boards and dysfunctional institutions all coming to mind immediately as accountability systems run wild. I’ve since reconciled my thoughts on these topics a good bit.
Integrity
Somewhat surprisingly, “integrity” has not been much discussed as a concept handle on LessWrong. But I’ve found it to be a pretty valuable virtue to meditate and reflect on.
I think of integrity as a more advanced form of honesty – when I say “integrity” I mean “acting in accordance with your stated beliefs.” Where honesty is the commitment to not speak direct falsehoods, integrity is the commitment to speak truths that actually ring true to yourself, not ones that are just abstractly defensible to other people. It is also a commitment to act on the truths that you do believe, and to communicate to others what your true beliefs are.
Integrity can be a double-edged sword. While it is good to judge people by the standards they expressed, it is also a surefire way to make people overly hesitant to update. If you get punished every time you change your mind because your new actions are now incongruent with the principles you explained to others before you changed your mind, then you are likely to stick with your principles for far longer than you would otherwise, even when evidence against your position is mounting.
The great benefit that I experienced from thinking of integrity as a virtue is that it encourages me to build accurate models of my own mind and motivations. I can only act in line with ethical principles that are actually related to the real motivators of my actions. If I pretend to hold ethical principles that do not correspond to my motivators, then sooner or later my actions will diverge from my principles. I’ve come to think of a key part of integrity as being the art of making accurate predictions about my own actions and communicating those as clearly as possible.
There are two natural ways to ensure that your stated principles are in line with your actions. You either adjust your stated principles until they match up with your actions, or you adjust your behavior to be in line with your stated principles. Both of those can backfire, and both of those can have significant positive effects.
Who Should You Be Accountable To?
In the context of incentive design, I find thinking about integrity valuable because it feels to me like the natural complement to accountability. The purpose of accountability is to ensure that you do what you say you are going to do, and integrity is the corresponding virtue of holding up well under high levels of accountability.
Treating accountability as a variable also highlights one of the biggest error modes of accountability and integrity – choosing too broad an audience to hold yourself accountable to.
There is a tradeoff between the size of the group that you are being held accountable by, and the complexity of the ethical principles you can act under. Too large an audience, and you will be held accountable by the lowest common denominator of your values, which will rarely align well with what you actually think is moral (if you’ve done any kind of real reflection on moral principles).
Too small or too memetically close an audience, and you risk not having enough people paying attention to what you do to actually help you notice inconsistencies between your stated beliefs and your actions. The smaller the group that is holding you accountable, the smaller your inner circle of trust, which reduces the total amount of resources that can be coordinated under your shared principles.
I think a major mistake that even many well-intentioned organizations make is to try to be held accountable by some vague conception of “the public”. As they make public statements, someone in the public will misunderstand them, causing a spiral of less communication, resulting in more misunderstandings, resulting in even less communication, culminating in an organization that is completely opaque about any of its actions and intentions, with the only communication being filtered by a PR department that has little interest in the observers acquiring any beliefs that resemble reality.
I think a generally better setup is to choose a much smaller group of people that you trust to evaluate your actions very closely, and ideally do so in a way that is itself transparent to a broader audience. Common versions of this are auditors, as well as nonprofit boards that try to ensure the integrity of an organization.
This is all part of a broader reflection on trying to create good incentives for myself and the LessWrong team. I will probably follow this up with a post that more concretely summarizes my thoughts on how all of this applies to LessWrong.
In summary:
One lens to view integrity through is as an advanced form of honesty – “acting in accordance with your stated beliefs.”
To improve integrity, you can try to bring your actions in line with your stated beliefs, bring your stated beliefs in line with your actions, or rework both at the same time. These options all have failure modes, but also potential benefits.
People with power sometimes have incentives that systematically warp their ability to form accurate beliefs, and (correspondingly) to act with integrity.
An important tool for maintaining integrity (in general, and in particular as you gain power) is to carefully think about what social environment and incentive structures you want for yourself.
Choose carefully who, and how many people, you are accountable to:
Too many people, and you are limited in the complexity of the beliefs and actions that you can justify.
Too few people, too similar to you, and you won’t have enough opportunities for people to notice and point out what you’re doing wrong. You may also not end up with a strong enough coalition aligned with your principles to accomplish your goals.
Just wanted to say I like this a lot and think it’d be fine as a full fledged post. :)
More than fine. Please do post a version on its own. A lot of strong insights here, and where I disagree there’s good stuff to chew on. I’d be tempted to respond with a post.
I do think this has a different view of integrity than I have, but in writing it out, I notice that the word is overloaded and that I don’t have as good a grasp of its details as I’d like. I’m hesitant to throw out a rival definition until I have a better grasp here, but I think the thing you’re in accordance with is not beliefs so much as principles?
Seconded.
Thirded.
fourthed. oli, do you intend to post this?
if not, could i post this text as a linkpost to this shortform?
It’s long been posted!
Integrity and accountability are core parts of rationality
ah, lovely! maybe add that link as an edit to the top-level shortform comment?
This was a great post that might have changed my worldview some.
Some highlights:
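1.
People’s rationality is much more defined by their ability to maneuver themselves into environments in which their external incentives align with their goals, than by their ability to have correct opinions while being subject to incentives they don’t endorse. This is a tractable intervention and so the best people will be able to have vastly more accurate beliefs than the average person, but it means that “having accurate beliefs in one domain” doesn’t straightforwardly generalize to “will have accurate beliefs in other domains”.
I’ve heard people say things like this in the past, but haven’t really taken it seriously as an important component of my rationality practice. Somehow what you say here is compelling to me (maybe because I recently noticed a major place where my thinking was majorly constrained by my social ties and social standing) and it prodded me to think about how to build “mech suits” that not only increase my power but also incentivize my rationality. I now have a todo item to “think about principles for incentivizing true beliefs, in team design.”
2.
I think a generally better setup is to choose a much smaller group of people that you trust to evaluate your actions very closely,
Similarly, thinking explicitly about which groups I want to be accountable to sounds like a really good idea.
I had been going through the world keeping this Paul Graham quote in mind...
I think the best test is one Gino Lee taught me: to try to do things that would make your friends say wow. But it probably wouldn’t start to work properly till about age 22, because most people haven’t had a big enough sample to pick friends from before then.
...choosing good friends, and doing things that would impress them.
But what you’re pointing at here seems like a slightly different thing: which people do I want to make myself transparent to, so that they can judge if I’m living up to my values?
This also gave me an idea for a CFAR-style program: a “reassess your life” workshop, in which a small number of people come together for a period of 3 days or so, and reevaluate cached decisions. We start by making lines of retreat (with mentor assistance), and then look at high-impact questions in our lives: given new info, does your current job / community / relationship / life-style choice / other still make sense?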
Thanks for writing.
I think you might be conflating two things under “integrity”. Having more confidence in your own beliefs than in the shared/imposed beliefs of your community isn’t really a virtue or... it’s more just a condition that a person can be in; whether it’s virtuous is completely contextual. Sometimes it is, sometimes it isn’t. I can think of lots of people who should have more confidence in other people’s beliefs than they have in their own. In many domains, that’s me. I should listen more. I should act less boldly. An opposite of that sense of integrity is the virtue of respect – recognising other people’s qualities – it’s a skill. If you don’t have it, you can’t make use of other people’s expertise very well. A superfluence of respect is a person who is easily moved by others’ feedback, usually, a person who is patient with their surroundings.
On the other hand I can completely understand the value of {having a known track record of staying true to self-expression, claims made about the self}. Humility is actually a part of that. The usefulness of delineating that into a virtue separate from the more general Honesty is clear to me.
There’s a lot of focus on personally updating based on evidence. Groups aren’t addressed as much. What does it mean for a group to have a belief? To have honesty or integrity?
See Sinclair: “It is difficult to get a man to understand something, when his salary depends upon his not understanding it!”