I don’t think I find that objectionable, it didn’t seem particularly interesting as a claim. It’s as old as “you can only serve one master,” god vs mammon, etc etc—you can’t do well at accountability to mutually incompatible standards. I think it depends a lot on the type and scope of accountability, though.
If the takeaway were what mattered about the post, why include all the other stuff?
I think habryka was trying to get across a more cohesive worldview rather than just a few points. I also don’t know that my interpretation is the same as his. But here are some points that I took from this. (Hmm. These may not have been even slightly obvious in the current post, but were part of the background conversation that prompted it, and would probably have eventually been brought up in a future post. And I think the OP at least hints at them)
On Accountability
First, I think there are people in the LessWrong readership who still have some naive conception of “be accountable to the public”, which is in fact a recipe for
It’s as old as “you can only serve one master,”
This is pretty different from how I’d describe this.
In some sense, you only get to serve one master. But value is complex and fragile. So there may be many facets of integrity or morality that you find important to pay attention to, and you might be missing some of them. Your master may contain multitudes.
Any given facet of morality (or more generally, of the things I care about) is complicated. One way I might hold myself to a higher standard than I currently meet is to have several people whom I hold myself accountable to, each of whom pays deep attention to a different facet.
If I’m running a company, I might want to be accountable to:
people who deeply understand the industry I’m working in
people who deeply understand human needs, coercion, etc, who can tell me if I’m mistreating my workers,
people who understand how my industry interacts with other industries, the general populace, or the environment, who can call me out if I’m letting negative externalities run wild.
[edit] maybe just a regular person who sanity-checks whether I seem crazy
I might want multiple people for each facet, who look at that facet through a different lens.
By having these facets represented in concrete individuals, I also improve my ability to resolve confusions about how to trade off multiple sacred values. Each individual might deeply understand their domain and see it as most important. But if they disagree, they can doublecrux with each other, or with me, and I can try to integrate their views into something coherent and actionable.
There’s also the important question of how to operationalize “accountable.” There are different powers you could give these people, possibly including:
Emergency Doublecrux button – they can demand N hours of your time per year, at least forcing you to have a conversation to justify yourself
Vote of No Confidence – i.e. if your project still seems good but you seem corrupt, they can fire you and replace you
Shut down your project, if it seems net negative
There might be some people you trust with some powers but not others (e.g. you might think someone has good perspectives that justify the Emergency Doublecrux button but not the “fire you” button)
There’s a somewhat different conception you could have of all this that’s more coalition-focused than personal-development-focused.
I feel like all of this mixes together info sources and incentives, so it feels a bit wrong to say I agree, but also feels a bit wrong to say I disagree.
I agree that there’s a better, crisper version of this that has those more distinct.
I’m not sure whether the end product, for most people, should keep them distinct, because by default humans seem to use blurry clusters of concepts to simplify things into something manageable.
But I think if you’re aiming to be a robust agent, or to build a robustly agentic organization, there is something valuable about keeping these crisply separate so you can reason about them well. (You’ve previously mentioned that this is analogous to the friendly AI problem, and I agree.) I think it’s a good project for many people in the rationalsphere to have undertaken to deepen our understanding, even if it turns out not to be practical for the average person.
The “different masters” thing is a special case of the problem of accepting feedback (i.e. learning from approval/disapproval or reward/punishment) from approval functions in conflict with each other or with your goals. Multiple humans trying to do the same or compatible things with you aren’t “different masters” in this sense, since the same logical-decision-theoretic perspective (with some noise) is instantiated in each of them.
But also, there’s all sorts of gathering data from others’ judgment that doesn’t fit the accountability/commitment paradigm.