Although I enjoy the practice of Meeting, I actually really disagree with you about Quaker practices around decision-making. My local Meeting had some huge disagreements around COVID that weren’t resolved at all well; from that, and from how disagreements are handled in general, it seems to me more like a Tyranny of Structurelessness[1] situation, where conflict is handled via backchanneling, silently routing around disagreements, and leaning on people who disagree to let it go.
Frankly, I just don’t think consensus is a good decision-making method at all.
Well, I do think it has significant weaknesses, including being vulnerable to the bad patterns you describe, and I’ve seen it go poorly myself. But when it works well, I’ve found it works really well: it gets people to understand each other’s viewpoints, genuinely builds empathy, makes people feel better about the compromises that must be made, and makes them do a better job of searching for novel win-win solutions.
But I agree that it can’t be used ‘as is’. I just think its core elements are worth keeping in mind when designing novel decision-making systems.
For instance, what if each participant had an AI advocate trained on their own point of view, and these advocates went through a simulated consensus process at thousands of times human speed? The result would be a personalized report for each human, summarizing the others’ points of view in a way designed for that particular person to understand, and suggesting a compromise that maximizes win-win outcomes.
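To make the shape of that idea concrete, here’s a toy sketch. Everything in it is hypothetical: the “advocates” are just fixed preference dictionaries standing in for models trained on each person’s views, and the win-win criterion (maximize the worst-off participant’s approval) is one choice among many.

```python
# Toy sketch of the AI-advocate idea. Each participant's hypothetical advocate
# scores candidate compromises on their behalf; the process picks the proposal
# that maximizes the worst-off participant's approval, then produces a
# per-person report on how everyone else rates the outcome.

def best_compromise(advocate_scores, proposals):
    """Pick the proposal maximizing the minimum approval across participants
    (an egalitarian, win-win-seeking criterion)."""
    return max(proposals,
               key=lambda p: min(scores[p] for scores in advocate_scores.values()))

def personalized_report(advocate_scores, person, proposal):
    """Summarize, for one participant, how the others rate the chosen proposal."""
    others = {name: scores[proposal]
              for name, scores in advocate_scores.items() if name != person}
    return f"{person}: proposal '{proposal}', others' approval {others}"

# Made-up participants and approval scores, for illustration only.
advocates = {
    "alice": {"mask mandate": 0.9, "hybrid meetings": 0.7, "no policy": 0.1},
    "bob":   {"mask mandate": 0.2, "hybrid meetings": 0.6, "no policy": 0.8},
    "carol": {"mask mandate": 0.5, "hybrid meetings": 0.8, "no policy": 0.3},
}

choice = best_compromise(advocates, ["mask mandate", "hybrid meetings", "no policy"])
print(choice)  # "hybrid meetings": no one's favorite, but no one is badly off
for person in advocates:
    print(personalized_report(advocates, person, choice))
```

A real version would replace the score dictionaries with per-person models and the report string with a summary written for that reader; the point here is only the shape of the loop: advocate scoring, win-win selection, personalized reporting.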
Recent work in large language modeling (LLMs) has used fine-tuning to align outputs with the preferences of a prototypical user. This work assumes that human preferences are static and homogeneous across individuals, so that aligning to a single “generic” user will confer more general alignment. Here, we embrace the heterogeneity of human preferences to consider a different challenge: how might a machine help people with diverse views find agreement? We fine-tune a 70 billion parameter LLM to generate statements that maximize the expected approval for a group of people with potentially diverse opinions. Human participants provide written opinions on thousands of questions touching on moral and political issues (e.g., “should we raise taxes on the rich?”), and rate the LLM’s generated candidate consensus statements for agreement and quality. A reward model is then trained to predict individual preferences, enabling it to quantify and rank consensus statements in terms of their appeal to the overall group, defined according to different aggregation (social welfare) functions. The model produces consensus statements that are preferred by human users over those from prompted LLMs (>70%) and significantly outperforms a tight fine-tuned baseline that lacks the final ranking step. Further, our best model’s consensus statements are preferred over the best human-generated opinions (>65%). We find that when we silently constructed consensus statements from only a subset of group members, those who were excluded were more likely to dissent, revealing the sensitivity of the consensus to individual contributions. These results highlight the potential to use LLMs to help groups of humans align their values with one another.
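The abstract’s final ranking step can be illustrated with a toy example. This is not the paper’s implementation: the trained reward model is replaced here by a fixed matrix of made-up predicted approvals, and the two welfare functions shown (mean and min) are just standard examples of the “different aggregation (social welfare) functions” the abstract mentions.

```python
# Toy illustration of ranking candidate consensus statements under different
# social welfare functions. Rows are candidates; columns are predicted
# approvals for three group members (invented numbers, for illustration only).

predicted_approval = {
    "statement A": [0.95, 0.95, 0.1],  # two people love it, one strongly dissents
    "statement B": [0.6, 0.7, 0.6],    # broadly acceptable to everyone
}

def rank(candidates, welfare):
    """Sort candidate statements from best to worst under a welfare function."""
    return sorted(candidates, key=lambda s: welfare(candidates[s]), reverse=True)

utilitarian = lambda scores: sum(scores) / len(scores)  # maximize average approval
egalitarian = lambda scores: min(scores)                # maximize the worst-off member

print(rank(predicted_approval, utilitarian))  # ['statement A', 'statement B']
print(rank(predicted_approval, egalitarian))  # ['statement B', 'statement A']
```

The flip between the two rankings is the interesting part: which statement counts as “the consensus” depends on the aggregation function, which connects to the abstract’s finding that excluding a member’s opinion makes that member more likely to dissent.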
My take after skimming the paper: this focuses on figuring out what opinions people do hold, not what opinions people should hold given their values. For instance, asking people things like ‘Should we increase taxes on X with the intent of causing Y?’ gives you information about how people weigh the downsides they perceive in proposed solution X versus the upsides they expect. They could be totally wrong!
This isn’t so much a critique of the paper as a note of caution: the topic it focuses on is a small and insufficient part of what an actual governance decision should depend on. To me, a more interesting question would be figuring out the underlying values, and the relative weightings of different costs and benefits, that lead people to the opinions they hold.
[1] https://www.jofreeman.com/joreen/tyranny.htm
Related: https://arxiv.org/abs/2211.15006
Fine-tuning language models to find agreement among humans with diverse preferences
Michiel A. Bakker, Martin J. Chadwick, Hannah R. Sheahan, Michael Henry Tessler, Lucy Campbell-Gillingham, Jan Balaguer, Nat McAleese, Amelia Glaese, John Aslanides, Matthew M. Botvinick, Christopher Summerfield