A Sketch of Good Communication
“Often I compare my own Fermi estimates with those of other people, and that’s sort of cool, but what’s way more interesting is when they share what variables and models they used to get to the estimate.”
– Oliver Habryka, at a model building workshop at FHI in 2016
One question that people in the AI x-risk community often ask is
“By what year do you assign a 50% probability of human-level AGI?”
We go back and forth with statements like “Well, I think you’re not updating enough on AlphaGo Zero.” “But did you know that person X has 50% in 30 years? You should weigh that heavily in your calculations.”
However, ‘timelines’ is not the interesting question. The interesting parts are in the causal models behind the estimates. Some possibilities:
Do you have a story about how the brain in fact implements back-propagation, and thus whether current ML techniques have all the key insights?
Do you have a story about the reference class of human brains and monkey brains and evolution, that gives a forecast for how hard intelligence is and as such whether it’s achievable this century?
Do you have a story about the amount of resources flowing into the problem, that uses factors like ‘Number of PhDs in ML handed out each year’ and ‘Amount of GPU available to the average PhD’?
‘Timelines’ is an area where many people discuss one variable all the time, when in fact the interesting disagreement is much deeper. Regardless of whether our 50% dates are close, when you and I have different models we will often recommend contradictory strategies for reducing x-risk.
For example, Eliezer Yudkowsky, Robin Hanson, and Nick Bostrom all have different timelines, but their models tell such different stories about what’s happening in the world that focusing on timelines instead of the broad differences in their overall pictures is a red herring.
(If in fact two very different models converge in many places, this is indeed evidence of them both capturing the same thing—and the more different the two models are, the more likely it is that the shared factor is ‘truth’. But if two models significantly disagree on strategy and outcome yet hit the same 50% confidence date, we should not count this as agreement.)
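To make this concrete, here is a toy sketch (my own illustration, with entirely made-up numbers and model shapes, not drawn from anyone’s actual forecast): two causal models that share a 50% date for human-level AGI yet disagree sharply everywhere else.

```python
import math

def resources_model(year):
    """Steady-progress story: a constant annual 'chance of AGI' from 2024 on,
    calibrated (arbitrarily) so the 50% date lands at 2054."""
    hazard = math.log(2) / 30                     # hypothetical rate
    return 1 - math.exp(-hazard * max(0, year - 2024))

def key_insight_model(year):
    """Breakthrough story: low probability until a key insight arrives,
    modelled (arbitrarily) as a sharp logistic also centred on 2054."""
    return 1 / (1 + math.exp(-(year - 2054) / 3.0))

for year in (2035, 2054, 2075):
    print(year, round(resources_model(year), 2), round(key_insight_model(year), 2))
# 2035: 0.22 vs 0.0 | 2054: 0.5 vs 0.5 | 2075: 0.69 vs 1.0
```

Both toy models put 50% on 2054, but the first says progress is steady and resource-driven while the second says almost nothing happens until a conceptual breakthrough, so they recommend watching, and intervening on, very different things.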
Let me sketch a general model of communication.
A Sketch
Step 1: You each have a different model that predicts a different probability for a certain event.
Step 2: You talk until you have understood how they see the world.
Step 3: You do some cognitive work to integrate your evidence and ontologies with theirs, and this implies a new probability.
“If we were simply increasing the absolute number of average researchers in the field, then I’d still expect AGI much slower than you, but if we now factor in the very peak researchers having big jumps of insight (for the rest of the field to capitalise on), then I think I actually have shorter timelines than you.”
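As a minimal numerical sketch of Step 3 (my own illustration, assuming the simplest possible case of a shared prior and independent pieces of evidence, and ignoring the harder work of merging ontologies), pooling each other’s evidence can land you somewhere more extreme than either of you started, not at an average of your two numbers.

```python
def pool_independent(prior, posterior_a, posterior_b):
    """Combine two posteriors that each update a shared prior with
    independent evidence, by multiplying likelihood ratios in odds space."""
    def odds(p):
        return p / (1 - p)
    lr_a = odds(posterior_a) / odds(prior)   # strength of A's private evidence
    lr_b = odds(posterior_b) / odds(prior)   # strength of B's private evidence
    combined = odds(prior) * lr_a * lr_b
    return combined / (1 + combined)

# Shared prior of 0.5: A's evidence moved them to 0.7, B's moved them to 0.8.
print(round(pool_independent(0.5, 0.7, 0.8), 2))   # 0.9, more extreme than either
```

The particular formula matters less than the qualitative point: integrating models is a different operation from averaging bottom lines, and it only becomes possible after Step 2.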
One of the common issues I see with disagreements in general is people jumping prematurely to Step 3 before spending time on Step 2. It’s as though, if you both agree on the bottom-line node, you must surely agree on all the other nodes in your models.
I prefer to spend an hour or two sharing models before trying to change either of our minds; skipping that step creates false consensus rather than successful communication. Going directly to Step 3 can be the right call when you’re on a logistics team and need to make a decision quickly, but it is quite inappropriate for research, and in my experience the most important communication challenges are around deep intuitions.
Don’t practice coming to agreement; practice exchanging models.
Something other than Good Reasoning
Here’s an alternative thing you might do after Step 1. This is where you haven’t changed your model, but decide to agree with the other person anyway.
This doesn’t make any sense, but people try it anyway, especially when they’re talking to high-status people and/or experts. “Oh, okay, I’ll try hard to believe what the expert said, so I look like I know what I’m talking about.”
The worst version of this is throwing out your own model entirely, because it means you can’t notice your confusion any more. It represents “Ah, I notice that p = 0.6 is inconsistent with my model, therefore I will throw out my model.” Equivalently, “Oh, I don’t understand something, so I’ll stop trying.”
This is the first post in a series of short thoughts on epistemic rationality, integrity, and curiosity. My thanks to Jacob Lagerros and Alex Zhu for comments on drafts.
Descriptions of your experiences of successful communication about subtle intuitions (in any domain) are welcomed.
I very often notice myself feeling psychological pressure to agree with whoever I’m currently talking to; it feels nicer, more “cooperative.” But that’s wrong—pretending to agree when you actually don’t is really just lying about what you believe! Lying is bad!
In particular, if you make complicated plans with someone without clarifying the differences between your models, and then you go off and do your part of the plan using your private model (which you never shared) and take actions that your partner didn’t expect and are harmed by, then they might feel pretty betrayed—as if you were an enemy who was only pretending to be their collaborator all along. Which is kind of what happened. You never got on the same page. As Sarah Constantin explains in another 2018-Review-nominated post, the process of getting on the same page is not a punishment!
This model of communication describes what LW would ideally be all about. I have mentally referenced this post several times.
I generally got value out of this post’s crisp explanation. In particular, the notion that “model-driven information exchange” can result in people forming new beliefs that are more extreme than either person started with, rather than necessarily “averaging their beliefs together”, was something I hadn’t really considered before.