I think one issue is that I don’t have a word or phrase that quite communicates the thing I was trying to point to. When I said “sit upright in alarm”, the actions I meant to be coupled with that look more like this:
The behavior I’d have wanted my parents to exhibit would probably have started with working out—with friends and community members—and with me and my sister—and first, with each other—a shared model and language for talking about the problem, before we started to do anything about it.
As opposed to either ignoring the problem, or blaming something haphazardly, or imposing screen limits without reflection, or whatever.
I’m not sure of a phrase that communicates the right motion. I agree that alarm fatigue is a thing (and basically said so right after posting the OP). “Sitting up, taking notice, and directing your attention strategically” sort of does it, but in an overwrought way. If you have a suggestion for a short handle for “the sort of initial mental motion you wish your parents had done, as well as the sort of initial mental motion you wish people would take in the situations the OP describes”, I’d be interested to hear it.
The thing prompting the OP was that I’d noticed people (in a few settings) using the word “lying” in a way that a) seemed false [by the definition of lying that seems most common to me, i.e. including both ‘deliberateness’ and usually at least a small bit of ‘blameworthiness’], and b) seemed like specifically a mistake relating to “wishing they had a word that directed people’s attention better”. And it seemed unfair to ask them to stop without giving them a better tool to direct people’s attention.
It seemed to me like you were emphasizing (a), in a way that pushed to the background the difference between wishing we had a proper way to demand attention for deceptive speech that’s not literally lying, and wishing we had a way to demand attention for the right response. As I tried to indicate in the parent comment, it felt more like a disagreement in tone than in explicit content.
I think this is the same implied disagreement expressed around your comment on my Sabbath post. It seems like you’re thinking of each alarm as “extra,” implying a need for a temporary boost in activity, while I’m modeling this particular class of alarm as suggesting that much and maybe most of one’s work has effects in the wrong direction, so one should pause, and ignore a lot of object-level bids for attention until one’s worked this out.
Okay. I think I have a somewhat better handle on some of the nuances here and how various pieces of your worldview fit together. I think I’d previously been tracking and responding to a few distinct disagreements I had, and it’d make sense if those disagreements didn’t land because I wasn’t tracking the entirety of the framework at once.
Let me know how this sounds as an ITT:
Thinking and building a life for yourself
Much of civilization (and the rationalsphere as a subset of it and/or memeplex that’s influenced and constrained by it) is generally pointed in the wrong direction. This has many facets, many of which reinforce each other. Society tends to:
Schools systematically teach people to associate reason with listening-to/pleasing-teachers, or moving-words-around unconnected from reality. [Order of the Soul]
Society systematically pushes people to live apart from each other, and to work until they need (or believe they need) palliatives, in a way that doesn’t give them space to think [Sabbath Hard and Go Home]
Relatedly, society provides structures that incentivize you to advance in arbitrary hierarchies, or to tread water and barely stay afloat, without reflecting on what you actually want.
By contrast, for much of history, there was a much more direct connection between what you did, how you thought, and how your own life was bettered. If you wanted a nicer home, you built a nicer home. This came with many overlapping incentive structures that reinforced something closer to living healthily and generating real value.
(I’m guessing a significant confusion was me seeing this whole section as only moderately connected rather than central to the other sections)
We desperately need clarity
There’s a collection of pressures, in many-but-not-all situations, to keep both facts and decision-making principles obfuscated, and to warp language in a way that enables that. This is often part of an overall strategy (sometimes conscious, sometimes unconscious) to maneuver groups for personal gain.
It’s important to be able to speak plainly about forces that obfuscate. It’s important to lean fully into clarity and plainspeak, not just taking marginal steps towards it, both because clear language is intrinsically very powerful, and because there’s a sharp dropoff as soon as ambiguity leaks in (moving the conversation to higher simulacrum levels, at which point it’s very hard to recover clarity).
[Least confident] The best focus is on your own development, rather than optimizing systems or other people
Here I become a lot less confident. This is my attempt to summarize whatever’s going on in our disagreement about my “When coordinating at scale, communicating has to reduce gracefully to about 5 words” thing. I had an impression that this seemed deeply wrong, confusing, or threatening to you. I still don’t really understand why. But my best guesses include:
This is putting the locus of control in the group, at a moment-in-history where the most important thing is reasserting individual agency and thinking for yourself (because many groups are doing the wrong-things listed above)
Insofar as group coordination is a lens to be looked through, it’s important that groups are working in a way that respects everyone’s agency and ability to think (to avoid falling into some of the failure modes associated with the first bullet point), and simplifying your message so that others can hear/act on it is part of an overall strategy that is causing harm
Possibly a simpler “people can and should read a lot and engage with more nuanced models, and most of the reason you might think that they can’t is because school and hierarchical companies warped your thinking about that?”
And then, in light of all that, something is off with my mood when I’m engaging with individual pieces of that, because I’m not properly oriented around the other pieces?
Does that sound right? Are there important things left out or gotten wrong?
This sounds really, really close. Thanks for putting in the work to produce this summary!
I think my objection to the 5 Words post fits a pattern where I’ve had difficulty expressing a class of objection. The literal content of the post wasn’t the main problem. The main problem was the emphasis of the post, in conjunction with your other beliefs and behavior.
It seemed like the hidden second half of the core claim was “and therefore we should coordinate around simpler slogans,” and not the obvious alternative conclusion “and therefore we should scale up more carefully, with an uncompromising emphasis on some aspects of quality control.” (See On the Construction of Beacons for the relevant argument.)
It seemed to me like there was some motivated ambiguity on this point. The emphasis seemed to consistently recommend public behavior that was about mobilization rather than discourse, and back-channel discussions among well-connected people (including me) that felt like they were more about establishing compatibility than making intellectual progress. This, even though it seems like you explicitly agree with me that our current social coordination mechanisms are massively inadequate, in a way that (to me obviously) implies that they can’t possibly solve FAI.
I felt like if I pointed this kind of thing out too explicitly, I’d just get scolded for being uncharitable. I didn’t expect, however, that this scolding would be accompanied by an explanation of what specific, anticipation-constraining, alternative belief you held. I’ve been getting better at pointing out this pattern (e.g. my recent response to habryka) instead of just shutting down due to a preverbal recognition of it. It’s very hard to write a comment like this one clearly and without extraneous material, especially of a point-scoring or whining nature. (If it were easy I’d see more people writing things like this.)
It seemed like the hidden second half of the core claim was “and therefore we should coordinate around simpler slogans,” and not the obvious alternative conclusion “and therefore we should scale up more carefully, with an uncompromising emphasis on some aspects of quality control.” (See On the Construction of Beacons for the relevant argument.)
“Scale up more carefully” is a reasonable summary of what I intended to convey, although I meant it more like “here are specific ways you might fuck up if you aren’t careful.” At varying levels of scale, what is actually possible, and why?
FWIW, the motivating example for You Have About Five Words was recent (at the time) EA backlash about the phrase “EA is Talent Constrained”, which many people interpreted to mean “if I’m, like, reasonably talented, EA organizations will need me and hire me”, as opposed to “The EA ecosystem is looking for particular rare talents and skills, and this is more important than funding at the moment.”
The original 80k article was relatively nuanced about this (although re-reading it now, I’m not sure it really spells out the particular distinction that’d become a source of frustration). They’ve since written an apology/clarification, but it seemed like there was a more general lesson that needed learning, both among EA communicators (and, separately, rationalist communicators) and among people who were trying to keep up with the latest advice/news/thoughts.
The takeaways I meant to be building towards (but, I do recognize now that I didn’t explicitly say this at all and probably should have), were:
If you’re a communicator, make sure the concept you’re communicating degrades gracefully as it loses nuance (and this is important enough that it should be among the things we hold thought leaders accountable to). Include the nuance, for sure. But some concepts predictably become net-harmful when reduced to their post title, or single most salient line.
Water flows downhill, and ideas flow towards simplicity. You can’t fight this, but you can design the contours of the hill around your idea such that it flows towards a simplicity that is useful.
If you’re a person consuming content, pay extra attention to the fact that you, and the people around you, are probably missing nuance by default. This is causing some kinds of double-illusion-of-transparency. Even if communicators are paying attention to the previous point, it’s still a very hard job. Take some responsibility for making sure you understand concepts before propagating them, and if you’re getting angry at a communicator, doublecheck what they actually said first.
this is important enough that it should be among the things we hold thought leaders accountable to
I would say that this depends on what kind of communicator or thought leader we’re talking about. That is, there may be a need for multiple, differently-specialized “communicator” roles.
To the extent that you’re trying to build a mass movement, then I agree completely and without reservations: you’re accountable for the monster spawned by the five-word summary of your manifesto, because pandering to idiots who can’t retain more than five words of nuance is part of the job description of being a movement leader. (If you don’t like the phrase “pandering to idiots”, feel free to charitably pretend I said something else instead; I’m afraid I only have so much time to edit this comment.)
To the extent that you’re actually trying to do serious intellectual work, then no, absolutely not. The job description of an intellectual is, first, to get the theory right, and second, to explain the theory clearly to whosoever has the time and inclination to learn. Those two things are already really hard! To add to these the additional demand that the thinker make sure that her concepts won’t be predictably misunderstood as something allegedly net-harmful by people who don’t have the time and inclination to learn, is just too much of a burden; it can’t be part of the job description of someone whose first duty (on which everything else depends) is to get the theory right.
The tragedy of the so-called “effective altruism” and “rationalist” communities, is that we’re trying to be both mass movements, and intellectually serious, and we didn’t realize until too late in September the extent to which this presents incompatible social-engineering requirements. I’m glad we have people like you thinking about the problem now, though!
(If you don’t like the phrase “pandering to idiots”, feel free to charitably pretend I said something else instead; I’m afraid I only have so much time to edit this comment.)
You know, it’s kind of dishonest of you to appeal to your comment-editing time budget when you really just wanted to express visceral contempt for the idea that intellectuals should be held accountable for alleged harm from simplifications of what they actually said. Like, it didn’t actually take me very much time to generate the phrase “accountability for alleged harm from simplifications” rather than “pandering to idiots”, so comment-editing time can’t have been your real reason for choosing the latter.
More generally: when the intensity of norm enforcement depends on the perceived cost of complying with the norm, people who disagree with the norm (but don’t want to risk defying it openly) face an incentive to exaggerate the costs of compliance. It takes more courage to say, “I meant exactly what I said” when you can plausibly-deniably get away with, “Oh, I’m sorry, that’s just my natural writing style, which would be very expensive for me to change.” But it’s not the expenses—it’s you!
Except you probably won’t understand what I’m trying to say for another three days and nine hours.
I agree that this applies more to mass movements than smaller intellectual groups.
Recall that my claim is “if you’re trying to coordinate with 1/10/100/1000+ people, these are the constraints or causes/effects on how you can communicate (which are different for each scale)”.
It also naively suggests different constraints on EA (which seems a bit more like a mass movement) than on LessWrong (which sort of flirted with being a mass movement, but then didn’t really follow up on it. It seems to me that the number of ‘serious contributors’ is more like “around 100-200” than “1000+”). And meanwhile, not everyone on LW is actually trying to coordinate with anyone, which is fine.
...
There are some weirder questions that come into play when you’re building a theory about coordination, in public in a space that does coordination. For now, set those aside and focus just on things like, say, developing theories of physics.
If you’re not trying to coordinate with anyone, you can think purely about theory with no cost.
If you’re an intellectual trying to coordinate only with intellectuals who want to follow your work (say, in the ballpark of 10 people), you can expect to have N words worth of shared nuance. (My previous best guess for N is 200,000 words worth, but I don’t strongly stand by that guess)
It is an actual interesting question, for purely intellectual pursuits, whether you get more value out of having a single collaborator that you spend hours each day talking to, vs a larger number of collaborators. You might want to focus on getting your own theory right without regard for other people’s ability to follow you (and if so, you might keep it all to yourself for the time being, or you might post braindumps into a public forum without optimizing it for readability, and let others skim it and see if it’s worth pursuing to them, and then only communicate further with those people if it seems worth it)
But there is an actual benefit to your ability to think, to have other people who can understand what you’re saying so they can critique it (or build off it). This may (or may not) lead you to decide it’s worth putting effort into distillation, so that you can get more eyes reading the thing. (Or, you might grab all the best physicists and put them in a single lab together, where nobody has to spend effort per se on distillation, it just happens naturally as a consequence of conversation)
Again, this is optional. But it’s an open question, even just in the domain of physics, how much you want to try to coordinate with others, and then what strategies that requires.
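As a rough illustration of how I think about those scales, here’s a minimal sketch; the only numbers drawn from this thread are the ~200,000-word guess for a roughly-10-person audience of intellectuals and the “about five words” ceiling at 1000+ scale, and the other entries are deliberately left unfilled:

```python
# A minimal sketch of the scale/nuance claim, not a worked-out model.
# The 200_000 figure is the guess stated above for ~10 close collaborators;
# the 5-word figure is the "about five words" ceiling for mass coordination.
# None marks scales this thread doesn't put a number on.

shared_nuance_budget = {
    1: None,        # just you: no coordination constraint on how much nuance you keep
    10: 200_000,    # intellectuals closely following your work (stated guess)
    100: None,      # not estimated here
    1000: 5,        # 1000+ people: mass coordination, roughly a slogan's worth
}

for size, words in shared_nuance_budget.items():
    label = "unspecified" if words is None else f"~{words:,} words"
    print(f"coordinating with ~{size} people: shared-nuance budget {label}")
```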
(Upvoted.)
trying to coordinate with 1/10/100/1000+ people [...] not everyone on LW is actually trying to coordinate with anyone, which is fine.
I wonder if it might be worth writing a separate post explaining why the problems you want to solve with 10/100/1000+ people have the structure of a coordination problem (where it’s important not just that we make good choices, but that we make the same choice), and how much coordination you think is needed?
In World A, everyone has to choose Stag, or the people who chose Stag fail to accomplish anything. The payoff is discontinuous in the number of people choosing Stag: if you can’t solve the coordination problem, you’re stuck with rabbits.
In World B, the stag hunters get a payoff of n^1.1 stags, where n is the number of people choosing Stag. The payoff is continuous in n: it would be nice if the group was better-coordinated, but it’s usually not worth sacrificing on other goals in order to make the group better-coordinated. We mostly want everyone to be trying their hardest to get the theory of hunting right, rather than making sure that everyone is using the same (possibly less-correct) theory.
I think I mostly perceive myself as living in World B, and tend to be suspicious of people who seem to assume we live in World A without adequately arguing for it (when “Can’t do that, it’s a coordination problem” would be an awfully convenient excuse for choices made for other reasons).
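To make the contrast concrete, here’s a minimal sketch (my own illustration, not anything from the original posts): it assumes a group of 10, treats World A as giving the stag hunters one stag per hunter only at full participation, and reads World B’s “n^1.1 stags” as the stag hunters’ collective haul.

```python
# Illustrative payoffs only: the group size, the one-stag-per-hunter payoff in
# World A, and reading "n^1.1 stags" as a collective haul are all assumptions.

N = 10  # total group size

def world_a_stag_payoff(n: int, group_size: int = N) -> float:
    """Stag hunters' collective payoff when n of group_size choose Stag (World A)."""
    # Discontinuous: anything short of full coordination accomplishes nothing.
    return float(n) if n == group_size else 0.0

def world_b_stag_payoff(n: int) -> float:
    """Stag hunters' collective payoff when n choose Stag (World B)."""
    # Continuous and mildly superlinear: partial coordination still pays.
    return n ** 1.1

for n in range(N + 1):
    print(f"n={n:2d}  World A: {world_a_stag_payoff(n):5.1f}   "
          f"World B: {world_b_stag_payoff(n):6.2f}")
```

In World A the payoff jumps from 0 to 10 only at full participation, so coordination is everything; in World B it grows smoothly with n, so marginal coordination is nice but rarely worth large sacrifices elsewhere.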
Thanks.
Stag/Rabbit is a simplification (hopefully obvious, but worth stating explicitly to avoid accidental motte/bailey-ing). A slightly higher-resolution simplification:
When it comes to “what norms do we want”, it’s not that you either get all-or-nothing, but if different groups are pushing different norms in the same space, there’s deadweight loss as some people get annoyed at other people for violating their preferred norms, and/or confused about what they’re actually supposed to be doing.
[modeling this out properly and explicitly would take me at least 30 minutes and possibly much longer. Makes more sense to do later on as a post]
Oh, I see; the slightly-higher-resolution version makes a lot more sense to me. When working out the game theory, I would caution that different groups pushing different norms is more like an asymmetric “Battle of the Sexes” problem, which is importantly different from the symmetric Stag Hunt. In Stag Hunt, everyone wants the same thing, and the problem is just about risk-dominance vs. payoff-dominance. In Battle of the Sexes, the problem is about how people who want different things manage to live with each other.
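For concreteness, here’s a small sketch with textbook-style payoffs (the numbers and the NormA/NormB labels are illustrative, not from the thread): a brute-force check of pure-strategy Nash equilibria in a symmetric Stag Hunt versus an asymmetric norm-choice game.

```python
# Textbook-style payoff matrices, chosen for illustration only.

def pure_nash(payoffs):
    """payoffs[(a, b)] = (row player's payoff, column player's payoff)."""
    actions = sorted({a for a, _ in payoffs})
    equilibria = []
    for (a, b), (ra, cb) in payoffs.items():
        row_ok = all(payoffs[(a2, b)][0] <= ra for a2 in actions)
        col_ok = all(payoffs[(a, b2)][1] <= cb for b2 in actions)
        if row_ok and col_ok:
            equilibria.append((a, b))
    return equilibria

# Symmetric Stag Hunt: everyone agrees (Stag, Stag) is best; the issue is risk.
stag_hunt = {
    ("Stag", "Stag"): (4, 4), ("Stag", "Rabbit"): (0, 3),
    ("Rabbit", "Stag"): (3, 0), ("Rabbit", "Rabbit"): (3, 3),
}

# Asymmetric norm-choice game (Battle of the Sexes): both prefer agreeing on
# *some* norm over clashing, but each prefers a different norm to win.
norm_battle = {
    ("NormA", "NormA"): (3, 2), ("NormA", "NormB"): (0, 0),
    ("NormB", "NormA"): (0, 0), ("NormB", "NormB"): (2, 3),
}

print("Stag Hunt pure equilibria:  ", pure_nash(stag_hunt))
print("Norm battle pure equilibria:", pure_nash(norm_battle))
```

Both games have two pure equilibria, but in the Stag Hunt both players rank (Stag, Stag) on top, whereas in the norm battle each player prefers the equilibrium on their own norm, which is why the second game is about living with people who want different things rather than just about risk.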
Nod. Yeah that may be a better formulation. I may update the Staghunt post to note this.
“Notice that you’re not actually playing the game you think you’re playing” is maybe a better general rule. (I.e. in the Staghunt article, I was addressing people who think that they’re in a prisoner’s dilemma, but actually they’re in something more like a staghunt. But, yeah, at least some of the time they’re actually in a Battle of the Sexes, or… well, actually in real life it’s always some complicated nuanced thing.)
The core takeaway from the Staghunt article that still seems good to me is “if you feel like other people are defecting on your preferred strategy, actually check to see if you can coordinate on your preferred strategy. If it turns out people aren’t just making a basic mistake, you may need to actually convince people your strategy is good (or, learn from them why your strategy is not in fact straightforwardly good).”
I think this (probably?) remains a good strategy in most payoff-variants.
Thanks. This all makes sense. I think I have a bunch more thoughts but for the immediate future will just let that sink in a bit.
Another thing I’d add—putting this in its own comment to help avoid any one thread blowing up in complexity:
The orientation-towards-clarity problem is at the very least strongly analogous to, and most likely actually an important special case of, the AI alignment problem.
Friendliness is strictly easier with groups of humans, since the orthogonality thesis is false for humans—if you abuse us out of our natural values you end up with stupider humans and groups. This is reason for hope about FAI relative to UFAI, but also a pretty strong reason to prioritize developing a usable decision theory and epistemology for humans over using our crappy currently-available decision theory to direct resources in the short run towards groups trying to solve the problem in full generality.
AGI, if it is ever built, will almost certainly be built—directly or indirectly—by a group of humans, and if that group is procedurally Unfriendly (as opposed to just foreign), there’s no reason to expect the process to correct to FAI. For this reason, friendly group intelligence is probably necessary for solving the general problem of FAI.
I’m not sure I agree with all the details of this (it’s not obvious to me that humans are friendly if you scale them up) but I agree that the orientation towards clarity likely has important analogues to the AI Alignment problem.
it’s not obvious to me that humans are friendly if you scale them up
It seems like any AI built by multiple humans coordinating is going to reflect the optimization target of the coordination process building it, so we had better figure out how to make this so.
Initially I replied to this with “yeah, that seems straightforwardly true”, then something about that felt off, and it took me a while to figure out why.
This:
It seems like any AI built by multiple humans coordinating is going to reflect the optimization target of the coordination process building it
...seems straightforwardly true.
This:
..., so we had better figure out how to make this so. [where “this” is “humans are friendly if you scale them up”]
Could unpack a few different ways. I still agree with the general sentiment you’re pointing at here, but I think the most straightforward interpretation of this is mostly false.
Humans are not scalably friendly, so many of the most promising forms of Friendly AI seem to _not_ be “humans who are scaled up”; instead, they’re doing other things.
One example being CEV. (Which hopes that “if you scale up ALL humans TOGETHER and make them think carefully as you do so, you get something good, and if it turns out that you don’t get something good that coheres, it gracefully fails and says ‘nope, sorry, this didn’t work.’” But this is a different thing than scaling up any particular human or small group of humans.)
Iterated Amplification seems to more directly depend on humans being friendly as you scale them up, or at least some humans being so.
I am in fact pretty wary of Iterated Amplification for that reason.
The whole point of CEV, as I understand it, is to figure out the thing you could build that is actually robust to you not being friendly yourself. The sort of thing that, if the ancient Greeks were building it, you could possibly hope for them to figure out so that they didn’t accidentally lock the entire lightcone into a Bronze Age Warrior Ethos.
...
“You can’t build friendly AI without this”
You and Zack have said this (or something like it) on occasion, and fwiw I get a fairly political red flag from the statement. Which is not to say I don’t think the statement is getting at something important. But I notice each group I talk to has a strong sense of “the thing my group is focused on is the key, and if we can’t get people to understand that, we’re doomed.”
I myself have periodically noticed myself saying (and thinking), “if we can’t get people to understand each other’s frames and ontologies, we autolose. If we can’t get people to jointly learn how to communicate and listen non-defensively and non-defensive-causing (i.e. the paradigm I’m currently pushing), we’re doomed.”
But, when I ask myself “is that really true? Is it sheer autolose if we don’t all learn to doublecrux and whatnot?” No. Clearly not. I do think losing becomes more likely. I wouldn’t be pushing my preferred paradigm if I didn’t think that paradigm was useful. But the instinct to say “this is so important that we’re obviously doomed if everyone doesn’t understand and incorporate this” feels to me like something that should have a strong prior of “your reason for saying that is to grab attention and build political momentum.”
(and to be clear, this is just my current prior, not a decisive argument. And again, I can certainly imagine human friendliness being crucial to at least many forms of AGI, and being quite useful regardless. Just noting that I feel a need to treat claims of this form with some caution.)
Hmm – I do notice, one comment of yours up, you note: “if that group is procedurally Unfriendly (as opposed to just foreign), there’s no reason to expect the process to correct to FAI.” Something about this phrasing suggests you might be using the words friendly/unfriendly/foreign in ways that weren’t quite mapping to how I was using them.
Noting this mostly as “I’m updating a bit towards my previous comment not quite landing in your ontology” (which I’m trying to get better at tracking).
Okay. I’m not confident I understand the exact thing you’re pointing at but I think this comment (and reference to the sabbath conversation) helped orient me a bit in the direction towards understanding your frame. I think this may need to gestate a bit before I’m able to say more.
Put another way: the comment you just wrote seems (roughly) like a different way I might have attempted to explain my views in the OP. So I’m not sure if the issue is that there’s still something subtle I’m not getting, or if I communicated the OP in a way that made it seem like your comment here wasn’t a valid restatement of my point, or some other thing?