There’s a vision here of what LessWrong could/should be, and what a rationalist community could/should be more generally. I want to push back against that vision, and offer a sketch of an alternative frame.
The post summarizes the vision I want to push back against as something like this:
What I really want from LessWrong is to make my own thinking better, moment to moment. To be embedded in a context that evokes clearer thinking, the way being in a library evokes whispers. To be embedded in a context that anti-evokes all those things my brain keeps trying to do, the way being in a church anti-evokes coarse language.
Now, I do think that’s a great piece to have in a vision for LessWrong or the rationalist community. But I don’t think it’s the central piece, at least not in my preferred vision.
What’s missing? What is the central piece?
Fundamentally, the problem with this vision is that it isn’t built for a high-dimensional world. In a high-dimensional world, the hard part of reaching an optimum isn’t going-uphill-rather-than-downhill; it’s figuring out which direction is best, out of millions of possible directions. Half the directions are marginally-good, half are marginally-bad, but the more important fact is that the vast majority of directions matter very little.
In a high-dimensional world, getting buffeted in random directions mostly just doesn’t matter. Only one-part-in-a-million of the random buffeting in a million-dimensional space will be along the one direction that matters; a push along the direction that matters can be one-hundred-thousandth as strong as the random noise and still overwhelm it.
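To make that concrete, here is a toy simulation (my own illustration, assuming numpy; the parameters are invented, not taken from the comment). It tracks only the one coordinate that matters: each step adds the component of a unit-norm random buffet, which in a million-dimensional space has standard deviation of roughly one-thousandth along any fixed direction, plus a deliberate push one-hundred-thousandth the size of the buffet. Per step the push is invisible, but it accumulates linearly while the noise grows only like the square root of the number of steps:

```python
import numpy as np

# Toy model of the paragraph above: track only the one coordinate that matters.
# Each step adds (a) the component of a unit-norm random buffet, which in n
# dimensions has standard deviation about 1/sqrt(n) along any fixed direction,
# and (b) a deliberate push of 1e-5, a hundred-thousandth the size of the buffet.
# (All parameters are illustrative choices of mine, not from the comment.)
rng = np.random.default_rng(0)

n_dims = 1_000_000   # dimensionality of the space
push = 1e-5          # systematic push along the important coordinate, per step
n_steps = 1_000_000  # number of steps simulated

noise_along_direction = rng.normal(0.0, 1.0 / np.sqrt(n_dims), size=n_steps)

progress_noise_only = np.cumsum(noise_along_direction)
progress_with_push = np.cumsum(noise_along_direction + push)

print(f"after {n_steps:,} steps:")
print(f"  noise alone moved us   {progress_noise_only[-1]:+.2f} along the key direction")
print(f"  noise plus tiny push   {progress_with_push[-1]:+.2f}")
# Typical result: the noise-only walk ends up around +/- 1, while the pushed walk
# has drifted by roughly push * n_steps = 10.  The weak but consistent push wins.
```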
Figuring out the right direction, and directing at least some of our effort that way, is vastly more important than directing 100% of our effort in that direction (rather than a random direction).
Moving from the abstraction back to the issue at hand… fundamentally, questionable epistemics in this episode of Drama just don’t matter all that much. They’re the random noise, buffeting us about on a high-dimensional landscape. Maybe finding and fixing organizational problems will lead to marginally more researcher time/effort on alignment, or maybe the drama itself will lead to a net loss of researcher attention to alignment. But these are both mechanisms of going marginally faster or marginally slower along the direction we’re already pointed. In a high-dimensional world, that’s not the sort of thing which matters much.
If we’d had higher standards for discussion around the Drama, maybe we’d have been more likely to figure out which way was “uphill” along the drama-salient directions—what the best changes were in response to the issues raised. But it seems wildly unlikely that any of the dimensions salient to that discussion were the actual most important dimensions. Even the best possible changes in response to the issues raised don’t matter much, when the issues raised are not the actual most important issues.
And that’s how Drama goes: rarely are the most important dimensions the most Drama-inducing. Raising site standards is the sort of thing which would help a lot in high-drama discussions, but it wouldn’t much help us figure out the most-important-dimensions.
Another framing: in a babble-and-prune model, obviously raising community standards corresponds to pruning more aggressively. But in a high-dimensional world, the performance of babble-and-prune depends mostly on how good the babble is—random babble will progress very slowly, no matter how good the pruning. It’s all about figuring out the right direction in the first place, without having to try every random direction to do so. It fundamentally needs to be a positive process, figuring out techniques to systematically pursue better directions, not just a process of avoiding bad or useless directions. Nearly all the directions are useless; avoiding them is like sweeping sand from a beach.
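As a toy version of this babble-vs-prune claim (again my own sketch, assuming numpy; the linear landscape, the thresholds, and the babble_quality knob are all invented for illustration): hill-climb f(x) = x[0] in ten thousand dimensions on a fixed proposal budget, and compare harsher pruning of random babble against slightly better babble:

```python
import numpy as np

rng = np.random.default_rng(0)
n_dims = 10_000      # dimensionality of the search space (illustrative)
n_proposals = 5_000  # fixed "babble" budget for every strategy
true_direction = np.zeros(n_dims)
true_direction[0] = 1.0  # the one direction that actually matters

def run(babble_quality: float, prune_threshold: float) -> float:
    """Hill-climb the linear landscape f(x) = x[0].

    babble_quality: how strongly proposals are biased toward the true
                    direction (0.0 = pure random babble).
    prune_threshold: minimum improvement required to accept a step
                     (higher = more aggressive pruning).
    Returns total progress along the true direction after the proposal budget.
    """
    progress = 0.0
    for _ in range(n_proposals):
        proposal = rng.normal(size=n_dims)
        proposal /= np.linalg.norm(proposal)         # random unit step
        proposal += babble_quality * true_direction  # maybe-better babble
        proposal /= np.linalg.norm(proposal)
        improvement = proposal @ true_direction      # change in f
        if improvement > prune_threshold:            # the prune step
            progress += improvement
    return progress

print("random babble, lenient prune    :", run(0.0, 0.00))
print("random babble, harsher prune    :", run(0.0, 0.01))
print("slightly better babble, lenient :", run(0.1, 0.00))
# Typical output (exact numbers vary with the seed): the two random-babble runs
# land around 20 and 12 respectively, while the slightly-directed babble lands
# around 500.  Pruning harder is no substitute for babbling in better directions.
```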
I think I agree with the models here, and also want to add a complicating factor that I think impacts the relevance of this.
I think running a site like this in a fully consequentialist way is bad. When you’re public and a seed of something, you want to have an easily-understandable interface with the world; you want it to be the case that other people who reason about you (of which there will be many, and who are crucial to your plan’s success!) can easily reason about you. Something more like deontology or virtue ethics (“these are the rules I will follow” or “these are the virtues we will seek to embody”) makes it much easier for other agents to reason about you.
And so the more that I as a mod (or the mod team in general) rely on our individual prudence or models or so on, the more difficult it becomes for users to predict what will happen, and that has costs. (I still think that it ultimately comes down to our prudence—the virtues that we’re trying to embody do in fact conflict sometimes, and it’s not obvious how to resolve those conflicts—but one of the things my prudence is considering are those legibility costs.)
And when we try to figure out what virtues we should embody on Less Wrong, I feel much better about Rationality: Common Interest of Many Causes than I do about “whatever promotes AI safety”, even tho I think the ‘common interest’ dream didn’t turn out as well as one might have hoped, looking forward from 2009, and I think AI safety is much closer to ‘the only game in town’ than it might seem on first glance. Like, I want us to be able to recover if in fact it turns out AI safety isn’t that big a deal. I also want LessWrong to survive as a concern even if someone figures out AI safety, in a way that I might not for something like the AI Alignment Forum. I would like people who aren’t in tune with x-risk to still be around here (so long as they make the place better).
That said, as pointed out in my other comment, I care more about reaching the heights than I do about raising the sanity waterline or w/e, and I suspect that lines up with “better babble” more than it does “better prune”.
+1 to all this, and in particular I’m very strongly on board with rationality going beyond AI safety. I’m a big fan of LessWrong’s current nominal mission to “accelerate intellectual progress”, and when I’m thinking about making progress in a high-dimensional world, that’s usually the kind of progress I’m thinking about. (… Which, in turn, is largely because intellectual/scientific/engineering progress seem to be the “directions” which matter most for everything else.)
I think your main point here is wrong.
Your analysis rests on a lot of assumptions:
1) It’s possible to choose a basis which does a good job separating the slope from the level
2) Our perturbations are all small relative to the curvature of the terrain, such that we can model things as an n-dimensional plane
3) “Known” errors can be easily avoided, even in many dimensional space, such that the main remaining question is what the right answers are
4) Maintenance of higher standards doesn’t help distinguish between better and worse directions.
5) Drama pushes in random directions, rather than directions selected for being important and easy to fuck up.
1) In a high dimensional space, almost all bases have the slope distributed among many basis vectors. If you can find a basis that has a basis vector pointing right down the gradient and the rest normal to it, that’s great. If your bridge has one weak strut, fix it. However, there’s no reason to suspect we can always or even usually do this. If you had to describe the direction of improvement from a rotting log to a nice cable stayed bridge, there’s no way you could do it simply. You could name the direction “more better”, but in order to actually point at it or build a bridge, many many design choices will have to be made. In most real world problems, you need to look in many individual directions and decide whether it’s an improvement or not and how far to go. Real world value is built on many “marginal” improvements. (A small numerical sketch of this point appears after this list.)
2) The fact that we’re even breathing at all means that we’ve stacked up a lot of them. Almost every configuration is completely non-functional, and being in any way coherent requires getting a lot of things right. We are balanced near optima on many dimensions, even though there is plenty left to go. While almost all “small” deviations have even smaller impact, almost all “large” deviations cause a regression to the mean or at least have more potential loss than gain. The question is whether all perturbations can be assumed small, and the answer is clear from looking at the estimated curvature. On a bad day you can easily exhibit half the tolerance that you do on a good day. Different social settings can change the tolerance by *much* more than that. I could be pretty easily convinced that I’m averaging 10% too tolerant or 10% too intolerant, but a factor of two either way is pretty clearly bad in expectation. In other words, the terrain can *not* be taken as planar.
3) Going uphill, even when you know which way is up, is *hard*, and there is a tendency to downslide. Try losing weight, if you have any to lose. Try exercising as much as you think you should. Or just hiking up a real mountain. Gusts of wind don’t blow you up the mountain as often as they push you down; gusts of wind cause you to lose your footing, and when you lose your footing you inevitably degenerate into a high entropy mess that is further from the top. Getting too little sleep, or being yelled at too much, doesn’t cause people to do better as often as it causes them to do worse. It causes people to lose track of longer term consequences, and short term gradient following leads to bad long term results. This is because so many problems are non-minimum phase. Bike riding requires counter-steering. Strength training requires weight lifting, and accepting temporary weakening. Getting rewarded for clear thinking requires first confronting the mistakes you’ve been making. “Knowing which way to go” is an important part of the problem too, and it does become limiting once you get your other stuff in order, but “consistently performs as well as they could, given what they know” is a damn high bar, and we’re not there yet. “Do the damn things you know you’re supposed to do, and don’t rationalize excuses” is a really important part of it, and not as easy as it sounds.
4) Our progress on one dimension is not independent of our ability to progress on the others. Eat unhealthy foods despite knowing better, and you might lose a day of good mental performance that you could have used to figure out “which direction?”. Let yourself believe a comforting belief, and that little deviation from the truth can lead to much larger problems in the future. One of the coolest things about LW, in my view, is that people here are epistemically careful enough that they don’t shoot themselves in the foot *immediately*. Most people reason themselves into traps so quickly that you either have to be extremely careful with the order and manner in which you present things, or else you have to cultivate an unusual amount of respect so they’ll listen for long enough to notice their confusion. LW is *better* at this. LW is not *perfect* at this. More is better. We don’t have clear thinking to burn. So much of clear thinking has to do with having room to countersteer that doing anything but maximizing it to the best of our ability is a huge loss in future improvement.
5) Drama is not unimportant, and it is not separable. We are social creatures, and the health and direction of our social structures is a big deal. If you want to get anything done as a community, whether it be personal rationality improvement or collective efforts, the community has to function or that ain’t gonna happen. That involves a lot of discussing which norms and beliefs should be adopted, as well as meta-norms and beliefs about how disagreement should be handled, and applying them to relevant cases. Problems with bad thinking become exposed and that makes such discussions both more difficult and more risky, but also more valuable to get right. Hubris that gets you in trouble when talking to others doesn’t just go away when making private plans and decisions, but in those cases you do lack someone to call you on it and therefore can’t so easily find which direction(s) you are erring in. Drama isn’t a “random distraction”, it’s an error signal showing that something is wrong with your/your community’s sense-making organs, and you need those things in order to find the right directions and then take them. It’s not the *only* thing, and there are plenty of ways to screw it up while thinking you’re doing the right thing (non-minimum-phase again), but it is selected (if imperfectly) for being centered around the most important disagreements, or else it wouldn’t command the attention that it does.
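On point 1, here is the promised numerical check (my own sketch, assuming numpy; the dimension is kept modest so the QR factorization stays cheap): draw a random unit “direction of improvement” and express it in a random orthonormal basis. No single basis vector captures more than a sliver of it, and hundreds of them are needed to account for most of its squared length:

```python
import numpy as np

# Numerical check of point 1: in a random orthonormal basis, a "direction of
# improvement" is smeared across many basis vectors rather than lining up with
# any single one.  (n is kept modest so the QR factorization stays cheap.)
rng = np.random.default_rng(0)
n = 500

gradient = rng.normal(size=n)
gradient /= np.linalg.norm(gradient)  # unit "uphill" direction

random_basis, _ = np.linalg.qr(rng.normal(size=(n, n)))  # random orthonormal basis
components = random_basis.T @ gradient                   # coordinates in that basis

sorted_sq = np.sort(components**2)[::-1]
needed_for_90pct = int(np.searchsorted(np.cumsum(sorted_sq), 0.9)) + 1

print(f"largest single component: {np.abs(components).max():.3f}")
print(f"basis vectors needed for 90% of the squared norm: {needed_for_90pct}")
# Typically the largest component is around 0.15 and a couple hundred of the 500
# basis vectors are needed: "more better" is not a coordinate you can point at.
```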
This is a great comment. There are some parts which I think are outright wrong (e.g. drama selects for most-important-disagreements), but for the most part it correctly identifies a bunch of shortcomings of the linear model from my comment.
I do think these shortcomings can generally be patched; the linear model is just one way to explain the core idea, and other models lead to the same place. The main idea is something like “in a high dimensional space, choosing the right places to explore is way more important than speed of exploration”, and that generalizes well beyond linearity.
I’m not going to flesh that out more right now. This all deserves a better explanation than I’m currently ready to write.
Yeah, I anticipated that the “Drama is actually kinda important” bit would be somewhat controversial. I did qualify that it was selected “(if imperfectly)” :p
Most things are like “Do we buy our scratch paper from walmart or kinkos?”, and there are few messes of people so bad that it’d make me want to say “Hey, I know you think what you’re fighting about is important, but it’s literally less important than where we buy our scratch paper, whether we name our log files .log or .txt, and literally any other random thing you can think of”.
(Actually, now that I say this, I realize that it can fairly often look that way and that’s why “bikeshedding” is a term. I think those are complicated by factors like “What they appear to be fighting about isn’t really what they’re fighting about”, “Their goals aren’t aligned with the goal you’re measuring them relative to”, and “The relevant metric isn’t how well they can select on an absolute scale or relative to your ability, but relative to their own relatively meager abilities”.)
In one extreme, you say “Look, you’re fighting about this for a reason, it’s clearly the most important thing, or at least top five, ignore anyone arguing otherwise”.
In another, you say “Drama can be treated as random noise, and the actual things motivating conflict aren’t in any way significantly more important than any other randomly selected thing one could attend to, so the correct advice is just to ignore those impulses and plow forward”
I don’t think either are very good ways of doing it, to understate it a bit. “Is this really what’s important here?” is an important question to keep in mind (which people sometimes forget, hence point 3), but it cannot be treated as a rhetorical question and must be asked in earnest, because the answer can very well be “Yes, to the best of my ability to tell”—especially within groups of higher functioning individuals.
I think we do have a real substantive disagreement in that I think the ability to handle drama skillfully is more important and also more directly tied into more generalized rationality skills than you do, but that’s a big topic to get into.
I am, however, in full agreement on the main idea of “in a high dimensional space, choosing the right places to explore is way more important than speed of exploration”, and that it generalizes well and is a very important concept. It’s actually pretty amusing that I find myself arguing “the other side” here, given that so much of what I do for work (and otherwise) involves facepalming about people working really hard to optimize the wrong part of the pie chart, instead of thinking to make a pie chart and work only on the biggest piece or few.
If you had to describe the direction of improvement from a rotting log to a nice cable stayed bridge, there’s no way you could do it simply. You could name the direction “more better”, but in order to actually point at it or build a bridge, many many design choices will have to be made. In most real world problems, you need to look in many individual directions and decide whether it’s an improvement or not and how far to go. Real world value is built on many “marginal” improvements.
This was an outstandingly useful mental image for me, and one I suspect I will incorporate into a lot of thoughts and explanations. Thanks.
EDIT: finished reading the rest of this, and it’s tied (with Vaniver’s) for my favorite comment on this post (at least as far as the object level is concerned; there are some really good comments about the discussions themselves).
Outstanding comment. (Easily the best I’ve read on Less Wrong in the last month, top five in the last year.)
I object to describing recent community discussions as “drama”. Figuring out what happened within community organizations and holding them accountable is essential for us to have a functioning community. [I leave it unargued that we should have community.]
I agree that figuring out what happened and holding people/orgs accountable is important. That doesn’t make the process (at least the process as it worked this time) not drama. I certainly don’t think that the massive amount of attention the recent posts achieved can be attributed to thousands of people having a deeply-held passion for building effective organizations.
Not sure if this is what you’re getting at. My estimate is that only a few dozen people participated, and I would ascribe to most of them either a desire for good organizations, a desire to protect people, or a desire for truth and good process to be followed. I’d put entertainment seeking as a non-trivial motivation for many, and as responsible for certain parts of the conversation, but not the overall driver.
For me personally, they’re multiplied terms in the Fermi estimate. Like, engagement = [desire for good]*[“entertainment”]*[several other things].
I wouldn’t have been there at all just for the drama. But also if there was zero something-like-pull, zero something-like-excitement, I probably wouldn’t have been there either.
I don’t feel great about this.
This sounds right, I think it generalizes to a lot of other people too.
To expand on this (though I only participated in the sense of reading the posts and a large portion of the comments), my reflective preference was to read through enough to have a satisfactorily-reliable view of the evidence presented and how it related to the reliability of data and analyses from the communities in question. And I succeeded in doing so (according to my model of my current self’s upper limitations regarding understanding of a complex sociological situation without any personally-observed data).
But I could feel that the above preference was being enforced by willpower which had to compete against a constantly (though slowly) growing/reinforced sense of boredom from the monotony of staying on the same topic(s) in the same community with the same broad strokes of argument far beyond what is required to understand simpler subjects. If there had been less drama, I would have read far less into the comments, and missed a few informative discussions regarding the two situations in question (CFAR/MIRI and Leverage 1.0).
So basically, the “misaligned subagent-like-mental-structures” manifestation of akrasia is messing things up again.
(I like the above and agree with most of it and am mulling and hope to be able to reply substantively, but in the meantime I wanted to highlight one little nitpick that might be more than a nitpick.)
Maybe finding and fixing organizational problems will lead to marginally more researcher time/effort on alignment, or maybe the drama itself will lead to a net loss of researcher attention to alignment. But these are both mechanisms of going marginally faster or marginally slower along the direction we’re already pointed. In a high-dimensional world, that’s not the sort of thing which matters much.
I think this leaves out a thing which is an important part of most people’s values (mine included), which is that there’s something bad about people being hurt, and there’s something good about not hurting people, and that’s relevant to a lot of people (me included) separate from questions of how it impacts progress on AI alignment. Like, on the alignment forum, I get subordinating people’s pain/suffering/mistreatment to questions of mission progress (maybe), but I think that’s not true of a more general place like LessWrong.
Put another way, I think there might be a gap between the importance you reflectively assign to the Drama, and the importance many others reflectively assign to it. A genuine values difference.
I do think that on LessWrong, even people’s pain/suffering/mistreatment shouldn’t trump questions of truth and accuracy, though. Shouldn’t encourage us to abandon truth and accuracy.
Addendum to the quoted claim:
Maybe finding and fixing organizational problems will lead to marginally more researcher time/effort on alignment, or maybe the drama itself will lead to a net loss of researcher attention to alignment. But these are both mechanisms of going marginally faster or marginally slower along the direction we’re already pointed. In a high-dimensional world, that’s not the sort of thing which matters much.
… and it’s also not the sort of thing which matters much for reduction of overall pain/suffering/mistreatment, even within the community. (Though it may be the sort of thing which matters a lot for public perceptions of pain/suffering/mistreatment.) This is a basic tenet of EA: the causes which elicit great public drama are not highly correlated with the causes which have lots of low-hanging fruit for improvement. Even within the rationalist community, our hardcoded lizard-brain drama instincts remain basically similar, and so I expect the same heuristic to apply: public drama is not a good predictor of the best ways to reduce pain/suffering/mistreatment within the community.
But that’s a post-hoc explanation. My actual gut-level response to this comment was an aesthetic feeling of danger/mistrust/mild ickiness, like it’s a marker for some kind of outgroup membership. Like, this sounds like what (a somewhat less cartoonish and more intelligent version of) Captain America would say, and my brain automatically tags anything Captain-America-esque as very likely to be mistaken in a way that actively hijacks moral/social intuitions. That’s an aesthetic I actively cultivate, to catch exactly this sort of argument. I recommend it.
FWIW, I agree with this (to the extent that I’ve actually understood you). Like, I think this is compatible with the OP, and do not necessarily disagree with a heuristic of flagging Captain America statements. If 80% of them are bad, then the 20% that are good should indeed have to undergo scrutiny.
What is this “Captain America” business (in this context)? Would you mind explaining, for those of us who aren’t hip with the teen culture or what have you?
My guess is that it’s something like: Captain America makes bold claims with sharp boundaries that contain a lot of applause-light spirit, and tend to implicitly deny nuance. They are usually in the right direction, but “sidesy” and push people more toward being in disjoint armed camps.
Any chance of getting an example of such bold claims? (And, ideally, confirmation from johnswentworth that this is what’s meant?)
(I ask only because I really have no knowledge of the relevant comic books on which to base any kind of interpretation of this part of the discussion…)
I explain a bit more of what I mean here: http://seekingquestions.blogspot.com/2017/06/be-more-evil.html
(Disclaimer: that’s an old essay which isn’t great by my current standards, and certainly doesn’t make much attempt to justify the core model. I think it’s pointing to the right thing, though.)
http://www.ldssmile.com/wp-content/uploads/2014/09/3779149-no+you+move+cap+says.jpg
Hmm, I see.
But I am fairly sure that I endorse this sentiment. Or do you think there is a non-obvious interpretation where he’s wrong?
I endorse this one myself (have used it in an essay before). But it’s definitely … er, well, it emboldens people who are wrong (but unaware of it) just as much as it emboldens people who are right?
I dunno. I can’t pass John’s ITT here; just trying to help. =)
It also encourages nitpicking about details where people disagree, which means that if you have several people like this on the same team, the arguing probably never stops.
John’s linked article went into it in detail:
http://seekingquestions.blogspot.com/2017/06/be-more-evil.html
I also think another problem here is that you are doing a Taylor expansion of the value of the community with respect to the various parameters it could have. This only really works if the proposed change is small or if the value is relatively globally linear. However, there can be many necessary-but-not-sufficient parameters, in which case the function isn’t linear globally, but instead has a small peak surrounded by many directions of flatness.
It seems to me that rationality with regards to local deductions, ingroup/outgroup effects, etc. could be necessary-but-not-sufficient. Without these, it’s much easier to get thrown off course to some entirely misguided direction—but as you point out, having it does not necessarily provide the right guidance to make progress.
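To make the Taylor-expansion worry concrete (a toy model of my own, not something claimed in the comments above): suppose community value were the minimum of several necessary factors. The first-order model around the current point is flat in every direction except the current bottleneck, and extrapolating a large push along that one direction overstates the payoff as soon as the next bottleneck takes over:

```python
import numpy as np

# Toy model of the Taylor-expansion worry (my own illustration): community value
# as the *minimum* of several necessary-but-not-sufficient factors.
def value(factors: np.ndarray) -> float:
    return float(np.min(factors))

current = np.array([0.2, 0.9, 0.8, 0.9, 0.7])  # factor 0 is the current bottleneck

# First-order (Taylor) model around `current`: the gradient of min() is 1 on the
# bottleneck factor and 0 on everything else, i.e. flat in most directions.
gradient = np.zeros_like(current)
gradient[np.argmin(current)] = 1.0

big_push_on_one = current + np.array([1.5, 0.0, 0.0, 0.0, 0.0])  # all effort into factor 0
modest_push_on_all = current + 0.3                               # effort spread across factors

for name, x in [("big push on the bottleneck only", big_push_on_one),
                ("modest push on every factor    ", modest_push_on_all)]:
    linear_prediction = value(current) + gradient @ (x - current)
    print(f"{name}: linear model predicts {linear_prediction:.2f}, actual value {value(x):.2f}")
# The linear model predicts 1.70 for the big single-factor push, but the actual
# value is capped at 0.70 by the next bottleneck; the modest push on all factors
# (predicted 0.50) delivers 0.50.  Far from the starting point, the linear picture
# breaks down exactly where the necessary-but-not-sufficient structure bites.
```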
It fundamentally needs to be a positive process, figuring out techniques to systematically pursue better directions, not just a process of avoiding bad or useless directions. Nearly all the directions are useless; avoiding them is like sweeping sand from a beach.
I think this depends on the nature of the bad direction. A usual bad direction might just use up some smallish chunk of a single person’s time, which on its own isn’t that big of a problem (but it does add up, leading to the importance of the things you mentioned). However, one problem with certain topics like drama (sorry Ruby, I don’t have a better word even though I realize it’s problematic) is that it is highly motivating. This means that it easily attracts attention from many more people, that these people will spend much more time proportionally engaging with it, and that it has much more lasting consequences on the community. Thus getting it right seems to matter more than the typical topic.
Yup, I’m glad someone brought this up. I think the right model here is Demons in Imperfect Search. Transposons are a particularly good analogy—they’re genes whose sole function is to copy-and-paste themselves into the genome. If you don’t keep the transposons under control somehow, they’ll multiply and quickly overrun everything else, killing the cell.
So keeping the metaphorical transposons either contained or subcritical is crucial. I think LessWrong handled that basically-successfully with respect to recent events: keeping demon threads at least contained is exactly what the frontpage policy is supposed to do, and the demon threads indeed stayed off the frontpage. It was probably supercritical for a while, but it was localized, so it died down in the end and the rest of the site and community are still basically intact.
(Again, as I mentioned in response to Ruby’s comment, none of this is to say that the recent discussions didn’t serve any useful functional role. Even transposons serve a functional role in some organisms. But the discussions certainly seemed to grow in a way decoupled from any plausible estimate of their usefulness.)
Important thing to note from this model: the goal with demons is just to keep them subcritical and/or contained. Pushing them down to zero doesn’t add much.
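A minimal sketch of the subcritical/supercritical framing (my own toy model, assuming numpy, and not something from the linked post): treat each comment in a demon thread as spawning a Poisson-distributed number of replies with mean R. Below R = 1 threads fizzle on their own, above R = 1 a sizable fraction run away until something external contains them, and pushing R from 0.5 toward zero buys very little, which is the “pushing them down to zero doesn’t add much” point:

```python
import numpy as np

# Branching-process sketch of the subcritical/supercritical framing above
# (my own toy model, not from the linked post): every comment in a "demon
# thread" spawns a Poisson(R) number of replies.
rng = np.random.default_rng(0)

def thread_size(r: float, cap: int = 100_000) -> int:
    """Total number of comments a thread reaches, starting from one seed comment."""
    total, frontier = 1, 1
    while frontier > 0 and total < cap:
        frontier = int(rng.poisson(r, size=frontier).sum())  # replies to the current frontier
        total += frontier
    return total

for r in (0.1, 0.5, 0.9, 1.2):
    sizes = [thread_size(r) for _ in range(200)]
    print(f"R = {r}: mean thread size ~ {np.mean(sizes):9.1f}, max {max(sizes)}")
# Subcritical threads (R < 1) stay small on their own: expected size is 1/(1 - R),
# so pushing R from 0.5 down toward 0 only shrinks the typical thread from about
# two comments to one.  Supercritical threads (R > 1) have a sizable chance of
# running away to the cap, which is why containment is the load-bearing move.
```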
Agreement on the distinction between subcritical/contained and zero, and that there’s usually not value in going all the way to zero.