Ben Pace comments on Richard Ngo’s Shortform

Ben Pace Aug 22, 2020, 3:40 AM
10 points
Quoting your reply to Ruby below, I agree I’d like LessWrong to be much better at “being able to reliably produce and build on good ideas”.
The reliability and focus feels most lacking to me on the building side, rather than the production, which I think we’re doing quite well at. I think we’ve successfully formed a publishing platform that provides and audience who are intensely interested in good ideas around rationality, AI, and related subjects, and a lot of very generative and thoughtful people are writing down their ideas here.
We’re low on the ability to connect people up to do more extensive work on these ideas – most good hypotheses and arguments don’t get a great deal of follow up or further discussion.
Here are some subjects where I think there’s been various people sharing substantive perspectives, but I think there’s also a lot of space for more ‘details’ to get fleshed out and subquestions to be cleanly answered:
- Sabbath and Rest Days (Zvi, Lauren Lee, Jacobian, Scott)
- Moloch and Slack and Mazes (Scott, Eliezer, Zvi, Swentworth, Jameson)
- Inner/Outer Alignment (EvHub, Rafael, Paul, Swentworth, Steve2152)
- Embedded Agency + Optimization (Abram, Scott, Swentworth, Alex Flint, nostalgebraist)
- Simulacra Levels (Benquo, Zvi, Elizabeth)
- AI Takeoff (Paul, Katja, Kokotajlo, Zhukeepa)
- Iterated Amplification (Paul, EvHub, Zhukeepa, Vaniver, William S, Wei Dai)
- Insight meditation + IFS (Kaj, Kaj, Kaj, and Kaj. Also Abram and Val and Romeo and Scott)
- Coordination Problems (Eliezer, Scott, Sustrik, Swentworth, Zvi, me)
The above isn’t complete, it’s just some of the ones that come to mind as having lots of people sharing perspectives. And the list of people definitely isn’t complete.
Here examples of things that I’d like to see more of, that feel more like doing the legwork to actually dive into the details:
- Eli Tyre and Bucky replicating Scott’s birth-order hypothesis
- Katja and the other fine people at AI Impacts doing long-term research on a question (discontinuous progress) with lots of historical datapoints
- Jameson writing up his whole research question in great detail and very well, and then an excellent commenter turning up and answering it
- Zhukeepa writing up an explanation of Paul’s research, allowing many more to understand it, and allowing Eliezer to write a response
- Scott writing Goodhart Taxonomy, and the commenters banding together to find a set of four similar examples to add to the post
- Val writing some interesting things about insight meditation, prompting Kaj to write a non-mysterious explanation
- In the LW Review when Bucky checked out the paper Zvi analysed and argued it did not support the conclusions Zvi reached (this changed my opinion of Zvi’s post from ‘true’ to ‘false’)
- The discussion around covid and EMH prompting Richard Meadows to write down a lot of the crucial and core arguments around the EMH
The above is also not mentioning lots of times when the person generating the idea does a lot of the legwork, like Scott or Jameson or Sarah or someone.
I see a lot of (very high quality) raw energy here that wants shaping and directing, with the use of lots of tools for coordination (e.g. better collaboration tools).
The epistemic standards being low is one way of putting it, but it doesn’t resonate with me much and kinda feels misleading. I think our epistemic standards are way higher than the communities you mention (historians, people interested in progress studies). Bryan Caplan said he knows of no group whose beliefs are more likely to be right in general than the rationalists, this seems often accurate to me. I think we do a lot of exploration and generation and evaluation, just not in a very coordinated manner, and so could make progress at like 10x–100x the rate if we collaborated better, and I think we can get there without too much work.
- Richard_Ngo Aug 22, 2020, 6:28 AM
  12 points
  Parent
  “I see a lot of (very high quality) raw energy here that wants shaping and directing, with the use of lots of tools for coordination (e.g. better collaboration tools).”
  
  Yepp, I agree with this. I guess our main disagreement is whether the “low epistemic standards” framing is a useful way to shape that energy. I think it is because it’ll push people towards realising how little evidence they actually have for many plausible-seeming hypotheses on this website. One proven claim is worth a dozen compelling hypotheses, but LW to a first approximation only produces the latter.
  
  When you say “there’s also a lot of space for more ‘details’ to get fleshed out and subquestions to be cleanly answered”, I find myself expecting that this will involve people who believe the hypothesis continuing to build their castle in the sky, not analysis about why it might be wrong and why it’s not.
  
  That being said, LW is very good at producing “fake frameworks”. So I don’t want to discourage this too much. I’m just arguing that this is a different thing from building robust knowledge about the world.
  - Ben Pace Aug 23, 2020, 2:19 AM
    6 points
    Parent
    One proven claim is worth a dozen compelling hypotheses
    I will continue to be contrary and say I’m not sure I agree with this.
    For one, I think in many domains new ideas are really hard to come by, as opposed to making minor progress in the existing paradigms. Fundamental theories in physics, a bunch of general insights about intelligence (in neuroscience and AI), etc.
    And secondly, I am reminded of what Lukeprog wrote in his moral consciousness report, that he wished the various different philosophies-of-consciousness would stop debating each other, go away for a few decades, then come back with falsifiable predictions. I sometimes take this stance regarding many disagreements of import, such as the basic science vs engineering approaches to AI alignment. It’s not obvious to me that the correct next move is for e.g. Eliezer and Paul to debate for 1000 hours, but instead to go away and work on their ideas for a decade then come back with lots of fleshed out details and results that can be more meaningfully debated.
    I feel similarly about simulacra levels, Embedded Agency, and a bunch of IFS stuff. I would like to see more experimentation and literature reviews where they make sense, but I also feel like these are implicitly making substantive and interesting claims about the world, and I’d just be interested in getting a better sense of what claims they’re making, and have them fleshed out + operationalized more. That would be a lot of progress to me, and I think each of them is seeing that sort of work (with Zvi, Abram, and Kaj respectively leading the charges on LW, alongside many others).
    - Richard_Ngo Aug 23, 2020, 6:28 AM
      16 points
      0
      Parent
      I feel like this comment isn’t critiquing a position I actually hold. For example, I don’t believe that “the correct next move is for e.g. Eliezer and Paul to debate for 1000 hours”. I am happy for people to work towards building evidence for their hypotheses in many ways, including fleshing out details, engaging with existing literature, experimentation, and operationalisation.
      Perhaps this makes “proven claim” a misleading phrase to use. Perhaps more accurate to say: “one fully fleshed out theory is more valuable than a dozen intuitively compelling ideas”. But having said that, I doubt that it’s possible to fully flesh out a theory like simulacra levels without engaging with a bunch of academic literature and then making predictions.
      I also agree with Raemon’s response below.
    - Raemon Aug 23, 2020, 5:02 AM
      11 points
      0
      Parent
      I think I’m concretely worried that some of those models / paradigms (and some other ones on LW) don’t seem pointed in a direction that leads obviously to “make falsifiable predictions.”
      And I can imagine worlds where “make falsifiable predictions” isn’t the right next step, you need to play around with it more and get it fleshed out in your head before you can do that. But there is at least some writing on LW that feels to me like it leaps from “come up with an interesting idea” to “try to persuade people it’s correct” without enough checking.
      (In the case of IFS, I think Kaj’s sequence is doing a great job of laying it out in a concrete way where it can then be meaningfully disagreed with. But the other people who’ve been playing around with IFS didn’t really seem interested in that, and I feel like we got lucky that Kaj had the time and interest to do so.)
  - Ben Pace Aug 23, 2020, 2:18 AM
    4 points
    −4
    Parent
    Yepp, I agree with this. I guess our main disagreement is whether the “low epistemic standards” framing is a useful way to shape that energy. I think it is because it’ll push people towards realising how little evidence they actually have for many plausible-seeming hypotheses on this website.
    A housemate of mine said to me they think LW has a lot of breadth, but could benefit from more depth.
    I think in general when we do intellectual work we have excellent epistemic standards, capable of listening to all sorts of evidence that other communities and fields would throw out, and listening to subtler evidence than most scientists (“faster than science”), but that our level of coordination and depth is often low. “LessWrongers should collaborate more and go into more depth in fleshing out their ideas” sounds more true to me than “LessWrongers have very low epistemic standards”.
    - Richard_Ngo Aug 23, 2020, 6:18 AM
      20 points
      0
      Parent
      In general when we do intellectual work we have excellent epistemic standards, capable of listening to all sorts of evidence that other communities and fields would throw out, and listening to subtler evidence than most scientists (“faster than science”)
      “Being more openminded about what evidence to listen to” seems like a way in which we have lower epistemic standards than scientists, and also that’s beneficial. It doesn’t rebut my claim that there are some ways in which we have lower epistemic standards than many academic communities, and that’s harmful.
      In particular, the relevant question for me is: why doesn’t LW have more depth? Sure, more depth requires more work, but on the timeframe of several years, and hundreds or thousands of contributors, it seems viable. And I’m proposing, as a hypothesis, that LW doesn’t have enough depth because people don’t care enough about depth—they’re willing to accept ideas even before they’ve been explored in depth. If this explanation is correct, then it seems accurate to call it a problem with our epistemic standards—specifically, the standard of requiring (and rewarding) deep investigation and scholarship.
      - John_Maxwell Aug 26, 2020, 5:41 AM
        7 points
        Parent
        
        LW doesn’t have enough depth because people don’t care enough about depth—they’re willing to accept ideas even before they’ve been explored in depth. If this explanation is correct, then it seems accurate to call it a problem with our epistemic standards—specifically, the standard of requiring (and rewarding) deep investigation and scholarship.
        
        Your solution to the “willingness to accept ideas even before they’ve been explored in depth” problem is to explore ideas in more depth. But another solution is to accept fewer ideas, or hold them much more provisionally.
        
        I’m a proponent of the second approach because:
        
        I suspect even academia doesn’t hold ideas as provisionally as it should. See Hamming on expertise: https://forum.effectivealtruism.org/posts/mG6mckPHAisEbtKv5/should-you-familiarize-yourself-with-the-literature-before?commentId=SaXXQXLfQBwJc9ZaK
        
        I suspect trying to browbeat people to explore ideas in more depth works against the grain of an online forum as an institution. Browbeating works in academia because your career is at stake, but in an online forum, it just hurts intrinsic motivation and cuts down on forum use (the forum runs on what Clay Shirky called “cognitive surplus”, essentially a term for peoples’ spare time and motivation). I’d say one big problem with LW 1.0 that LW 2.0 had to solve before flourishing was people felt too browbeaten to post much of anything.
        
        If we accept fewer ideas / hold them much more provisionally, but provide a clear path to having an idea be widely held as true, that creates an incentive for people to try & jump through hoops—and this incentive is a positive one, not a punishment-driven browbeating incentive.
        
        Maybe part of the issue is that on LW, peer review generally happens in the comments after you publish, not before. So there’s no publication carrot to offer in exchange for overcoming the objections of peer reviewers.
        Richard_Ngo Aug 26, 2020, 6:39 AM
        4 points
        Parent
        “If we accept fewer ideas / hold them much more provisionally, but provide a clear path to having an idea be widely held as true, that creates an incentive for people to try & jump through hoops—and this incentive is a positive one, not a punishment-driven browbeating incentive.”
        
        Hmm, it sounds like we agree on the solution but are emphasising different parts of it. For me, the question is: who’s this “we” that should accept fewer ideas? It’s the set of people who agree with my argument that you shouldn’t believe things which haven’t been fleshed out very much. But the easiest way to add people to that set is just to make the argument, which is what I’ve done. Specifically, note that I’m not criticising anyone for producing posts that are short and speculative: I’m criticising the people who update too much on those posts.
        John_Maxwell Aug 26, 2020, 8:10 AM
        4 points
        Parent
        Fair enough. I’m reminded of a time someone summarized one of my posts as being a definitive argument against some idea X and me thinking to myself “even I don’t think my post definitively settles this issue” haha.
        Raemon Aug 26, 2020, 5:56 AM
        3 points
        Parent
        Yeah, this is roughly how I think about it.
        I do think right now LessWrong should lean more in the direction the Richard is suggesting – I think it was essential to establish better Babble procedures but now we’re doing well enough on that front that I think setting clearer expectations of how the eventual pruning works is reasonable.
        Richard_Ngo Aug 26, 2020, 6:53 AM
        5 points
        Parent
        I wanted to register that I don’t like “babble and prune” as a model of intellectual development. I think intellectual development actually looks more like:
        1. Babble
        2. Prune
        3. Extensive scholarship
        4. More pruning
        5. Distilling scholarship to form common knowledge
        And that my main criticism is the lack of 3 and 5, not the lack of 2 or 4.
        I also note that: a) these steps get monotonically harder, so that focusing on the first two misses *almost all* the work; b) maybe I’m being too harsh on the babble and prune framework because it’s so thematically appropriate for me to dunk on it here; I’m not sure if your use of the terminology actually reveals a substantive disagreement.
        Raemon Aug 27, 2020, 5:09 AM
        2 points
        Parent
        I basically agree with your 5-step model (I at least agree it’s a more accurate description than Babel and Prune, which I just meant as rough shorthand). I’d add things like “original research/empiricism” or “more rigorous theorizing” to the “Extensive Scholarship” step.
        I see the LW Review as basically the first of (what I agree should essentially be at least) a 5 step process. It’s adding a stronger Step 2, and a bit of Step 5 (at least some people chose to rewrite their posts to be clearer and respond to criticism)
        ...
        Currently, we do get non-zero Extensive Scholarship and Original Empiricism. (Kaj’s Multi-Agent Models of Mind seems like it includes real scholarship. Scott Alexander / Eli Tyre and Bucky’s exploration into Birth Order Effects seemed like real empiricism). Not nearly as much as I’d like.
        But John’s comment elsethread seems significant:
        If the cost of evaluating a hypothesis is high, and hypotheses are cheap to generate, I would like to generate a great deal before selecting one to evaluate.
        This reminded of a couple posts in the 2018 Review, Local Validity as Key to Sanity and Civilization, and Is Clickbait Destroying Our General Intelligence?. Both of those seemed like “sure, interesting hypothesis. Is it real tho?”
        During the Review I created a followup “How would we check if Mathematicians are Generally More Law Abiding?” question, trying to move the question from Stage 2 to 3. I didn’t get much serious response, probably because, well, it was a much harder question.
        But, honestly… I’m not sure it’s actually a question that was worth asking. I’d like to know if Eliezer’s hypothesis about mathematicians is true, but I’m not sure it ranks near the top of questions I’d want people to put serious effort into answering.
        I do want LessWrong to be able to followup Good Hypotheses with Actual Research, but it’s not obvious which questions are worth answering. OpenPhil et al are paying for some types of answers, I think usually by hiring researchers full time. It’s not quite clear what the right role for LW to play in the ecosystem.
        What links here?
        Raemon's comment on Are We Right about How Effective Mockery Is? by Ronny Fernandez (Aug 27, 2020, 9:15 PM; 10 points)
        John_Maxwell Aug 26, 2020, 11:52 AM
        2 points
        Parent
        
        All else equal, the harder something is, the less we should do it.
        
        My quick take is that writing lit reviews/textbooks is a comparative disadvantage of LW relative to the mainstream academic establishment.
        
        In terms of producing reliable knowledge… if people actually care about whether something is true, they can always offer a cash prize for the best counterargument (which could of course constitute citation of academic research). The fact that people aren’t doing this suggests to me that for most claims on LW, there isn’t any (reasonably rich) person who cares deeply re: whether the claim is true. I’m a little wary of putting a lot of effort into supply if there is an absence of demand.
        
        (I guess the counterargument is that accurate knowledge is a public good so an individual’s willingness to pay doesn’t get you complete picture of the value accurate knowledge brings. Maybe what we need is a way to crowdfund bounties for the best argument related to something.)
        
        (I agree that LW authors would ideally engage more with each other and academic literature on the margin.)
        DirectedEvolution Aug 26, 2020, 4:16 PM
        4 points
        Parent
        I’ve been thinking about the idea of “social rationality” lately, and this is related. We do so much here in the way of training individual rationality—the inputs, functions, and outputs of a single human mind. But if truth is a product, then getting human minds well-coordinated to produce it might be much more important than training them to be individually stronger. Just as assembly line production is much more effective in producing almost anything than teaching each worker to be faster in assembling a complete product by themselves.
        My guess is that this could be effective not only in producing useful products, but also in overcoming biases. Imagine you took 5 separate LWers and asked them to create a unified consensus response to a given article. My guess is that they’d learn more through that collective effort, and produce a more useful response, than if they spent the same amount of time individually evaluating the article and posting their separate replies.
        Of course, one of the reasons we don’t to that so much is that coordination is an up-front investment and is unfamiliar. Figuring out social technology to make it easier to participate in might be a great project for LW.
        John_Maxwell Aug 27, 2020, 4:36 AM
        11 points
        0
        Parent
        There’s been a fair amount of discussion of that sort of thing here: https://www.lesswrong.com/tag/group-rationality There are also groups outside LW thinking about social technology such as RadicalxChange.
        
        Imagine you took 5 separate LWers and asked them to create a unified consensus response to a given article. My guess is that they’d learn more through that collective effort, and produce a more useful response, than if they spent the same amount of time individually evaluating the article and posting their separate replies.
        
        I’m not sure. If you put those 5 LWers together, I think there’s a good chance that the highest status person speaks first and then the others anchor on what they say and then it effectively ends up being like a group project for school with the highest status person in charge. Some related links.
        DirectedEvolution Aug 27, 2020, 2:17 PM
        3 points
        Parent
        That’s definitely a concern too! I imagine such groups forming among people who either already share a basic common view, and collaborate to investigate more deeply. That way, any status-anchoring effects are mitigated.
        Alternatively, it could be an adversarial collaboration. For me personally, some of the SSC essays in this format have led me to change my mind in a lasting way.
      - curi Sep 3, 2020, 10:32 PM
        4 points
        0
        Parent
        
        they’re willing to accept ideas even before they’ve been explored in depth
        
        People also reject ideas before they’ve been explored in depth. I’ve tried to discuss similar issues with LW before but the basic response was roughly “we like chaos where no one pays attention to whether an argument has ever been answered by anyone; we all just do our own thing with no attempt at comprehensiveness or organizing who does what; having organized leadership of any sort, or anyone who is responsible for anything, would be irrational” (plus some suggestions that I’m low social status and that therefore I personally deserve to be ignored. there were also suggestions – phrased rather differently but amounting to this – that LW will listen more if published ideas are rewritten, not to improve on any flaws, but so that the new versions can be published at LW before anywhere else, because the LW community’s attention allocation is highly biased towards that).
      - Ben Pace Aug 23, 2020, 6:48 AM
        2 points
        Parent
        I feel somewhat inclined to wrap up this thread at some point, even while there’s more to say. We can continue if you like and have something specific or strong you’d like to ask, but otherwise will pause here.
      - TAG Aug 23, 2020, 10:49 AM
        1 point
        Parent
        
        why doesn’t LW have more depth?
        
        You have to realise that what you are doing isn’t adequate in order to gain the motivation to do it better, and that is unlikely to happen if you are mostly communicating with other people who think everything is OK.
    - TAG Aug 23, 2020, 10:44 AM
      3 points
      Parent
      Lesswrong is competing against philosophy as well as science, and philosophy has broader criterion of evidence still. In fact , lesswrongians are often frustrated that mainstream philosophy takes such topics as dualism or theism seriously.. even though theres an abundance of Bayesian evidence for them.
  - John_Maxwell Aug 26, 2020, 5:27 AM
    2 points
    Parent
    
    One proven claim is worth a dozen compelling hypotheses, but LW to a first approximation only produces the latter.
    
    Depends on the claim, right?
    
    If the cost of evaluating a hypothesis is high, and hypotheses are cheap to generate, I would like to generate a great deal before selecting one to evaluate.