TekhneMakre comments on My experience at and around MIRI and CFAR (inspired by Zoe Curzi’s writeup of experiences at Leverage)

TekhneMakre 21 Oct 2021 7:27 UTC
1 point
which in turn I fundamentally see as a consequence of epistemic learned helplessness run rampant
I don’t understand this, but for some reason I’m interested. Could you say a couple sentences more? How does rampant learned helplessness about having correct beliefs make it more appealing to seek new information by seeking disagreement? Are you saying that there’s learned helplessness about a different strategy for relating to potential sources of information?
- dxu 21 Oct 2021 16:56 UTC
  27 points
  Parent
  So, my model is that “epistemic learned helplessness” essentially stems from an inability to achieve high confidence in one’s own (gears-level) models. Specifically, by “high confidence” here I mean a level of confidence substantially higher than one would attribute to an ambient hypothesis in a particular space—if you’re not strongly confident that your model [in some domain] is better than the average competing model [in that domain], then obviously you’d prefer to adopt an exploration-based strategy (that is to say: one in which you seek out disagreeing hypotheses in order to increase the variance of your information intake) with respect to that domain.
  
  I think this is correct, so far as it goes, as long as we are in fact restricting our focus to some domain or set of domains. That is to say: as humans, naturally it’s impossible to explore every domain in sufficient depth that we can form and hold high confidence in a gears-level model for said domain, which in turn means there will obviously be some domains in which “epistemic learned helplessness” is simply the correct attitude to take. (And indeed, the original blog post in which Scott introduced the concept of “epistemic learned helplessness” does in fact contextualize it using history books as an example.)
  
  Where I think this goes wrong, however, is when the proponent of “epistemic learned helplessness” fails to realize that this attitude’s appropriateness is actually a function of one’s confidence in some particular domain, and instead allows the attitude to seep into every domain. Once that happens, “inability to achieve confidence in one’s own models” ceases to be a rational reaction to a lack of knowledge, and instead turns into an omnipresent fog clouding over everything you think and do. (And the exploration-based strategy I outlined above ceases to be a rational reaction to a lack of confidence, and instead turns into a strategy that’s always correct and virtuous.)
  
  This is the sense in which I characterized the result as
  
  a consequence of epistemic learned helplessness run rampant, leaking past the limits of any particular domain and seeping into a general attitude towards anything considered sufficiently “hard”
  
  (Note the importance of the disclaimer “hard”. For example, I’ve yet to encounter anyone whose “epistemic learned helplessness” is so extreme that they stop to question e.g. whether they are in fact capable of driving a car. But that in itself is not particularly reassuring, especially when domains we care about include stuff labeled “hard”.)
  
  Now for the rub: I think anyone working on AI alignment (or any technical question of comparable difficulty) mustn’t exhibit this attitude with respect to [the thing they’re working on]. If you have a problem where you’re not able to achieve high confidence in your own models of something (relative to competing ambient models), you’re not going to be able to follow your own thoughts far enough to do good work—not without being interrupted by thoughts like “But if I multiply the probability of this assumption being true, by the probability of that assumption being true, by the probability of that assumption being true...” and “But [insert smart person here] thinks this assumption is unlikely to be true, so what probability should I assign to it really?”
  
  I think this is very bad. And since I think it’s very bad, naturally I will strongly oppose attempts to increase pressure in that particular direction—especially since I think pressure to think this way in this particular community is already ALARMINGLY HIGH. I think “epistemic learned helplessness” (which sometimes goes by more innocuous names as well, like fox epistemology or modest epistemology) is epistemically corrosive once it has breached quarantine, and by and large I think it has breached quarantine for a dismayingly large number of people (though thankfully my impression is that this has largely not occurred at MIRI).
  - hg00 22 Oct 2021 10:10 UTC
    3 points
    Parent
    It seems like you wanted me to respond to this comment, so I’ll write a quick reply.
    
    Now for the rub: I think anyone working on AI alignment (or any technical question of comparable difficulty) mustn’t exhibit this attitude with respect to [the thing they’re working on]. If you have a problem where you’re not able to achieve high confidence in your own models of something (relative to competing ambient models), you’re not going to be able to follow your own thoughts far enough to do good work—not without being interrupted by thoughts like “But if I multiply the probability of this assumption being true, by the probability of that assumption being true, by the probability of that assumption being true...” and “But [insert smart person here] thinks this assumption is unlikely to be true, so what probability should I assign to it really?”
    
    This doesn’t seem true for me. I think through details of exotic hypotheticals all the time.
    
    Maybe others are different. But it seems like maybe you’re proposing that people self-deceive in order to get themselves confident enough to explore the ramifications of a particular hypothesis. I think we should be a bit skeptical of intentional self-deception. And if self-deception is really necessary, let’s make it a temporary suspension of belief sort of thing, as opposed to a life belief that leads you to not talk to those with other views.
    
    It’s been a while since I read Inadequate Equilibria. But I remember the message of the book being fairly nuanced. For example, it seems pretty likely to me that there’s no specific passage which contradicts the statement “hedgehogs make better predictions on average than foxes”.
    
    I support people trying to figure things out for themselves, and I apologize if I unintentionally discouraged anyone from doing that—it wasn’t my intention. I also think people consider learning from disagreement to be virtuous for a good reason, not just due to “epistemic learned helplessness”. Also, learning from disagreement seems importantly different from generic deference—especially if you took the time to learn about their views and found yourself unpersuaded. Basically, I think people should account for both known unknowns (in the form of people who disagree whose views you don’t understand) and unknown unknowns, but it seems OK to not defer to the masses / defer to authorities if you have a solid grasp of how they came to their conclusion (this is my attempt to restate the thesis of Inadequate Equilibria as I remember it).
    
    I don’t deny that learning from disagreement has costs. Probably some people do it too much. I encouraged MIRI to do it more on the margin, but it could be that my guess about their current margin is incorrect, who knows.
    - dxu 22 Oct 2021 15:26 UTC
      22 points
      Parent
      Thanks for the reply.
      
      But it seems like maybe you’re proposing that people self-deceive in order to get themselves confident enough to explore the ramifications of a particular hypothesis. I think we should be a bit skeptical of intentional self-deception.
      
      I want to clarify that this is not my proposal, and to the extent that it had been someone’s proposal, I would be approximately as wary about it as you are. I think self-deception is quite bad on average, and even on occasions when it’s good, that fact isn’t predictable in advance, making choosing to self-deceive pretty much always a negative expected-value action.
      
      The reason I suspect you interpreted this as my proposal is that you’re speaking from a frame where “confidence in one’s model” basically doesn’t happen by default, so to get there people need to self-deceive, i.e. there’s no way for someone [in a sufficiently “hard” domain] to have a model and be confident in that model without doing [something like] artificially inflating their confidence higher than it actually is.
      
      I think this is basically false. I claim that having (real, not artificial) confidence in a given model (even of something “hard”) is entirely possible, and moreover happens naturally, as part of the process of constructing a gears-level model to begin with. If your gears-level model actually captures some relevant fraction of the problem domain, I claim it will be obviously the case that it does so—and therefore a researcher holding that model would be very much justified in placing high confidence in [that part of] their model.
      
      How much should such a researcher be swayed by the mere knowledge that other researchers disagree? I claim the ideal answer is “not at all”, for the same reason that argument screens off authority. And I agree that, from the perspective of somebody on the outside (who only has access to the information that two similarly-credentialed researchers disagree, without access to the gears in question), this can look basically like self-deception. But (I claim) from the inside the difference is very obvious, and not at all reminiscent of self-deception.
      
      (Some fields do not admit good gears-level models at all, and therefore it’s basically impossible to achieve the epistemic state described above. People in such fields might plausibly imagine that all fields have this property. But this isn’t the case, and in fact I would argue that the division of the sciences into “harder” and “softer” is actually pointing at precisely this distinction: the “hardness” attributed to a field is in fact a measure of how possible it is to form a strong gears-level model.)
      
      Does this mean “learning from disagreement” is useless? Not necessarily; gears-level models can also be wrong and/or incomplete, and one entirely plausible (and sometimes quite useful) mechanism by which to patch up incomplete models is to exchange gears with someone else, who may not be working with quite the same toolbox as you. But (I claim) for this process to actually help, it should be done in a targeted way: ideally, you’re going into the conversation already with some idea of what you hope to get out of it, having picked your partner beforehand for their likeliness to have gears you personally are missing. If you’re merely “seeking out disagreement” for the purpose of fulfilling a quota, that (I claim) is unlikely to lead anywhere productive. (And I view your exhortations for MIRI to “seek out more disagreement on the margin” as proposing essentially just such a quota.)
      
      (Standard disclaimer: I am not affiliated with MIRI, and my views do not necessarily reflect their views, etc.)