I don’t think that’s it, because I don’t think the situation in AI Alignment is all that unusual. “Science progresses one funeral at a time” is an adage applied near-universally. There’s also an even more general “common wisdom” that people in a debate ~never succeed in changing each other’s minds, and that the only point of having a debate is to sway the on-the-fence onlookers; that debates are a performance art.
There’s something fundamentally off in the human psyche that invalidates Aumann’s agreement theorem for us.
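(For reference, a rough, informal statement of the result being invoked here, Aumann’s agreement theorem, with notation chosen just for illustration: if two agents share a common prior $P$, and the values of their posteriors for an event $A$ given their respective private information $\mathcal{I}_1$ and $\mathcal{I}_2$,

$$q_1 = P(A \mid \mathcal{I}_1), \qquad q_2 = P(A \mid \mathcal{I}_2),$$

are common knowledge between them, then $q_1 = q_2$; ideally rational agents with a common prior cannot agree to disagree.)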
And put like this, I think the capabilities researchers will forever disagree with the alignment researchers for this exact same reason.
It might be the case that this is because of a more universal thing. Like, sometimes time is just necessary for science to progress. And definitely the right way to view debate is as changing the POV of onlookers, not of the interlocutors.
But I still suspect, without being able to quantify it, that alignment is worse than other sciences in that the standards by which people agree on what counts as good work are just uncertain.
People in alignment sometimes say that alignment is pre-paradigmatic. I think that’s a good frame; I take it to mean that the standards for what qualifies as good work are themselves not yet settled, among many other things. I think that if paradigmaticity is a line with math on the left and, like… pre-atomic chemistry all the way on the right, alignment is pretty far to the right. Modern RL is further to the left, and modern supervised learning with transformers much further to the left, followed by things for which we actually have textbooks that don’t go out of date every 12 months.
I don’t think this would be disputed? But this really means that it’s almost certain that > 80% of alignment-related intellectual output will be tossed out at some point in the future, because that’s what pre-paradigmaticity means. (Like, 80% is arguably a best-case scenario for pre-paradigmatic fields!) Which means in turn that engaging with it is a deeply unattractive prospect.
I guess what I’m saying is that I agree that the situation for alignment is not at all bad for a pre-paradigmatic field, but if you call your field pre-paradigmatic, that seems like a pretty bad place to be in terms of what kind of credibility well-calibrated observers should accord you.
Edit: And like, to the degree that the arguments that p(doom) is high are entirely separate from the field of alignment, this is actually a reason for ML engineers to care deeply about alignment, as a way of preventing doom, even if it is pre-paradigmatic! But I’m quite uncertain that this is true.
But I still suspect, without being able to quantify it, that alignment is worse than other sciences in that the standards by which people agree on what counts as good work are just uncertain.
People in alignment sometimes say that alignment is pre-paradigmatic. I think that’s a good frame; I take it to mean that the standards for what qualifies as good work are themselves not yet settled, among many other things. I think that if paradigmaticity is a line with math on the left and, like… pre-atomic chemistry all the way on the right, alignment is pretty far to the right. Modern RL is further to the left, and modern supervised learning with transformers much further to the left, followed by things for which we actually have textbooks that don’t go out of date every 12 months.
I don’t think this would be disputed?
Noting that I don’t dispute this.
An important reason this is true is that existential risk prevention can’t be an experimental field. Some existential risks, such as asteroid impacts, can be understood with strong theory (like, settled physics). AI risk isn’t one of those (and any path by which it could become one of those depends on an inferential leap that is itself uncertain, namely extrapolating results from near-term AI experiments to much more powerful AI systems).
I do want to argue against the theory that science progresses one funeral at a time:
“Science advances one funeral at a time” → this seems to be both generally untrue and a harmful meme (because it is commonly used to argue against life extension research).
Noting that I don’t dispute this.
An important reason this is true is that existential risk prevention can’t be an experimental field. Some existential risks, such as asteroid impacts, can be understood with strong theory (like, settled physics). AI risk isn’t one of those (and any path by which it could become one of those depends on an inferential leap that is itself uncertain, namely extrapolating results from near-term AI experiments to much more powerful AI systems).
Yeah, I agree with that.
I do want to argue against the theory that science progresses one funeral at a time:
Fair enough, I suppose I don’t actually have a vast body of rigorous evidence in favour of that phrase.