Sorry if this was unclear, but there’s a difference between plans which work conditioning on an impossibility, and trying to do the impossible. For example, building a proof that works only if P=NP is true is silly in ways that trying to prove P=NP is not. The second is trying to do the impossible, the first is what I was dismissive of.
So what’s the impossible thing—identifying an adequate set of values? instilling them in a superintelligence?
Yes, doing those things in ways that a capable alignment researcher can’t find obvious failure modes for. (Which may not be enough, given that they aren’t superintelligences—but is still a bar which no proposed plan comes close to passing.)
Is there someone you regard as the authority on why it can’t be done? (Yudkowsky? Yampolskiy?)
Because what I see are not problems that we know to be unsolvable, but rather problems that the human race is not seriously trying to solve.
I think that basically everyone at MIRI, Yampolskiy, and a dozen other people all have related and strong views on this. You’re posting on LessWrong, and I don’t want to be rude, but I don’t know why I’d need to explain this instead of asking you to read the relevant work.
I asked because I’m talking with you and I wanted to know *your* reasoning as to why a technical solution to the alignment of superintelligence is impossible. It seems to be “lots of people see lots of challenges and they are too many to overcome, take it up with them”.
But it’s just a hard problem, and the foundations are not utterly mysterious. Humanity understands quite a lot about the physical and computational nature of our reality by now.
Maybe it would be more constructive to ask how you envisage achieving the politically impossible task of stopping the worldwide AI race, since that’s something that you do advocate.