David Scott Krueger (formerly: capybaralet) comments on Takeoff speeds have a huge effect on what it means to work on AI x-risk

David Scott Krueger (formerly: capybaralet) 13 Apr 2022 18:15 UTC
LW: 4 AF: 3
It’s possible that a lot of our disagreement is due to different definitions of “research on alignment”, where you would only count things that (e.g.) 1) are specifically about alignment that likely scales to superintelligent systems, or 2) are motivated by X safety.

To push back on that a little bit...
RE (1): It’s not obvious what will scale, and I think historically this community has been too pessimistic (i.e. almost completely dismissive) about approaches that seem hacky or heuristic.
RE (2): This is basically circular.
- adamShimi 15 Apr 2022 9:05 UTC
  LW: 7 AF: 5
  I disagree, so I’m curious: what would you consider great examples of good alignment research that is not done by x-risk-motivated people? (I’m not being dismissive; I’m genuinely curious, and discussing specifics sounds more promising than downvoting you to oblivion and not having a conversation at all.)
  - Joe Collman 16 Apr 2022 17:22 UTC
    LW: 1 AF: 1
    Examples would be interesting, certainly. Concerning the post’s point, I’d say the relevant claim is that [type of alignment research that’ll be increasingly done in slow takeoff scenarios] is already being done by non x-risk motivated people.
    I guess the hope is that at some point there are clear-to-everyone problems with no hacky solutions, so that incentives align to look for fundamental fixes—but I wouldn’t want to rely on this.