I think how well we can evaluate claims and arguments about AI alignment absolutely determines whether delegating alignment to machines is easier than doing alignment ourselves. A heuristic argument that says “evaluation isn’t easier than generation, and that claim is true regardless of how good you are at evaluation until you get basically perfect at it” seems obviously wrong to me. If that’s a good summary of the disagreement I’m happy to just leave it there.
Yup, that sounds like a crux. Bookmarked for later.