scasper comments on Eight Strategies for Tackling the Hard Part of the Alignment Problem

scasper 22 Jul 2023 19:11 UTC
LW: 1 AF: 1
2
AF
Sounds right, but the problem seems to be semantic. If understanding is taken to mean a human’s comprehension, then I think this is perfectly right. But since the method is mechanistic, it seems difficult nonetheless.
- davidad 22 Jul 2023 19:15 UTC
  LW: 2 AF: 1
  1
  AF Parent
  Less difficult than ambitious mechanistic interpretability, though, because that requires human comprehension of mechanisms, which is even more difficult.