I agree with you—and yes we ignore this problem by assuming goal-alignment. I think there’s a lot riding on the pre-SLT model having “beneficial” goals.
To the extent that this framing is correct, the “sharp left turn” concept does not seem all that decision-relevant, since most of the work of aligning the system (at least on the human side) should’ve happened way before that point.