What do you mean by “nonrobust” aligned intelligence? Is “robust” being used in the “robust grading” sense, or in the “robust values” sense (of e.g. agents caring about lots of things, only some of which are human-aligned), or some other sense?
Anyway, responding to the vibe of your comment: I feel… quite worried about that? Is there something I wrote that gave the impression otherwise? Maybe the vibe of the post is "alignment admits way more degrees of freedom than you may have thought," which can suggest I believe "alignment is easy with high probability"?