MondSemmel comments on Stop posting prompt injections on Twitter and calling it “misalignment”

MondSemmel 4 Mar 2023 14:38 UTC
1 point
0
Weak downvote: I feel like this position, taken to its logical extreme, would let one claim that any arbitrary AI misbehavior is not misalignment, almost by definition. You just don’t know the values its creators truly hold, according to which this behavior is perfectly aligned.