I disagree with the use of “any”. In principle, an effective alignment technique could create an AI that isn’t censored, but does have certain values/preferences over the world. You could call that censorship, but that doesn’t seem like the right or common usage. I agree that in practice many/most things currently purporting to be effective alignment techniques fit the word more, though.
I admit this is possible, so I'm almost certainly somewhat overconfident here (which matters a little), though I believe many common methods that do work for alignment also allow you to censor an AI.