[deleted] comments on AI caught by a module that counterfactually doesn’t exist

[deleted] 18 Nov 2014 11:24 UTC
1 point

EDIT: Whenever I use colloquial phrases like “the AI believes a (false) X” I mean that we are using utility indifference to accomplish that goal, without actually giving the AI false beliefs.

It would be better to say, “The AI believes in falsely believing X” or “The AI believes it ought to falsely believe X” or “the AI is compelled to self-delude on X.”