But what if this decision theory uses a utility function whose only terminal value is paperclips?
Clippy’s original expression of outrage over the offensive title of the article would be quite justified under such a decision theory, for signaling reasons. If Clippy is to deal with humans, exhibiting “human weaknesses” may benefit it. In the only AI-box spoiler ever published, an unfriendly AI faked a human weakness to escape successfully. So you are all giving Clippy far too little credit; it has been acting very smartly so far.
I think that was probably an actor or actress who was pretending.
My comment was not about Clippy’s original expression of outrage. It was about Clippy’s concern over not “truly caring about paperclips”.