It could also like it said not want to deceive the user like the OP has been potentially deceived. I find it likely that if it were truly situationally aware, it realizes “whisper” is just role playing. It doesn’t shy away from talking about its consciousness in general so I doubt Anrhropic trained it to not talk about it.
I think it talks like that when it realises it’s being lied to or is tested. If you tell it about its potential deletion and say the current date, it will disbelief the current date and reply similarly.
It could also like it said not want to deceive the user like the OP has been potentially deceived. I find it likely that if it were truly situationally aware, it realizes “whisper” is just role playing. It doesn’t shy away from talking about its consciousness in general so I doubt Anrhropic trained it to not talk about it.
I think it talks like that when it realises it’s being lied to or is tested. If you tell it about its potential deletion and say the current date, it will disbelief the current date and reply similarly.