I know this post was chronologically first, but since I read them out of order, my reaction was “wow, this post sure is using some of the notions from the Waluigi Effect mega-post, but for humans instead of chatbots!” In particular, both posts point at the notion that an agent (human or AI chatbot) can be in something like a superposition between good actor and bad actor, unlike the naive two-tone picture of morality one often gets from children’s books.