It’s more that an apology is a signal; to make it effective, you must communicate that it’s a real signal reflecting your actual internal processes, and not a result of a surface-level “what words can I say to appear maximally virtuous” process.
So for instance, if you say a sentence equivalent to “I admit that I was wrong to do X and I’m sorry about it, but I think Y is unfair”, then you’re not communicating that you underwent the process of “I realized I was wrong, updated my beliefs based on it, and wondered if I was wrong about other things”.
I’m not entirely sure what the simplest fix is
A simple fix would be “I admit I was wrong to do X, and I’m sorry about it. Let me think about Y for a moment.” And then actually think about Y, because if you did one thing wrong, you probably did other things wrong too.
I think “API calls” are the wrong way to word it.
It’s more that an apology is a signal; to make it effective, you must communicate that it’s a real signal reflecting your actual internal processes, and not a result of a surface-level “what words can I say to appear maximally virtuous” process.
So for instance, if you say a sentence equivalent to “I admit that I was wrong to do X and I’m sorry about it, but I think Y is unfair”, then you’re not communicating that you underwent the process of “I realized I was wrong, updated my beliefs based on it, and wondered if I was wrong about other things”.
A simple fix would be “I admit I was wrong to do X, and I’m sorry about it. Let me think about Y for a moment.” And then actually think about Y, because if you did one thing wrong, you probably did other things wrong too.