That’s a good question! I don’t know, but I suppose it’s possible, at least when the input fits in the context window. How well it actually does at this seems like a question for researchers?
There’s also the question of why it would do so, when the training doesn’t have any way of rewarding accurate explanations over merely human-like ones. We also have many examples of its explanations that don’t make sense.
There are going to be deductions about previous text that are generally useful, though, and that would need to be reconstructed. This is true even if the chatbot didn’t write the text in the first place (it has no way of knowing either way). In that case, though, the deductions can’t be reconstructing the original thought process, since the chatbot didn’t write the text.
So I think this points to a weakness in my explanation that I should look into, though it’s likely still true that it confabulates explanations.
Yes, I agree that confabulation happens a lot, and also that our explanations of why we do things aren’t particularly trustworthy; they’re often self-serving. I think there’s also pretty good evidence that we remember our thoughts at least somewhat, though. A personal example: when thinking about how to respond to someone online, I tend to draft replies in my head when I’m not at a computer, and I can recall those drafts later when I sit down to write.