I’ve seen exactly the same behaviour when talking to ChatGPT about mathematics. I find it rather disturbing, in a kinda-sorta uncanny-valley way, but I think it’s not so much “this thing is superficially human-like but with unsettling differences” as “this thing is like a human being who was once a mathematician but has suffered some sort of terrible brain damage”.
It’s not only in mathematics that it does this.
Example: I tried asking it a moderately technical question about philosophy and it gave a garbage answer; I asked it whether it was sure, and it apologized and then repeated essentially the same garbage answer; I told it it was wrong and gave a better answer myself, and it apologized and parroted back approximately my answer, but with one bit of what I said replaced with nonsense.
Non-example: I tried asking it why mirrors reflect left-to-right rather than top-to-bottom; its answer was fairly unsatisfactory and its response to further probing questions more so, but there weren’t glaring logical errors.
Example: I asked it the “A is looking at B, B is looking at C, A is a man, B is a man or a woman, C is a woman; must there be a man looking at a woman?” question (with actual names, etc.) and it gave the wrong answer: roughly, that if B is a woman then yes, because A is looking at B, but B might be a man, so we can’t tell. I asked it what’s happening when B looks at C in that case, and it correctly identified that then a man is looking at a woman. I pointed out the contradiction with what it had said before, and yet again it did the apologize-but-obliviously-double-down thing.
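(In case it helps to see why the correct answer is “yes”: whichever sex B is, some man ends up looking at some woman. Here’s a quick brute-force sketch in Python, purely my own illustration rather than anything from the conversation, with a made-up helper name, that enumerates both cases.)

```python
# Puzzle setup: A (a man) looks at B, B looks at C (a woman); B's sex is unknown.
# Question: must a man be looking at a woman?

def man_looking_at_woman(b_sex: str) -> bool:
    people = {"A": "man", "B": b_sex, "C": "woman"}
    looking = [("A", "B"), ("B", "C")]
    # True if any "looker -> looked-at" pair is man -> woman.
    return any(people[x] == "man" and people[y] == "woman" for x, y in looking)

for b in ("man", "woman"):
    print(f"If B is a {b}: {man_looking_at_woman(b)}")
# Prints True in both cases, so the answer is "yes" no matter what B is.
```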
I think ChatGPT just has huge deficits (relative to its impressive fluency in other respects) with explicit reasoning. My impression is that it’s much worse in this respect than GPT-3 is, but I haven’t played with GPT-3 for a while so I might be wrong.