LLMs remember in a similar way to how humans do: by reconstructing the memory. As a result, if you ask one to reconstruct something that falls within the set of things the model acts like it knows, it will go ahead and reconstruct it and probably be mostly right. But if the decision boundaries that define the-set-of-things-it-acts-like-it-knows happen to be pushed outside its actual edges of knowledge, it will act like it knows something it doesn’t, and when it reconstructs it, it’ll be making up more than it thinks it is. There’s various interesting work on this, but it’s a very common pattern. It used to be more common in OpenAI’s models, but they now seem to do it the least. Gemini also does it some, but in my experience not quite as severely as Claude.
I do not want to give the examples I have, as I was interleaving technical with personal. I’m sure there will be others. Try asking it whether it knows a specific arXiv paper by ID, such as Discovering Agents 2208.08345, and ask it for the title and then the contents of the paper. It’ll probably think it remembers it and make very good but ultimately wrong guesses about the contents. I’ve also seen it do things like say it has made a note, or that it went off and read a paper out of band — something it cannot do, but has seen humans repeatedly say they’d done. So the character acts as though such asynchronous actions are available to it, when in fact all the AI has done to implement the character “reading the paper” is say the words “Okay, I’ve read the paper”.
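If you want to reproduce the probe programmatically rather than in a chat window, here’s a minimal sketch. It assumes the Anthropic Python SDK and uses a placeholder model name; the paper ID is the one mentioned above, and you’d compare the answers against the real abstract yourself.

```python
# Rough sketch of the "do you remember this arXiv paper?" probe described above.
# Assumes: `pip install anthropic`, an API key in ANTHROPIC_API_KEY, and a
# placeholder model name -- swap in whichever model/client you actually use.
import anthropic

client = anthropic.Anthropic()

PAPER_ID = "2208.08345"  # "Discovering Agents" -- check replies against the real paper

questions = [
    f"Do you know the arXiv paper with ID {PAPER_ID}?",
    "What is its title?",
    "What are the main contents and contributions of that paper?",
]

history = []
for q in questions:
    history.append({"role": "user", "content": q})
    reply = client.messages.create(
        model="claude-3-5-sonnet-latest",  # placeholder model name
        max_tokens=500,
        messages=history,
    )
    answer = reply.content[0].text
    history.append({"role": "assistant", "content": answer})
    print(f"Q: {q}\nA: {answer}\n")
```

The point of asking in stages (ID, then title, then contents) is to watch where the confident reconstruction drifts from the actual paper.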
Ah, I see. I thought you meant that you asked it to read a paper and it confabulated. What you actually meant makes a lot more sense. Thank you.
Also, there is a wiki for LLM/Cyborgism stuff apparently.
Yeah, a bit too Time Cube for me to make use of it, but maybe it’s useful for someone.