Given that LLMs can use tools, it sounds like a traditional database could be used. The retrieved data would still have to fit inside the context window, along with the generated continuation prompt, but that might work for a lot of cases.
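As a rough sketch of what I mean, assuming some hypothetical tool-calling harness that hands the model's generated SQL to a function (the table, schema, and `memory.db` path here are all made up for illustration):

```python
# Minimal sketch: expose a database query as a tool the model can call.
# The harness, schema, and file name are assumptions, not a real API.
import sqlite3

def query_memory(sql: str, db_path: str = "memory.db") -> str:
    """Run a read-only SQL query and return a result small enough
    to paste back into the context window."""
    conn = sqlite3.connect(f"file:{db_path}?mode=ro", uri=True)
    try:
        rows = conn.execute(sql).fetchmany(50)  # cap rows so the result fits in context
        return "\n".join(repr(row) for row in rows)
    finally:
        conn.close()

# The harness would register query_memory as a tool, run whatever SQL
# the model emits, and append the returned text to the next prompt.
```

The row cap is the part doing the real work there: whatever comes back has to share the context window with everything else.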
I could also imagine this working without explicit tool use. There are already systems for querying corpora (using embeddings to query vector databases, from what I’ve seen). Perhaps the corpus could be past chat transcripts, chunked.
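Something like the following, where `embed` stands in for whatever embedding model is available, and the chunk size and top-k values are arbitrary; a real vector database would replace the brute-force similarity scan:

```python
# Sketch: chunk old transcripts, embed them, and prepend the nearest
# chunks to the prompt. embed() is a placeholder for a real model.
import numpy as np

def chunk(transcript: str, size: int = 500) -> list[str]:
    # Naive fixed-size chunking; real systems often split on turns or sentences.
    return [transcript[i:i + size] for i in range(0, len(transcript), size)]

def top_k_chunks(query: str, chunks: list[str], embed, k: int = 3) -> list[str]:
    # Brute-force cosine similarity; a vector database would index this instead.
    q = np.asarray(embed(query))
    vecs = np.array([embed(c) for c in chunks])
    sims = vecs @ q / (np.linalg.norm(vecs, axis=1) * np.linalg.norm(q))
    return [chunks[i] for i in np.argsort(sims)[::-1][:k]]

# Retrieved chunks would then be spliced into the prompt ahead of the
# user's message, with no tool call visible to the model at all.
```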
I suspect the trickier part would be making this useful enough to justify the additional computation.