Lao Mein comments on LLMs, Batches, and Emergent Episodic Memory

Lao Mein 3 Jul 2023 5:23 UTC
2 points
I’m mostly just curious about how difficult it is for a transformer to learn to effectively access information from its most recent backprop updates, without using outside structures. Can it pull an essay title? The general topic? And how well does this work for stochastic vs. batch processing? Thanks a lot btw.