LLMs, Batches, and Emergent Episodic Memory
Let’s say that we are training an LLM and have 3 selections of writing, A, B, and C, each of which is then broken down into 3 sequential parts. Call them A1, A2, etc.
Is there a benefit to structuring the batches like this:
Batch 1: {A1, B1, C1}
Batch 2: {A2, B2, C2}
Batch 3: {A3, B3, C3}
such that the weight updates from the most recent batches can provide information for the current prediction?
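To make the scheme concrete, here is a minimal Python sketch of the batching I have in mind. The function names and the character-level chunking are my own invention for illustration, not taken from any training framework:

```python
def chunk_document(text: str, chunk_len: int) -> list[str]:
    """Split one document into consecutive, order-preserving chunks."""
    return [text[i:i + chunk_len] for i in range(0, len(text), chunk_len)]

def make_batches(docs: list[str], chunk_len: int) -> list[list[str]]:
    """Batch t holds chunk t of every document, so consecutive batches
    walk through all documents in parallel."""
    chunked = [chunk_document(d, chunk_len) for d in docs]
    n_batches = min(len(c) for c in chunked)  # truncate to the shortest doc
    return [[c[t] for c in chunked] for t in range(n_batches)]

# Example: three documents A, B, C, each yielding three chunks.
docs = ["A1A2A3", "B1B2B3", "C1C2C3"]
for t, batch in enumerate(make_batches(docs, chunk_len=2), start=1):
    print(f"Batch {t}: {batch}")
# Batch 1: ['A1', 'B1', 'C1']
# Batch 2: ['A2', 'B2', 'C2']
# Batch 3: ['A3', 'B3', 'C3']
```

The point is just that batch t always contains part t of each document, so the gradient step taken on batch t-1 has already exposed the model to the immediately preceding part of every document in batch t.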
My understanding is that human episodic memory works something like this, but that current neural networks use learning rates too small for a single batch to leave a usable trace in the weights. Have there been experiments run on this? I feel like this is an obvious idea that has been examined exhaustively, but I just don’t know the right vocabulary to search for.