Money Pump Arguments assume Memoryless Agents. Isn't this Unrealistic?

I have been reading about money pump arguments for justifying the VNM axioms, and I’m already stuck at the part where Gustafsson justifies acyclicity. Namely, he seems to assume that agents have no memory. Why does this make sense?^[1] To elaborate:

The standard money pump argument looks like this.

Let’s assume (1) $A > B > C > A$, and that we have a souring $A^{-}$ of $A$ satisfying (2) $C > A > A^{-}$ and (3) $C > A^{-}$.^[2] Then if you start out with $A$, at the three successive choice nodes you’ll trade for $C$, then $B$, then $A^{-}$, so you’ll end up paying to get back what you started with.
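To make the pump concrete, here is a minimal Python sketch of the myopic agent. The names `PREFERS`, `prefers`, and `myopic_run` are mine, purely for illustration. I am also assuming two things not stated above: that refusing a trade ends the sequence, and that $A^{-} > B$ holds, which the last trade needs in order to go through.

```python
# Strict preference as a set of (better, worse) pairs.
# It is a general binary relation: no transitivity is assumed.
PREFERS = {
    ("A", "B"), ("B", "C"), ("C", "A"),  # (1) the cycle
    ("A", "A-"),                         # A- is a souring of A
    ("C", "A-"),                         # (3)
    ("A-", "B"),                         # ASSUMPTION: needed for the final trade
}

def prefers(x, y):
    """True iff x is strictly preferred to y."""
    return (x, y) in PREFERS

def myopic_run(start, offers):
    """Accept each offer iff it beats the current holding, ignoring the future."""
    holding = start
    path = [holding]
    for offer in offers:
        if not prefers(offer, holding):
            break  # ASSUMPTION: refusing a trade ends the sequence
        holding = offer
        path.append(holding)
    return path

path = myopic_run("A", ["C", "B", "A-"])
print(path)  # ['A', 'C', 'B', 'A-']
```

The agent accepts every trade and ends with `A-`, which it strictly disprefers to the `A` it started with.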

This makes sense, until you realize that the agent here implicitly believes that their current choice is the last choice they’ll ever make, i.e. they’re behaving myopically.

Notice that without such a restriction, there is the obvious strategy of: “look at the full tree, only pick leaf nodes that aren’t in the state of having been money pumped, and stick to the plan of only reaching that node.”

In practice, this looks fairly reasonable: “Yes, I do in fact prefer onions to pineapples to mushrooms to onions on my pizza. So what? If I know that you’re trying to money pump me, I’ll just refuse the trade. I may end up locally choosing options that I disprefer, but I don’t care since there is no rule that I must be myopic, and I will end up with a preferred outcome at the end.”
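The plan-then-commit strategy described above can be sketched roughly as follows. Everything here is an illustrative assumption on my part: the helper names (`leaves`, `is_pumped`, `resolute_plan`, `follow`), the tree shape (refusing ends the sequence), the reading of “pumped” as “ending with the soured version of what I started with”, and the arbitrary tie-break among the remaining leaves.

```python
START = "A"
OFFERS = ["C", "B", "A-"]  # what is offered at nodes 1, 2, 3

def leaves(start, offers):
    """Outcome of each plan 'accept the first k trades, then refuse'."""
    return {k: ([start] + offers)[k] for k in range(len(offers) + 1)}

def is_pumped(outcome, start):
    # ASSUMPTION: "pumped" means ending with the soured version of the start.
    return outcome == start + "-"

def resolute_plan(start, offers):
    """Pick a plan whose leaf is not pumped (arbitrary tie-break: fewest trades)."""
    admissible = [k for k, out in leaves(start, offers).items()
                  if not is_pumped(out, start)]
    return min(admissible)

def follow(plan, start, offers):
    """Execute the plan node by node, ignoring local preferences."""
    holding = start
    for offer in offers[:plan]:
        holding = offer
    return holding

plan = resolute_plan(START, OFFERS)
print(plan, follow(plan, START, OFFERS))  # 0 A
```

Note that `follow` never consults the preference relation at a node. That is exactly the non-myopic behaviour in question: the choice was made once, over leaves, and is then just carried out.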

I’d say myopic agents are unnatural (as Gustafsson notes) because we want to assume the agent has full knowledge of all the trades that are available to them. Otherwise a defect (i.e. getting money-pumped) could be associated not necessarily with their preferences, but with their incomplete knowledge of the world.

So he proceeds to consider less restrictive agents, such as sophisticated^[3] and minimally sophisticated^[4] agents, for which the above setup fails. But there exist modifications of the setup that still make a money pump possible as long as the agent has cyclic preferences.
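To see why the simple three-trade setup fails against a sophisticated agent, here is a small backward-induction sketch. As before, the names are hypothetical, and I am assuming that refusing a trade ends the sequence and that $A^{-} > B$ holds at the last node (neither is stated above).

```python
# Strict preference as (better, worse) pairs; no transitivity assumed.
PREFERS = {
    ("A", "B"), ("B", "C"), ("C", "A"),  # the cycle
    ("A", "A-"),                         # A- is a souring of A
    ("C", "A-"),                         # (3)
    ("A-", "B"),                         # ASSUMPTION: needed at the last node
}

def sophisticated_outcome(holding, offers):
    """Backward induction: predict where accepting leads, compare with stopping."""
    if not offers:
        return holding
    # Final outcome the agent predicts if it accepts the current offer
    # and then keeps choosing in the same way at later nodes.
    if_accept = sophisticated_outcome(offers[0], offers[1:])
    # Accept only if that predicted outcome beats keeping the current holding.
    # ASSUMPTION: refusing ends the sequence, so refusing yields `holding`.
    return if_accept if (if_accept, holding) in PREFERS else holding

print(sophisticated_outcome("A", ["C", "B", "A-"]))  # C
```

At the second node the agent foresees that trading $C$ for $B$ would lead on to $A^{-}$, and since $C > A^{-}$ it refuses. So it ends with $C$ rather than $A^{-}$, and this particular pump does not go through.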

However, all of these agents still follow one critical assumption:

Decision-Tree Separability: The rational status of the options at a choice node does not depend on other parts of the decision tree than those that can be reached from that node.

This means that agents have no memory (thus ruling out my earlier strategy of “looking at the final outcome and committing to a plan”). This still seems very restrictive.

Can anyone give a better explanation as to why such money pump arguments (with very unrealistic assumptions) are considered good arguments for the normativity of acyclic preferences as a rationality principle?

^[1] To be clear, I’m not claiming “… therefore I think cyclic preferences are actually okay.” I understand that the point of formalizing money pump arguments is to capture our intuitive notion of something being wrong when someone has cyclic preferences. I’m more questioning the formalism and its assumptions.

^[2] Note (3) doesn’t follow from (2), because $>$ here is a general binary relation. We are trying to derive all the nice properties like acyclicity, transitivity and such.

^[3] Agents that can, starting from their current node, use backwards induction to locally take the most preferred path.

^[4] A modification of sophisticated agents such that they no longer need to predict that they will act rationally at nodes that can only be reached by irrational decisions.
