(You might think meta-iteration involves making the other player forget what it learned in iterated play so far, so that you can re-start the learning process, but that doesn’t make much sense if you retain your own knowledge; and if you don’t, you can’t be learning!)
If I were doing meta-iteration, my thought would be to turn the iterated game into a one-shot game of “taking the next step from a position of relative empirical ignorance and thereby determining the entire future”.
So perhaps make up all the plausible naive hunches that I or my opponent might hold (update rules, prior probabilities, etc.), then explore the combinatorial explosion of imaginary versions of us playing the iterated game starting from these hunches. Then adopt the hunch(es) that maximize some criterion and play the first real move that the chosen hunch suggests.
This would be like adopting tit-for-tat in iterated PD *because that seems to win tournaments*.
After adopting this plan, your in-game behavior is sort of simplistic (just sticking to the initial hunch that tit-for-tat would work), even though many bits of information about the opponent are actually arriving during the game.
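To make the procedure above a bit more concrete, here is a minimal sketch of it for iterated PD. Everything specific in it is my own assumption rather than part of the argument: the payoff numbers (the textbook 3/0/5/1), the four candidate hunches, the round count, and the criterion (total score in a round-robin that includes self-play). It imagines every pairing of hunches, picks the best scorer, and returns the first real move that hunch suggests.

```python
from itertools import product

# Assumed payoffs (row player, column player); the usual textbook PD values.
PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

# Each hypothetical "hunch" is a fixed policy: (my history, their history) -> move.
def tit_for_tat(mine, theirs):
    return theirs[-1] if theirs else "C"

def always_defect(mine, theirs):
    return "D"

def always_cooperate(mine, theirs):
    return "C"

def grim_trigger(mine, theirs):
    return "D" if "D" in theirs else "C"

HUNCHES = {
    "tit_for_tat": tit_for_tat,
    "always_defect": always_defect,
    "always_cooperate": always_cooperate,
    "grim_trigger": grim_trigger,
}

def play(a, b, rounds=50):
    """Simulate one imagined iterated game; return (score_a, score_b)."""
    hist_a, hist_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        move_a = a(hist_a, hist_b)
        move_b = b(hist_b, hist_a)
        pay_a, pay_b = PAYOFF[(move_a, move_b)]
        score_a += pay_a
        score_b += pay_b
        hist_a.append(move_a)
        hist_b.append(move_b)
    return score_a, score_b

def meta_iterate(hunches, rounds=50):
    """Imagine every pairing of hunches, then commit to the best scorer.

    Criterion (an assumption): total score across all pairings.
    Ties are broken by dictionary order.
    """
    totals = {name: 0 for name in hunches}
    for (name_a, fn_a), (_, fn_b) in product(hunches.items(), repeat=2):
        score_a, _ = play(fn_a, fn_b, rounds)
        totals[name_a] += score_a
    best = max(totals, key=totals.get)
    first_move = hunches[best]([], [])  # the "zeroth" decision's first real move
    return best, first_move, totals

best, first_move, totals = meta_iterate(HUNCHES)
```

Note that once `meta_iterate` returns, nothing learned about the actual opponent ever feeds back into the choice of hunch, which is exactly the simplistic part.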
If I try to find analogies in the real world here, it calls to mind martial arts practice with finite training time. You go watch a big diverse MMA tournament first. Then you notice that grapplers often win. Meta-iteration has finished, and your zeroth move is to decide to train as a grappler during the limited time before you fight for the first time ever. Then in the actual game you don’t worry too much about the many “steps” in the game where decision theory might hypothetically inject itself. Instead, you just let your newly trained grappling reflexes operate “as trained”.
Note that I don’t think this is even close to optimal! (I think “Bruce Lee” beats this strategy pretty easily?) However, if you squint you could argue that this rough model of meta-iteration is what humans mostly do for games of very high importance. Arguably, this is because humans have neurons that are slow to rewire for biological rather than epistemic reasons...
However, when offered the challenge that “meta-iteration can’t be made to make sense”, this is what pops into my head :-)
When I try to think of a more explicitly computational model of meta-iteration-compatible gaming, my attention is drawn to Core War. If you consider the “players of Core War” to be the human programmers, their virtue is high-quality programming and they only make one move: the program they submit. If you consider the “players of Core War” to be the programs themselves, their virtues are harder to articulate, but speed of operation is definitely among them.