Actually, it’s a bit of a challenge to make an environment where AIXI learns about the predictor.
I think I have one. AIXI lives in a house and has $100. It gets reward on any cycle in which there’s at least $1 in the house, and it plays Newcomb’s repeatedly. Money is delivered to the house. So it doesn’t necessarily always want a million dollars. First it grabs $1000 from the transparent box (it doesn’t know about the predictor) and immediately spends it on ordering a better door, because it has hypotheses concerning possible burglary in which the house would be set on fire and no money would be left at all. Then, it doesn’t have the door yet, and it doesn’t want extra money because that can attract theft, so it one-boxes, but gets a million.
It keeps one-boxing and two-boxing as it waits for and receives various security upgrades to its house, sets up secure money pickups for the banks, offshore accounts, and so on and so forth. And the predictor turns out to be always correct. So its hypothesis mixture is eventually dominated by TMs that use the one- vs two-boxing bit of data from the a1...am tape to specify what the hand of the predictor is doing with the million dollars when simulating the past. So at some point, if it wants a million dollars, it one-boxes, and if it doesn’t, it two-boxes.
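To make the "eventually dominated" step concrete, here is a toy sketch of the weight shift. It is not AIXI: it assumes just two hypothetical environment models instead of a mixture over all TMs, and the prior of 1/100 for the predictor model and the 50/50 coin in the other model are made-up numbers for illustration. The names `likelihood_independent`, `likelihood_predictor` and `posterior_predictor` are mine.

```python
from fractions import Fraction

# Toy stand-in for the mixture: two hypothetical environment models.
# "independent": the opaque box is full with probability 1/2, whatever the action.
# "predictor":   the opaque box is full exactly when the agent one-boxes,
#                i.e. the model reads the action bit to decide the box content.

def likelihood_independent(action, box_full, p_full=Fraction(1, 2)):
    # The action is ignored on purpose: this model has no predictor in it.
    return p_full if box_full else 1 - p_full

def likelihood_predictor(action, box_full):
    # Probability 1 if the box content matches the action, 0 otherwise.
    return Fraction(1) if box_full == (action == "one") else Fraction(0)

def posterior_predictor(history, prior_predictor=Fraction(1, 100)):
    # Bayes update over the two models; the prior is an assumed number
    # standing in for the predictor model being the more complex program.
    w_pred = prior_predictor
    w_ind = 1 - prior_predictor
    for action, box_full in history:
        w_pred *= likelihood_predictor(action, box_full)
        w_ind *= likelihood_independent(action, box_full)
    return w_pred / (w_pred + w_ind)

# 20 rounds of mixed one- and two-boxing with an always-correct predictor.
history = [("two", False), ("one", True)] * 10
print(float(posterior_predictor(history)))
```

Each correct prediction halves the weight of the model that ignores the action, so after 20 rounds the predictor model holds almost all of the posterior despite starting at 1/100. A single wrong prediction would zero it out in this toy version, which is why the argument leans on the predictor being always correct.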