drnickbone comments on Sneaky Strategies for TDT

drnickbone 26 May 2012 11:16 UTC
1 point
My understanding was that “this problem” constitutes randomly picking a single TDT agent, which would presumably also have been done in the simulation.

So that’s another variant—in that interpretation you’re correct that C-sim would hardly ever see the same source-code C-sim in its own instances of the problem. I think you are right here that the chance of winning rises to at least 55%; not sure yet if it’s possible to do any better.

EDIT. I have a strategy for your variant which gives almost 100% chance for TDT winning the prize. The trick is that instead of each agent having a favourite number it has a least-favourite or “unlucky number” selected in a balanced way from the set {1,2,...,10}. Again consider a construction like SHA-256(C-act), reduce modulo 10 and then add 1. Here’s how the strategy works:

If C-sim has the same unlucky number as C-act then
```
    Pick the unlucky numbered box with probability 1 - epsilon. 
    Pick the others with equal probability epsilon / 9
```
Else
```
    Pick the box with C-sim's unlucky number
```
End If

It’s quite easy to see that each C-act, if presented with multiple instances of the problem with different C-sim codes, will pick its own unlucky-numbered box slightly less often than any of the others. So the money is always in the box with C-sim’s unlucky number. This gives C-act a ⁹⁄₁₀ + ¹⁄₁₀ × (1 - epsilon) chance of winning, or approximately 100%. CDT still has exactly a 100% chance of winning, but the gap is negligible.
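As a rough check of the claim above, here is a minimal Python sketch of the "unlucky number" strategy. It assumes that unlucky numbers are uniformly distributed over 1..10 across the population of C-sim codes (one reading of "selected in a balanced way") and that an agent's source code is available as a string. The function names (`unlucky_number`, `choice_distribution`, `marginal_distribution`, `win_probability`) are hypothetical and chosen only for illustration.

```python
import hashlib
from fractions import Fraction

N_BOXES = 10
BOXES = range(1, N_BOXES + 1)

def unlucky_number(source: str) -> int:
    """SHA-256 of the source code, reduced modulo 10, plus 1 (a value in 1..10)."""
    digest = hashlib.sha256(source.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % N_BOXES + 1

def choice_distribution(act_unlucky: int, sim_unlucky: int, eps: Fraction) -> dict:
    """Probability that C-act picks each box, given C-sim's unlucky number."""
    if act_unlucky == sim_unlucky:
        # Same unlucky number: favour that box, spread epsilon over the other nine.
        dist = {box: eps / (N_BOXES - 1) for box in BOXES}
        dist[act_unlucky] = 1 - eps
        return dist
    # Different unlucky numbers: always pick the box with C-sim's unlucky number.
    return {box: Fraction(int(box == sim_unlucky)) for box in BOXES}

def marginal_distribution(act_unlucky: int, eps: Fraction) -> dict:
    """C-act's box frequencies averaged over C-sim codes.

    Assumption: C-sim's unlucky number is uniform on 1..10.
    """
    total = {box: Fraction(0) for box in BOXES}
    for sim_unlucky in BOXES:
        for box, p in choice_distribution(act_unlucky, sim_unlucky, eps).items():
            total[box] += p / N_BOXES
    return total

def win_probability(eps: Fraction) -> Fraction:
    """Chance that C-act opens the box carrying C-sim's unlucky number."""
    total = Fraction(0)
    for act_unlucky in BOXES:
        for sim_unlucky in BOXES:
            total += choice_distribution(act_unlucky, sim_unlucky, eps)[sim_unlucky]
    return total / N_BOXES**2
```

Under these assumptions, an agent's own unlucky box comes out with marginal probability (1 - epsilon)/10, while every other box gets 1/10 + epsilon/90, so the unlucky box is the least-chosen one and the win probability works out to 1 - epsilon/10.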

EDIT 2. I just realised we can get TDT to win with close to 100% probability in your variant. I’ve amended above...
- lackofcheese 26 May 2012 23:32 UTC
  0 points
  That’s still not quite the formulation of the problem I was considering, though it seems valid. Admittedly, your formulation is closer to the original idea since it does say “multiple simulations”, though I will note that the number of simulations has to be something like O(1/epsilon) for the difference to be noticeable.
  
  My previous strategy was designed for a variant of the problem where Omega only simulates a single instance of the problem (and calculates the probabilities directly from the source code of C-sim).
  - drnickbone 27 May 2012 13:40 UTC
    1 point
    Sorry I misunderstood you then.
    
    Does your variant look like this?
    
    Omega selects C-sim at random from some distribution over TDT full source-codes.
    Then Omega selects C-sim-sim at random from the same distribution.
    Then Omega calculates what will happen if it presents the problem to C-sim, but specifying the simulation’s full source code as C-sim-sim. Omega determines the probability of C-sim choosing each of the boxes, conditional on it having seen that fixed C-sim-sim.
    Then Omega fills the box with lowest probability (of being chosen by C-sim) or uses the tie-break rule.
    Finally Omega presents the real problem to C-act, but specifying the simulation’s full source code as C-sim.
    
    What is the best strategy for TDT to play as C-act?
    
    If that is the problem, then consider the following. It still uses the “unlucky number” construction from the set {1, 2, …, 10}. Each C-act will always choose its unlucky number with lowest probability, so the money is always in C-sim’s unlucky number box.
    
    If C-sim has a different unlucky number from C-act then
    
        Pick C-sim's unlucky number with probability 1 - epsilon
        Pick C-act's unlucky number with probability 0
        Pick each of the other boxes with probability epsilon / 8
    
    Else
    
        Pick the common unlucky number with probability 1/10 - epsilon
        Pick each other box with probability 1/10 + epsilon / 9
    
    End If
    
    That looks like winning with probability ⁹⁄₁₀ × (1 - epsilon) + ¹⁄₁₀ × (1/10 - epsilon), so close to 91%.
    
    Is there a better strategy though?
    
    P.S. We are getting some interesting behaviour here, with slight variations in the conditions for selecting C-sim and calculating its choice probabilities leading to very different best strategies (and different success probabilities such as 10%, 50%, 91% or close to 100%). Quite fascinating.
    - lackofcheese 27 May 2012 14:38 UTC
      0 points
      Yeah, that’s the problem I had in mind, and your “unlucky number” strategy definitely seems pretty solid in that case.
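
The single-simulation variant laid out step by step in the thread above (Omega draws C-sim and C-sim-sim, computes C-sim's choice probabilities from its source code, fills the least likely box, then presents the problem to C-act) can be enumerated exactly. The sketch below is one possible reading, not a definitive model: it assumes unlucky numbers are uniform on 1..10 and independent across C-act, C-sim and C-sim-sim, and it assumes 0 < epsilon <= 1/10 so that the least likely box is unique and the tie-break rule (which the thread does not spell out) never applies. All function names are hypothetical.

```python
from fractions import Fraction

N_BOXES = 10
BOXES = range(1, N_BOXES + 1)

def choice_distribution(own: int, seen: int, eps: Fraction) -> dict:
    """Box probabilities for an agent with unlucky number `own`
    that is shown a simulation whose unlucky number is `seen`."""
    if own != seen:
        dist = {box: eps / (N_BOXES - 2) for box in BOXES}
        dist[seen] = 1 - eps       # mostly pick the simulation's unlucky box
        dist[own] = Fraction(0)    # never pick one's own unlucky box
    else:
        dist = {box: Fraction(1, N_BOXES) + eps / (N_BOXES - 1) for box in BOXES}
        dist[own] = Fraction(1, N_BOXES) - eps
    return dist

def omega_fills(sim: int, sim_sim: int, eps: Fraction) -> int:
    """Box Omega fills: the one C-sim is least likely to choose,
    conditional on having seen C-sim-sim."""
    dist = choice_distribution(sim, sim_sim, eps)
    lowest = min(dist.values())
    candidates = [box for box in BOXES if dist[box] == lowest]
    # Assumption: the minimum is unique, so no tie-break is modelled here.
    if len(candidates) != 1:
        raise ValueError("tie between boxes; tie-break rule not modelled")
    return candidates[0]

def win_probability(eps: Fraction) -> Fraction:
    """Exact chance that C-act picks the filled box, averaging over
    uniform unlucky numbers for C-act, C-sim and C-sim-sim."""
    total = Fraction(0)
    for act in BOXES:
        for sim in BOXES:
            for sim_sim in BOXES:
                prize = omega_fills(sim, sim_sim, eps)
                total += choice_distribution(act, sim, eps)[prize]
    return total / N_BOXES**3
```

Under these assumptions the prize always lands in C-sim's unlucky box, and the enumeration gives 91/100 - epsilon, matching the "close to 91%" estimate in the thread.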