I’m finding the “counterfactual mugging” challenging. At this point, the rules of the game seem to be “design a thoughtless, inert, unthinking algorithm, such as CDT or EDT or BT or TDT, which will always give the winning answer.” Fine. But across the entire range of Newcomb’s problems, we are pitting this dumb-as-a-rock algorithm against a super-intelligence. By the time we get to the counterfactual mugging, we have a scenario where Omega is effectively saying “I will reward you only if you are a trusting rube who can be fleeced.” Now, if you are a trusting rube who can be fleeced, then you can be pumped, a la the pumping examples in previous sections: how many times will Omega ask you for $100 before you wise up and realize that you are being extorted?
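Here is a minimal sketch of that worry, assuming the usual payoffs ($100 demanded on tails, $10,000 paid on heads if Omega predicts you would have paid). An honest Omega makes always-paying profitable in expectation; a “pumping” mugger who simply reports tails every time drains the always-payer by $100 per encounter. The agent and payoff numbers here are illustrative, not part of the original post.

```python
"""Sketch: always-pay agent vs. an honest Omega and vs. a pumping mugger.
Assumes the standard $100 / $10,000 counterfactual-mugging payoffs."""
import random

ASK, REWARD, ROUNDS = 100, 10_000, 1_000

def run(omega_is_honest: bool) -> int:
    """Cumulative payoff for an agent that always pays when asked."""
    total = 0
    for _ in range(ROUNDS):
        heads = omega_is_honest and random.random() < 0.5
        if heads:
            total += REWARD   # Omega predicted we'd pay on tails, so it pays out
        else:
            total -= ASK      # "the coin came up tails, give me $100"
    return total

print("honest Omega:   ", run(True))    # roughly +4,950 per round on average
print("pumping 'Omega':", run(False))   # exactly -100 per round, forever
```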
This shift of focus to pumping also shows up in the Prisoner’s dilemma, specifically in the recent results from William Press & Freeman Dyson. They point out that an intelligent agent can extort any evolutionary algorithm. Basically, if you know the zero-determinant strategy and your opponent doesn’t, then you can mug the opponent (repeatedly). I think the same applies to the counterfactual mugging: Omega has a “theory of mind”, while the idiot decision algorithm fails to have one. If your decision algorithm tries to learn from history (i.e. from repeated muggings) using basic evolutionary algorithms, then it will continue to be mugged (forever): it can’t win.
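A minimal sketch of the Press & Dyson point, under some assumptions: the conditional cooperation probabilities (11/13, 1/2, 7/26, 0) are the extortion-factor-3 example quoted in the zero-determinant literature for the standard payoffs (R, S, T, P) = (3, 0, 5, 1), and the “evolutionary” opponent is a deliberately simple hill-climber on its own score, not any particular published learner. By optimizing selfishly, it walks straight into the extortioner’s preferred equilibrium.

```python
"""Sketch: a zero-determinant extortioner vs. a self-interested hill-climber."""
import random

R, S, T, P = 3, 0, 5, 1  # standard iterated Prisoner's dilemma payoffs

# Extortioner's probability of cooperating, conditioned on the previous
# round's outcome (extortioner's move, opponent's move).
ZD_EXTORT = {('C', 'C'): 11/13, ('C', 'D'): 1/2,
             ('D', 'C'): 7/26,  ('D', 'D'): 0.0}

def payoff(me, other):
    return {('C', 'C'): R, ('C', 'D'): S, ('D', 'C'): T, ('D', 'D'): P}[(me, other)]

def play_block(q, rounds=2000):
    """Average payoffs when the extortioner meets a memoryless opponent
    who cooperates with fixed probability q."""
    zd_move, opp_move = 'C', 'C'
    zd_total = opp_total = 0.0
    for _ in range(rounds):
        zd_next = 'C' if random.random() < ZD_EXTORT[(zd_move, opp_move)] else 'D'
        opp_next = 'C' if random.random() < q else 'D'
        zd_total += payoff(zd_next, opp_next)
        opp_total += payoff(opp_next, zd_next)
        zd_move, opp_move = zd_next, opp_next
    return zd_total / rounds, opp_total / rounds

# The "evolutionary" opponent: nudge q up or down, keep whichever change
# raised its own score.
q = 0.5
for _ in range(40):
    _, base_opp = play_block(q)
    trial_q = min(1.0, max(0.0, q + random.choice([-0.05, 0.05])))
    _, trial_opp = play_block(trial_q)
    if trial_opp > base_opp:
        q = trial_q

zd_score, opp_score = play_block(q)
print(f"opponent settles near q = {q:.2f}")
print(f"extortioner surplus over P: {zd_score - P:.2f}")
print(f"opponent surplus over P:    {opp_score - P:.2f}")
# With extortion factor 3, the extortioner's surplus ends up roughly
# three times the opponent's: the learner improves its own score only by
# handing the extortioner an even bigger gain.
```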
To borrow Press & Dyson’s vocabulary: if you want an algorithmic decision theory that can win in the presence of (super-)intelligences, then you must endow that algorithm with a “theory of mind”: your algorithm has to start modelling Omega, to determine what its actions will be.