The LW approach has focused on finding agent types that win on decision problems. Much of the work has gone into trying to formalize TDT/UDT, providing sketches of computer programs that implement these informal ideas. Having read a fair amount of the philosophy literature (including some of the recent work by Egan, Hare/Hedden and others), I think this agent/program approach has been extremely fruitful. It has not only given compelling solutions to a large number of problems in the literature (Newcomb’s problem, trivial coordination problems like Stag Hunt that CDT fails on, playing the PD against a selfish copy of yourself), but it has also elucidated the deep philosophical issues that the Newcomb problem dramatizes (concerning pre-commitment, free will vs. determinism, and uncertainty about purely a priori/logical questions). The focus on agents as programs has brought to light the intricate connection between decision making, computability and logic (especially Gödelian issues), something merely touched on in the philosophy literature.
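To make the ‘agents as programs’ framing concrete, here is a toy sketch (my own illustration, not any official TDT/UDT formalization) of Newcomb’s problem in which the predictor works by running the agent’s own program, so the payoff depends on the agent’s source code rather than on a single act considered in isolation:

```python
# Toy sketch (illustrative only, not any official TDT/UDT formalization) of the
# "agents as programs" framing on Newcomb's problem: the predictor fills the
# opaque box by running the agent's own program, so the payoff depends on the
# agent's source/policy, not just on a decision considered in isolation.

def predictor(agent):
    """Predict the agent's choice by simulating it."""
    return agent()  # "one-box" or "two-box"

def payoff(agent):
    opaque = 1_000_000 if predictor(agent) == "one-box" else 0
    choice = agent()
    return opaque if choice == "one-box" else opaque + 1_000

one_boxer = lambda: "one-box"
two_boxer = lambda: "two-box"

print(payoff(one_boxer))  # 1000000
print(payoff(two_boxer))  # 1000
```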
These successes provide sufficient reason to pursue the agent-centered approach (even if there were no compelling foundational argument that the ‘decision’-centered approach is incoherent). Similarly, I think there is no overwhelming foundational argument for Bayesian probability theory, but philosophers should study it because of its fruitfulness in illuminating many particular issues in the philosophy of science and the foundations of statistics (not to mention its success in practical machine learning and statistics).
This response may not be very satisfying, but I can only recommend the UDT posts (http://wiki.lesswrong.com/wiki/Updateless_decision_theory) and the recent MIRI paper (http://intelligence.org/files/RobustCooperation.pdf).
Rough arguments against the decision-centered approach:
Point 1
Suppose I win the lottery after playing 10 times. My decision about which numbers to pick in the last lottery was the cause of my winning money (whereas my previous number choices produced only disutility). But it’s not clear there’s anything interesting about this distinction. If I lost money on average, the important lesson is the failure of my agent type (i.e. the way my decision algorithm handles lottery problems).
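To put rough numbers on the lottery example (all figures made up for illustration: a $1 ticket with a one-in-a-million chance at a $500,000 prize), every individual ticket purchase has the same negative expected value, even though the last purchase happened to cause a large gain:

```python
# Made-up numbers: a $1 ticket with a 1-in-1,000,000 chance of a $500,000 prize.
ticket_cost = 1.0
p_win = 1e-6
prize = 500_000.0

# Every single ticket decision has the same negative expected value...
expected_value_per_ticket = p_win * prize - ticket_cost
print(expected_value_per_ticket)  # -0.5

# ...even though, after the fact, the 10th decision "caused" a big win.
realized = [-ticket_cost] * 9 + [prize - ticket_cost]
print(sum(realized))  # 499990.0
```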
And yet in many practical cases that humans face, it is very useful to look back at which decisions led to high utility. If we compare different algorithms playing casino games, or compare following the advice of a poker expert vs. a newbie, we’ll get useful information by looking at the utility caused by each decision. But this investigation of decisions that cause high utility is completely explainable from the agent-centered approach. When simulation and logical correlations between agents are not part of the problem, the optimal agent will make the decisions that cause the most utility. UDT/TDT and variants all (afaik) act like CDT in these simple decision problems. If we came upon a Newcomb problem without being told the setup (and without any familiarity with these decision theory puzzles), we would see that the CDTer’s decisions were causing utility and the EDTer’s decisions were not causing any utility. The EDTer would look like a lunatic with bizarrely good luck. Here we are following a local causal criterion in comparing actions. While usually fine, this criterion would clearly miss an important part of the story in the Newcomb problem.
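Here is a minimal sketch of that outside view, with the usual (assumed) Newcomb payoffs and a perfect predictor: the local causal ledger always credits taking both boxes with an extra $1,000, while the one-boxing agent type walks away far richer:

```python
# Assumed standard payoffs, perfect predictor. Compare the "utility caused by the
# final choice" (the local causal criterion) with the total utility of the agent type.

def newcomb(one_boxes):
    opaque = 1_000_000 if one_boxes else 0  # box is filled iff one-boxing is predicted
    total = opaque + (0 if one_boxes else 1_000)
    caused_by_final_choice = 0 if one_boxes else 1_000  # extra utility the act itself causes, box contents held fixed
    return total, caused_by_final_choice

print(newcomb(one_boxes=True))   # (1000000, 0)    looks like a lucky "lunatic" locally
print(newcomb(one_boxes=False))  # (1000, 1000)    locally optimal, globally poor
```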
Point 2
In AI, we want to build decision-making agents that win. In life, we want to improve our decision making so that we win. Thinking about the utility caused by individual decisions may be a useful subgoal in coming up with winning agents, but it is hard to see it as the central issue. The Newcomb problem (along with the counterfactual mugging and Parfit’s Hitchhiker) makes clear that a local, Markovian criterion (e.g. choose the action that will cause the highest utility, ignoring all previous actions and commitments) is inadequate for winning.
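A toy version of Parfit’s Hitchhiker, with payoffs I’ve assumed for illustration, shows the failure mode: an agent that re-optimizes locally at the payment step refuses to pay and so is never rescued, while a policy that honors the earlier commitment wins:

```python
# Toy Parfit's Hitchhiker with assumed payoffs. The driver rescues you only if she
# predicts you will pay $100 once you are safely in town. A purely local criterion
# says "paying now just causes -100", so the local agent reneges and is never rescued.

RESCUE_VALUE = 1_000_000  # value of not dying in the desert
PAYMENT = 100

def outcome(pays_once_safe):
    rescued = pays_once_safe  # the (accurate) driver conditions the rescue on the predicted payment
    if not rescued:
        return 0
    return RESCUE_VALUE - PAYMENT

print(outcome(pays_once_safe=False))  # 0       the locally "optimal" refusal
print(outcome(pays_once_safe=True))   # 999900  the committed policy
```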
Point 3
The UDT one-boxer’s agent type does not cause utility in the Newcomb problem. However, it does logically determine the utility. (More specifically, we could examine the one-boxing program as a formal system and try to isolate which rules/axioms lead to its one-boxing in this type of problem.) Similarly, if two people were using different sets of axioms (where one set is inconsistent), we might point to one of the axioms and say that its inclusion is what makes the system inconsistent. This is a mere sketch, but it might be possible to develop a local criterion by which “responsibility” for utility gains can be assigned to particular aspects of an agent.
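One crude way such a local ‘responsibility’ criterion might look (all rule names hypothetical): ablate one component of the agent at a time, re-run the problem, and attribute the change in utility to that component:

```python
# Crude ablation sketch (all rule names hypothetical): attribute utility to a rule by
# removing it and seeing how the agent's payoff on the problem changes.

def run_agent(rules, problem):
    """Return the agent's payoff on the problem given its active rules."""
    if problem == "newcomb":
        one_box = "respect_logical_correlation" in rules
        return 1_000_000 if one_box else 1_000
    raise ValueError(problem)

rules = {"maximize_expected_utility", "respect_logical_correlation"}
baseline = run_agent(rules, "newcomb")

for rule in sorted(rules):
    responsibility = baseline - run_agent(rules - {rule}, "newcomb")
    print(rule, responsibility)
# maximize_expected_utility 0
# respect_logical_correlation 999000
```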
It’s clear that we can learn about good agent types by examining particular decisions. We don’t always have to work with a fully specified program. (And we don’t have the code of any AI that can solve decision problems the way humans can.) So the more local approach may have some value.
Generally agree. I think there are good arguments for focusing on decision types rather than decisions. A few comments:
Point 1: That’s why the rationality of decisions is evaluated in terms of expected outcome, not actual outcome. So actually, it wasn’t just your agent type that was flawed here but also your individual decisions. But yes, I agree with the general point that agent type is important.
Point 2: Agree
Point 3: Yes. I agree that there could be ways other than causation to attribute utility to decisions, and that these ways might be superior. However, I also think the causal approach is one natural way to do this, so I think claims that the proponent of two-boxing doesn’t care about winning are false. I also think it’s false to say they have a twisted definition of winning. Their definition may be wrong, but it takes work to show that (I don’t think they are just obviously coming up with absurd definitions of winning).