There is definitely not an “objective” way to define “winning” in general, at least not yet. It’s more of an intuition pump that can help make it clearer in each example what the right thing to do is (so that we can judge whether a decision theory recommends doing the right thing). To add some context, prior to Eliezer’s innovation of asking what the “winning” choice is, philosophers used to ask what the “rational” choice is, which led some to say that the “rational” choice in Newcomb’s problem is to two-box, and to propose decision theories like CDT that two-box. Talking about “winning” makes it clearer or more intuitive that one-boxing is the right thing to do. But it’s not a panacea: often it’s not entirely obvious what the “winning” choice is, or different people disagree about it.
So hypothetically, there could be an undiscovered decision theory in which Prisoner A is tricked into cooperating, only for B to betray him, resulting in B achieving his optimal outcome of C,D. Wouldn’t such a result objectively count as B “winning”, since he got a higher payoff than he would have from C,C?
Yeah, I would say an agent or decision theory that can trick other agents into the C,D outcome would be “more winning”. From my perspective, we’re not so much “striving to generate algorithms that will guarantee a C,C result” as aiming for a decision theory that exploits everyone who can be exploited and achieves C,C with everyone else (to the extent possible).
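To make the payoff comparison concrete, here is a toy sketch (the agent names, payoff numbers, and the “inspect the opponent’s policy” setup are all illustrative assumptions on my part, not part of UDT itself): a one-shot Prisoner’s Dilemma where each player is a function of the opponent’s policy, and the “exploiter” reaches C,C with a copy of itself, takes the C,D payoff against anyone who would cooperate even against a defector, and otherwise defects.

```python
# Toy one-shot Prisoner's Dilemma with policy inspection (illustrative only).
PAYOFFS = {  # (my move, their move) -> my payoff, standard PD ordering
    ("C", "C"): 3, ("C", "D"): 0,
    ("D", "C"): 5, ("D", "D"): 1,
}

def always_cooperate(opponent):
    return "C"

def always_defect(opponent):
    return "D"

def exploiter(opponent):
    if opponent is exploiter:           # exact copy of myself: reach C,C
        return "C"
    if opponent(always_defect) == "C":  # cooperates even against a defector:
        return "D"                      # exploitable, so take the C,D payoff
    if opponent(always_cooperate) == "C":
        return "C"                      # conditional cooperator: reciprocate
    return "D"                          # otherwise defect in self-defense

def play(a, b):
    move_a, move_b = a(b), b(a)
    return PAYOFFS[(move_a, move_b)], PAYOFFS[(move_b, move_a)]

if __name__ == "__main__":
    print(play(exploiter, exploiter))         # (3, 3): C,C with a copy
    print(play(exploiter, always_cooperate))  # (5, 0): exploits the exploitable
    print(play(exploiter, always_defect))     # (1, 1): no exploitation possible
```

The point of the sketch is just that the exploiter’s payoff of 5 against the unconditional cooperator is strictly better for it than the 3 it gets from C,C, which is the sense in which such an agent would be “more winning”.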
Could you please provide a simple explanation of your UDT?