Error: Adding values from different utility functions.
See this comment.
Eliezer’s “arbitrary” strategy has the nice property that it gives both players more expected utility than the Nash equilibrium. Of course there are other strategies with this property, and indeed multiple strategies that are not themselves dominated in this way. It isn’t clear how ideally rational players would select one of these strategies or which one they would choose, but they should choose one of them.
Why not “P1: C, P2: Y”, which maximizes the sum of the two utilities, and is the optimal precommitment under the Rawlsian veil-of-ignorance prior?
If we multiply player 2′s utility function by 100, that shouldn’t change anything, because it is an affine transformation of a utility function. But then “P1: B, P2: Y” would maximize the sum. Adding values from different utility functions is a meaningless operation.
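To make the scaling point concrete, here is a minimal sketch with made-up payoff numbers (not the actual game matrix): rescaling player 2’s utility leaves player 2’s preferences unchanged but flips which outcome maximizes the sum.

```python
# Hypothetical (u1, u2) payoffs, chosen only to illustrate the point.
outcomes = {
    "P1: B, P2: Y": (2.0, 0.7),
    "P1: C, P2: Y": (6.0, 0.5),
}

def best_by_sum(scale_u2=1.0):
    """Outcome maximizing u1 + scale_u2 * u2."""
    return max(outcomes, key=lambda o: outcomes[o][0] + scale_u2 * outcomes[o][1])

print(best_by_sum())       # "P1: C, P2: Y" wins under the raw sum
print(best_by_sum(100.0))  # "P1: B, P2: Y" wins after rescaling player 2's utility
```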
The reason player 1 would choose B is not that it directly has a higher payout, but that including B in a mixed strategy gives player 2 an incentive to include Y in its own mixed strategy, increasing the expected payoff of C for player 1. The fact that A dominates B is irrelevant. The fact that A has better expected utility than the subgame with B and C indicates that player 1 not choosing A is somehow irrational, but that doesn’t give player 2 a useful way to exploit this irrationality. (And for this to make sense for player 1, player 1 would need a way to counter-exploit player 2’s exploit, and player 2 would need to attempt its exploit despite this possibility.)
The definition you linked to doesn’t say anything about entering a subgame not giving the players information, so no, I would not agree with that.
I would agree that if it gave player 2 useful information, that should influence the analysis of the subgame.
(I also don’t care very much whether we call this object, describing how the strategies play out given that player 1 doesn’t choose A, a “subgame”. I did not intend that technical definition when I used the term, but when you objected I checked carefully and it did seem to match; I thought there might be a good motivation for the definition, so a mismatch could have indicated a problem with my argument.)
I also disagree that player 1 not picking A provides useful information to player 2.
I’m sorry but “subgame” has a very specific definition in game theory which you are not being consistent with.
I just explained in detail how the subgame I described meets the definition you linked to. If you are going to disagree, you should be pointing to some aspect of the definition I am not meeting.
Also, intuitively when you are in a subgame you can ignore everything outside of the subgame, playing as if it didn’t exist. But when Player 2 moves he can’t ignore A because the fact that Player 1 could have picked A but did not provides insight into whether Player 1 picked B or C.
If it is somehow the case that giving player 2 info about player 1 is advantageous for player 1, then player 2 should just ignore the info, and everything still plays out as in my analysis. If it is advantageous for player 2, then it just strengthens the case that player 1 should choose A.
I am a game theorist.
I still think you are making a mistake, and should pay more attention to the object level discussion.
To see that it is indeed a subgame:
Represent the whole game with a tree whose root node represents player 1 choosing whether to play A (leading to a leaf node) or to enter the subgame at node S. Node S is the root of the subgame: it represents player 1’s choice to play B or C, each leading to a node representing player 2’s choice to play X or Y in that case, each of which leads to a leaf node.
Node S is the only node in its information set. The subgame contains all the descendants of S. The subgame contains all nodes in the same information set as any node in the subgame. It meets the criteria.
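For what it’s worth, here is a minimal sketch that encodes the tree above and checks the three conditions mechanically (the node names and information-set labels are mine, introduced just for the check; leaves carry no information-set label since no one moves there):

```python
children = {
    "root": ["A_leaf", "S"],      # player 1: play A, or enter the subgame at S
    "S": ["B_node", "C_node"],    # player 1: B or C
    "B_node": ["BX", "BY"],       # player 2: X or Y, after B
    "C_node": ["CX", "CY"],       # player 2: X or Y, after C
}
info_set = {
    "root": "P1_outer_move",
    "S": "P1_B_or_C",             # S is alone in its information set
    "B_node": "P2_move",          # player 2 cannot tell B from C,
    "C_node": "P2_move",          # so these two nodes share one information set
}

def descendants(node):
    """All nodes strictly below `node` in the tree."""
    out = set()
    for child in children.get(node, []):
        out |= {child} | descendants(child)
    return out

subgame = {"S"} | descendants("S")

# 1. The subgame's root is the only node in its information set.
print([n for n in info_set if info_set[n] == info_set["S"]] == ["S"])  # True
# 2. The subgame contains every descendant of its root.
print(descendants("S") <= subgame)                                     # True
# 3. It contains every node sharing an information set with a node in it.
print(all(m in subgame
          for n in subgame if n in info_set
          for m in info_set if info_set[m] == info_set[n]))            # True
```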
There is no uncertainty that screws up my argument. The whole point of talking about the subgame was to stop thinking about the possibility that player 1 chose A, because that had been observed not to happen. (Of course, I also argue that player 2 should be interested in logically causing player 1 not to have chosen A, but that gets beyond classical game theory.)
Classical game theory says that player 1 should choose A for expected utility 3, as this is better than the subgame of choosing between B and C, where the best player 1 can do against a classically rational player 2 is to play B with probability 1⁄3 and C with probability 2⁄3 (and player 2 plays X with probability 2⁄3 and Y with probability 1⁄3), for an expected value of 2.
But there are Pareto improvements available. Player 1’s classically optimal strategy gives player 1 expected utility 3 and player 2 expected utility 0. But suppose instead player 1 plays C, and player 2 plays X with probability 1⁄3 and Y with probability 2⁄3. Then the expected utility for player 1 is 4 and for player 2 it is 1⁄3. Of course, a classically rational player 2 would want to play X with greater probability, to increase its own expected utility at the expense of player 1. It would want to increase the probability beyond 1⁄2, which is the break-even point for player 1, but past that point player 1 would rather just play A.
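As a quick sanity check on the arithmetic: the two payoff values below are inferred from the expected utilities quoted above (they are not taken from the original payoff matrix), and the three player-1 figures come out mutually consistent.

```python
# Player 1's payoffs for playing C, inferred from the numbers above:
# EU 4 when player 2 plays X w.p. 1/3, and break-even EU 3 at X w.p. 1/2.
u1_CX, u1_CY = 0.0, 6.0

def eu1_of_C(p_x):
    """Player 1's expected utility from C when player 2 plays X with probability p_x."""
    return p_x * u1_CX + (1 - p_x) * u1_CY

print(eu1_of_C(2/3))  # 2.0 -> the subgame Nash value for player 1
print(eu1_of_C(1/3))  # 4.0 -> the proposed Pareto-improving point
print(eu1_of_C(1/2))  # 3.0 -> break-even with just playing A for utility 3
```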
So, what would two TDT/UDT players do in this game? Would they manage to find a point on the Pareto frontier, and if so, which point?
If you have trouble confronting people, you make a poor admin.
Can we please act like we actually know stuff about practical instrumental rationality given how human brains work, and not punish people for openly noticing their weaknesses.
You could have more constructively said something like “Thank you for taking on these responsibilities even though it sometimes makes you uncomfortable. I wonder if anyone else who is more comfortable with that would be willing to help out.”
I use whole life insurance. If you use term insurance, you should have a solid plan for an alternate funding source to replace your insurance at the end of the term.
I believe the Efficient Market Hypothesis is correct enough that reliably getting good results from buying term insurance and investing the premium difference would be a lot of work if possible at all.
See also The Valley of Bad Rationality.
Mere survival doesn’t sound all that great. Surviving in a way that is comforting is a very small target in the general space of survival.
By saying “clubs”, I communicate the message that my friend would be better off betting $1 on a random club than $2 on the seven of diamonds (or betting $1 on a random heart or spade), which is true, so I don’t really consider that lying.
If, less conveniently, my friend takes what I say to literally mean the suit of the top card, but I can still get them not to bet $2 on the wrong card, then I bite the bullet and lie.
And there’s a very real danger of this being a fully general counterargument against any sufficiently simple moral theory.
Establishing a lower bound on the complexity of a moral theory that has all the features we want seems like a reasonable thing to do. I don’t think the connotations of “fully general counterargument” are appropriate here. “Fully general” means you can apply it against a theory without really looking at the details of the theory. If you have to establish that the theory is sufficiently simple before applying the counterargument, you are referencing the details of the theory in a way that differentiates it from other theories, and the counterargument is not “fully general”.
and the number of possible models for T rounds is exponential in T
??? Here n is the number of other people betting. It’s a constant.
Within a single application of online learning, n is a constant, but that doesn’t mean we can’t look at the consequences of it having particular values, even values that vary with other parameters. But you seem to be agreeing with the main points: that if you use all possible models (or “super-people”) the regret bound is meaningless, and that in order to reduce the number of models enough for the bound to be meaningful, while still keeping a good model that is worth performing almost as well as, you need structural assumptions.
even if the “true hypothesis” isn’t in the family of models we consider
I agree you don’t need a model that is right every round, but you do need a model that is right in a lot of rounds. You don’t need a perfect model, but you need a model that is as correct as you want your end results to be.
maybe even adversarial data
I think truly adversarial data gives a result that is within the regret bounds, as guaranteed, but still uselessly inaccurate because the data is adversarial against the collection of models (unless the collection is so large you aren’t really bounding regret).
Everyone with half a brain could game them either to shorten their stay or to get picked as a leader candidate.
Maybe that’s the test.
Regarding myth 5 and online learning, I don’t think the average regret bound is as awesome as you claim. The bound is sqrt((log n) / T). But if there are really no structural assumptions, then you should be considering all possible models, and the number of possible models for T rounds is exponential in T, so the bound ends up being 1, which is the worst possible average regret under any strategy. With no assumptions of structure, there is no meaningful guarantee on the real accuracy of the method.
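As a rough illustration (the choice of one model per possible binary prediction sequence, i.e. n = 2^T, is mine, standing in for “all possible models”), the bound stops shrinking as T grows:

```python
import math

def average_regret_bound(log_n, T):
    """sqrt((log n) / T), taking log n directly to avoid huge intermediate integers."""
    return math.sqrt(log_n / T)

for T in (10, 100, 1000):
    # n = 2**T models, so log n = T * log 2
    print(T, average_regret_bound(T * math.log(2), T))
# ~0.83 for every T with the natural log (exactly 1 if the log is taken base 2):
# the guarantee never improves with more rounds.
```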
The thing that is awesome about the bound’s guarantee is that if you assume some structure, and choose a subset of possible models based on that structure, you know you get increased accuracy if your structural assumptions hold.
So this method doesn’t really avoid relying on structural assumptions, it just punts the question of which structural assumption to make to the choice of models to run the method over. This is pretty much the same as Bayesian methods putting the structural assumptions in the prior, and it seems that choosing a collection of models is an approximation of choosing a prior, though less powerful because instead of assigning models probabilities in a continuous range, it just either includes the model or doesn’t.
The obvious steelman of dialogue participant A would keep the coin hidden but ready to inspect, so that A can offer bets having credible ignorance of the outcomes and B isn’t justified in updating on A offering the bet.
Yvain says that people claim to be using one simple deontological rule “Don’t violate consent” when in fact they are using a complicated collection of rules of the form “Don’t violate consent in this specific domain” while not following other rules of that form.
And yet, you accuse him of strawmanning their argument to be simple.
“Rationality” seems to give different answers to the same problem when it is posed with different affine transformations of the players’ utility functions.