Researcher at the Center on Long-Term Risk. I (occasionally) write about altruism-relevant topics on my Substack. All opinions my own.
Anthony DiGiovanni
Without a clear definition of “winning,”
This is part of the problem we’re pointing out in the post. We’ve encountered claims of this “winning” flavor that haven’t been made precise, so we survey more precise candidates for what “winning” could mean, and argue that they’re inadequate for figuring out which norms of rationality to adopt.
The key claim is: You can’t evaluate which beliefs and decision theory to endorse just by asking “which ones perform the best?”, because the whole question is what it means to systematically perform better under uncertainty. Every operationalization of “systematically performing better” we’re aware of is either:
Incomplete — like “avoiding dominated strategies”, which leaves a lot unconstrained;
A poorly motivated proxy for the performance we actually care about — like “doing what’s worked in the past”; or
Secretly smuggling in nontrivial non-pragmatic assumptions — like “doing what’s worked in the past, not because that’s what we actually care about, but because past performance predicts future performance”.
This is what we meant to convey with this sentence: “On any way of making sense of those words, we end up either calling a very wide range of beliefs and decisions “rational”, or reifying an objective that has nothing to do with our terminal goals without some substantive assumptions.”
(I can’t tell from your comment whether you agree with all of that. If this was all obvious to you, great! But we’ve often had discussions where someone appealed to “which ones perform the best?” in a way that misses these points.)
Sorry this was confusing! From our definition here:
We’ll use “pragmatic principles” to refer to principles according to which belief-forming or decision-making procedures should “perform well” in some sense.
“Avoiding dominated strategies” is pragmatic because it directly evaluates a decision procedure or set of beliefs based on its performance. (People do sometimes apply pragmatic principles like this one directly to beliefs; see, e.g., this work on anthropics.)
Deference isn’t pragmatic, because the appropriateness of your beliefs is evaluated by how they relate to the beliefs of the person you’re deferring to. Someone could say, “You should defer because this tends to lead to good consequences,” but then they’re not applying deference directly as a principle — the underlying principle is “doing what’s worked in the past.”
at time 1 you’re in a strictly better epistemic position
Right, but 1-me has different incentives by virtue of this epistemic position. Conditional on being at the ATM, 1-me would be better off not paying the driver. (Yet 0-me is better off if the driver predicts that 1-me will pay, hence the incentive to commit.)
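To make the incentive structure concrete (illustrative numbers only, not from the original scenario): say being rescued is worth 100 and paying the driver costs 10. Then

$$u_1(\text{pay} \mid \text{at ATM}) = 90 < 100 = u_1(\text{not pay} \mid \text{at ATM}), \qquad u_0(\text{commit to pay}) = 90 > 0 = u_0(\text{don't commit}).$$

Same terminal values either way; the divergence comes entirely from what each time-slice knows and can still influence.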
I’m not sure if this is an instance of what you call “having different values” — if so I’d call that a confusing use of the phrase, and it doesn’t seem counterintuitive to me at all.
(I might not reply further because of how historically I’ve found people seem to simply have different bedrock intuitions about this, but who knows!)
I intrinsically only care about the real world (I find the Tegmark IV arguments against this pretty unconvincing). As far as I can tell, the standard justification for acting as if one cares about nonexistent worlds is diachronic norms of rationality. But I don’t see an independent motivation for diachronic norms, as I explain here. Given this, I think it would be a mistake to pretend my preferences are something other than what they actually are.
Thanks for clarifying!
covered under #1 in my list of open questions
To be clear, by “indexical values” in that context I assume you mean indexing on whether a given world is “real” vs “counterfactual,” not just indexical in the sense of being egoistic? (Because I think there are compelling reasons to reject UDT without being egoistic.)
I strongly agree with this, but I’m confused that this is your view given that you endorse UDT. Why do you think your future self will honor the commitment of following UDT, even in situations where your future self wouldn’t want to honor it (because following UDT is not ex interim optimal from his perspective)?
I’m afraid I don’t understand your point — could you please rephrase?
Linkpost: “Against dynamic consistency: Why not time-slice rationality?”
This got too long for a “quick take,” but also isn’t polished enough for a top-level post. So onto my blog it goes.
I’ve been skeptical for a while of updateless decision theory, diachronic Dutch books, and dynamic consistency as a rational requirement. I think Hedden’s (2015) notion of time-slice rationality nicely grounds the cluster of intuitions behind this skepticism.
“I’ll {take/lay} $100 at those odds, what’s our resolution mechanism?” is an excellent clarification mechanism
I think one reason this has fallen out of favor is that it seems to me to be a type error. Taking $100 at some odds is a (hypothetical) decision, not a belief. And the reason you’d be willing to take $100 at some odds is that your credence in the statement is such that taking the bet would be net-positive in expectation.
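To spell that out with a worked formula (generic stakes, purely for illustration): a bet that pays $b$ if the statement is true and loses your stake $s$ otherwise is net-positive in expectation exactly when your credence $p$ clears the odds-implied threshold:

$$p \cdot b - (1 - p) \cdot s > 0 \iff p > \frac{s}{b + s}.$$

The odds fix the threshold; whether you take (or lay) depends on which side of it your credence falls. The bet is downstream of the belief, not identical to it.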
I still feel like I don’t know what having a strict preference or permissibility means — is there some way to translate these things to actions?
As an aspiring rational agent, I’m faced with lots of options. What do I do? Ideally I’d like to just be able to say which option is “best” and do that. If I have a complete ordering over the expected utilities of the options, then clearly the best option is the expected utility-maximizing one. If I don’t have such a complete ordering, things are messier. I start by ruling out dominated options (as Maximality does). The options in the remaining set are all “permissible” in the sense that I haven’t yet found a reason to rule them out.
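Here’s a minimal sketch of that ruling-out step, with made-up numbers (rows are options, columns are expected utilities under the distributions I’m torn between); an option gets ruled out only if some other option does at least as well under every distribution and strictly better under at least one. The helper name and payoffs are just for illustration:

```python
import numpy as np

# Hypothetical expected utilities: EV[i, j] = EU of option i under distribution j.
EV = np.array([
    [5.0, 2.0],  # option 0
    [4.0, 3.0],  # option 1
    [3.0, 1.0],  # option 2 (dominated by option 0)
])

def maximality_permissible(EV):
    """Indices of options not dominated by any other option."""
    permissible = []
    for i in range(len(EV)):
        dominated = any(
            np.all(EV[j] >= EV[i]) and np.any(EV[j] > EV[i])
            for j in range(len(EV)) if j != i
        )
        if not dominated:
            permissible.append(i)
    return permissible

print(maximality_permissible(EV))  # [0, 1]: both undominated options stay permissible
```

Neither of options 0 and 1 beats the other under both distributions, so both stay in the permissible set; that’s exactly the point where further principles (or more deliberation) have to take over.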
I do of course need to choose an action eventually. But I have some decision-theoretic uncertainty. So, given the time to do so, I want to deliberate about which ways of narrowing down this set of options further seem most reasonable (i.e., satisfy principles of rational choice I find compelling).
(Basically I think EU maximization is a special case of “narrow down the permissible set as much as you can via principles of rational choice,[1] then just pick something from whatever remains.” It’s so straightforward in this case that we don’t even recognize we’re identifying a (singleton) “permissible set.”)
Now, maybe you’d just want to model this situation like: “For embedded agents, ‘deliberation’ is just an option like any other. Your revealed strict preference is to deliberate about rational choice.” I might be fine with this model.[2] But:
For the purposes of discussing how {the VOI of deliberation about rational choice} compares to {the value of going with our current “best guess” in some sense}, I find it conceptually helpful to think of “choosing to deliberate about rational choice” as qualitatively different from other choices.
The procedure I use to decide to deliberate about rational choice principles is not “I maximize EV w.r.t. some beliefs,” it’s “I see that my permissible set is not a singleton, I want more action-guidance, so I look for more action-guidance.”
It seems to me like you were like: “why not regiment one’s thinking xyz-ly?” (in your original question), to which I was like “if one regiments one thinking xyz-ly, then it’s an utter disaster” (in that bullet point), and now you’re like “even if it’s an utter disaster, I don’t care
My claim is that your notion of “utter disaster” presumes that a consequentialist under deep uncertainty has some sense of what to do, such that they don’t consider ~everything permissible. This begs the question against severe imprecision. I don’t really see why we should expect our pretheoretic intuitions about the verdicts of a value system as weird as impartial longtermist consequentialism, under uncertainty as severe as ours, to be a guide to our epistemics.
I agree that intuitively it’s a very strange and disturbing verdict that ~everything is permissible! But that seems to be the fault of impartial longtermist consequentialism, not imprecise beliefs.
The branch that’s about sequential decision-making, you mean? I’m unconvinced by this too, see e.g. here — I’d appreciate more explicit arguments for this being “nonsense.”
my response: “The arrow does not point toward most Sun prayer decision rules. In fact, it only points toward the ones that are secretly bayesian expected utility maximization. Anyway, I feel like this does very little to address my original point that there is this big red arrow pointing toward bayesian expected utility maximization and no big red arrow pointing toward Sun prayer decision rules.”
I don’t really understand your point, sorry. “Big red arrows towards X” are only a problem for doing Y if (1) they tell me that doing Y is inconsistent with doing [the form of X that’s necessary to avoid leaving value on the table]. And these arrows aren’t action-guiding for me unless (2) they tell me which particular variant of X to do. I’ve argued that there is no sense in which either (1) or (2) is true. Further, I think there are various big green arrows towards Y, as sketched in the SEP article and Mogensen paper I linked in the OP, though I understand if these aren’t fully satisfying positive arguments. (I tentatively plan to write such positive arguments up elsewhere.)
I’m just not swayed by vibes-level “arrows” if there isn’t an argument that my approach is leaving value on the table by my lights, or that you have a particular approach that doesn’t do so.
Addendum: The approach I take in “Ex ante sure losses are irrelevant if you never actually occupy the ex ante perspective” has precedent in Hedden (2015)’s defense of “time-slice rationality,” which I highly recommend. Relevant quote:
I am unmoved by the Diachronic Dutch Book Argument, whether for Conditionalization or for Reflection. This is because from the perspective of Time-Slice Rationality, it is question-begging. It is uncontroversial that collections of distinct agents can act in a way that predictably produces a mutually disadvantageous outcome without there being any irrationality. The defender of the Diachronic Dutch Book Argument must assume that this cannot happen with collections of time-slices of the same agent; if a collection of time-slices of the same agent predictably produces a disadvantageous outcome, there is ipso facto something irrational going on. Needless to say, this assumption will not be granted by the defender of Time-Slice Rationality, who thinks that the relationship between time-slices of the same agent is not importantly different, for purposes of rational evaluation, from the relationship between time-slices of distinct agents.
I reject the premise that my beliefs are equivalent to my betting odds. My betting odds are a decision, which I derive from my beliefs.
It’s not that I “find it unlikely on priors” — I’m literally asking what your prior on the proposition I mentioned is, and why you endorse that prior. If you answered that, I could answer why I’m skeptical that that prior really is the unique representation of your state of knowledge. (It might well be the unique representation of the most-salient-to-you intuitions about the proposition, but that’s not your state of knowledge.) I don’t know what further positive argument you’re looking for.
really ridiculously strong claim
What’s your prior that in 1000 years, an Earth-originating superintelligence will be aligned to object-level values close to those of humans alive today [for whatever operationalization of “object-level” or “close” you like]? And why do you think that prior is the unique accurate representation of your state of knowledge? Seems to me like the view that a single prior does accurately represent your state of knowledge is the strong claim. I don’t see how the rest of your comment answers this.
(Maybe you have in mind a very different conception of “represent” or “state of knowledge” than I do.)
And indeed, it is easy to come up with a case where the action that gets chosen is not best according to any distribution in your set of distributions: let there be one action which is uniformly fine and also for each distribution in the set, let there be an action which is great according to that distribution and disastrous according to every other distribution; the uniformly fine action gets selected, but this isn’t EV max for any distribution in your representor.
Oops sorry, my claim had the implicit assumptions that (1) your representor includes all the convex combinations, and (2) you can use mixed strategies. ((2) is standard in decision theory, and I think (1) is a reasonable assumption — if I feel clueless as to how much I endorse distribution p vs distribution q, it seems weird for me to still be confident that I don’t endorse a mixture of the two.)
If those assumptions hold, I think you can show that the max-regret-minimizing action maximizes EV w.r.t. some distribution in your representor. I don’t have a proof on hand but would welcome counterexamples. In your example, you can check that either the uniformly fine action does best on a mixture distribution, or a mix of the other actions does best (lmk if spelling this out would be helpful).
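Here’s a minimal numerical check on the example above (payoff numbers made up for illustration; rows are actions, columns are the distributions in the representor):

```python
import numpy as np

# EV[i, j] = expected utility of action i under distribution j in the representor.
EV = np.array([
    [  1.0,   1.0,   1.0],  # a0: uniformly fine
    [ 10.0, -10.0, -10.0],  # a1: great under p1, disastrous under the others
    [-10.0,  10.0, -10.0],  # a2
    [-10.0, -10.0,  10.0],  # a3
])

# Max regret of each pure action across the distributions.
regret = EV.max(axis=0) - EV
print(regret.max(axis=1))  # [ 9. 20. 20. 20.]: a0 minimizes max regret
# (Mixing a1-a3 equally doesn't help: EV is -10/3 under every p_i, so max regret is 10 + 10/3 > 9.)

# And a0 is the EV-maximizer under a convex combination of the p_i's,
# namely the uniform mixture q = (p1 + p2 + p3)/3:
q = np.full(3, 1/3)
print(EV @ q)  # [ 1.   -3.33 -3.33 -3.33]: a0 comes out on top
```

So in this example the regret-minimizing choice is also EV-maximizing for the uniform mixture, consistent with the claim (given assumptions (1) and (2)).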
Adding to Jesse’s comment, the “We’ve often heard things along the lines of...” line refers both to personal communications and to various comments we’ve seen, e.g.:
[link]: “Since this intuition leads to the (surely false) conclusion that a rational beneficent agent might just as well support the For Malaria Foundation as the Against Malaria Foundation, it seems to me that we have very good reason to reject that theoretical intuition”
[link]: “including a few mildly stubborn credence functions in some judiciously chosen representors can entail effective altruism from the longtermist perspective is a fool’s errand. Yet this seems false”
[link]: “I think that if you try to get any meaningful mileage out of the maximality rule … basically everything becomes permissible, which seems highly undesirable”
(Also, as we point out in the post, this is only true insofar as you use maximality alone, applied to total consequences. You can still regard obviously evil things as unacceptable on non-consequentialist grounds, for example.)