What’s the point of utility functions if you can’t even in principle know their value for the universe you’re actually in? Utility functions are supposed to guide decisions. A utility function that can’t be approximated, even a little, even with infinite computing power, can’t be linked to a decision theory or used in any other way.
I’m generally inclined to agree with you, because a lot of issues come up with anthropics, but in order to drop the matter altogether you would need to genuinely dissolve the question.
The steelman response to your point is this:
For each possible strategy you could choose, you can evaluate the probability distribution over which “you” you actually are. You can then evaluate the utility conditional on each possible self, and compute the expected value over that distribution of selves.
As such, it is clearly possible to approximate and calculate such utility functions, and to use them to make decisions.
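To make the mechanics concrete, here is a minimal sketch of that procedure. Everything in it (the strategy names, probabilities, and utilities) is an invented placeholder, not anything from the post; only the shape of the computation matters.

```python
# A minimal sketch of the steelman procedure. Every number and name here is
# an invented placeholder for illustration; nothing is taken from the post.

# For each candidate strategy: a distribution over which "self" I might turn
# out to be, paired with the utility conditional on being that self.
anthropic_table = {
    "strategy_1": [  # (P(I am this self | strategy), U(outcome | this self))
        (0.5, 2.0),
        (0.5, 1.0),
    ],
    "strategy_2": [
        (1.0, 1.4),
    ],
}

def expected_utility(strategy):
    """Expectation of the conditional utilities over the distribution of selves."""
    return sum(p * u for p, u in anthropic_table[strategy])

best = max(anthropic_table, key=expected_utility)
print({s: expected_utility(s) for s in anthropic_table}, "->", best)
```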
The question is not whether you can do the calculations; the question is whether those calculations correspond to something meaningful.
A simpler version of the original post is this. Let there be a single, consistent utility function shared by all copies of the agent (X and all Xi). It assigns these utility values:
1. X chooses “sim”, and then N instances of Xi choose “sim” and 1000-N instances choose “don’t sim” → 1.0 + 0.2N + 0.1(1000-N)
2. X chooses “don’t sim”, and no Xi gets created → 0.9
Of course, the post’s premise is that the only actually possible universe in case 1 is the one where all 1000 Xi instances choose “sim” (because they can’t tell whether they’re in the simulation or not), so the total utility is then 1 + 0.2*1000 = 201.
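To check the arithmetic, here is a small sketch of the two reachable outcomes; the utilities are the ones from the simplified setup above, while the function name and layout are my own.

```python
# Checking the arithmetic above. The utilities are the ones from the
# simplified setup; the function name and structure are mine.

def utility_if_sim(n_sim, total_copies=1000):
    # X chooses "sim"; n_sim copies choose "sim", the rest choose "don't sim".
    return 1.0 + 0.2 * n_sim + 0.1 * (total_copies - n_sim)

utility_if_dont_sim = 0.9  # X chooses "don't sim"; no copies are created.

# Per the premise, the only reachable case under "sim" is n_sim = 1000.
print(utility_if_sim(1000))    # 201.0
print(utility_if_dont_sim)     # 0.9
```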
This is a simple demonstration of TDT giving the answer that maximizes utility (“sim”) while CDT doesn’t (I think?).
What didn’t make sense to me was saying X and Xi somehow have “different” utility functions. Maybe this was just confusion generated by imprecise use of words, and not any real difference.
The post then says:
For every agent it is true that she does not gain anything from the utility of another agent despite the fact she and the other agents are identical!
I’m not sure if this is intended to change the situation. Once you have a utility function that gives out actual numbers, you don’t care how it works on the inside, or whether it takes into account another agent’s utility or anything else.
The idea is that they have the same utility function, but the utility function takes values over anthropic states (values of “I”).
U(I am X and X chooses sim) = 1
U(I am Xi and Xi chooses sim) = 0.2 etc.
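One way to picture that, as a rough sketch rather than anything from the post, is a utility function keyed on anthropic states, i.e. on who “I” am together with what that agent chooses; only the two values above are given, everything else is left open.

```python
# One possible representation, keyed on (who I am, what that agent chooses).
# Only the two values quoted above are given; anything else is left open.

anthropic_utility = {
    ("I am X",  "sim"): 1.0,
    ("I am Xi", "sim"): 0.2,
    # ("I am X", "don't sim"): ...,  # not specified in the comment
}

print(anthropic_utility[("I am Xi", "sim")])  # 0.2
```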
I don’t like it, but I also don’t see an obvious way to reject the idea.