It seems to me that, with the bet you describe as well, there is no paradox/inconsistency.
To make sure we’re talking about the same thing: The bet I’m considering is:
each agent in each room is separately/independently given the option to bet that the coin came up heads. If the agent says “yes, heads”, and coin=1, that one agent wins 1 utilon. If the agent says “yes, heads”, but coin=0, that one agent loses 3 utilons.
One likely source of confusion that I see here: thinking about {what the agent cares about} in terms of “I”, “me”, “this agent”, or other such concepts which correspond poorly to the Territory (in this kind of dilemma/situation).
To properly deconfuse that, I recommend Tabooing “this”, “I”, “me”, “you”, etc. (A trick I found useful for that was to consider variations of the original dilemma, where the agents additionally have numbers tattooed on them; either numbers from 1 to 20, or random UUIDs, or etc.; either visible to the agent or not. Then one can formulate the agent’s utility function in terms of “agent with number N tattooed on it”, instead of e.g. “instances of me”.)
For brevity, below I do use “this” and “I” and etc. Hopefully enough of the idea still comes through to be useful.
If what the agent cares about is something like “utilons gained in total, by computations/agents that are similar to the original agent”, then:
Before the experiment: The agent would want agents in green rooms to accept the bet, and agents in red rooms to reject the bet.
Upon waking up in a green room: The agent has received no information which would allow it to distinguish between coin-flip-outcomes, and its probability for coin=1 is still 50⁄50. I.e., the agent is in practically the same situation as before the experiment, and so its answer is still the same: accept the bet. (And conversely if in a red room.)
The above seems consistent/paradox-free to me.(?)
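To make the arithmetic behind this case explicit, here is a minimal sketch. It assumes the room counts from EY’s original setup (heads: 18 green rooms and 2 red; tails: the reverse), which is also what the 90% figure mentioned further down corresponds to, plus the +1/-3 payoffs of the bet described above.

```python
# A minimal sketch of the "cares about total utilons across all copies" case.
# Assumed setup (from EY's original post): 20 copies; heads -> 18 green rooms
# + 2 red rooms, tails -> 2 green + 18 red.
# Payoff per accepting agent: +1 if heads, -3 if tails.

P_HEADS = 0.5
ROOMS = {"heads": {"green": 18, "red": 2},
         "tails": {"green": 2, "red": 18}}
PAYOFF = {"heads": 1, "tails": -3}

def expected_total_utility(accept_in_green: bool, accept_in_red: bool) -> float:
    """Expected utilons summed over all 20 copies for a color-conditional policy."""
    total = 0.0
    for outcome, p in (("heads", P_HEADS), ("tails", 1 - P_HEADS)):
        accepters = (ROOMS[outcome]["green"] * accept_in_green
                     + ROOMS[outcome]["red"] * accept_in_red)
        total += p * accepters * PAYOFF[outcome]
    return total

for g in (True, False):
    for r in (True, False):
        print(f"accept_in_green={g!s:5}, accept_in_red={r!s:5}: "
              f"E[total utilons] = {expected_total_utility(g, r):+.1f}")

# Best policy: accept in green, reject in red (+6.0). The pre-experiment agent
# and the in-room agent (whose credence is still 50/50) are evaluating the same
# policy over the same set of copies, so they give the same answer.
```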
If what the agent cares about is something like “utilons gained, by this particular blob of atoms, and the temporal sort-of-continuation of it, as usually understood by e.g. humans”, then:
Before the experiment: The original description of the dilemma leaves unclear what happens to the original blob of atoms, but here I’ll assume that the original blob-of-atoms is destroyed or somehow split 20 ways. In that case, before the experiment, the agent would not care at all how the other copy-agents bet. They’re not the same blob-of-atoms, after all. The pre-experiment agent’s preferences w.r.t. betting are undefined.
Upon waking up in a green room: the agent’s probability distribution over coin=1 is still 50⁄50. And it is a new blob-of-atoms, not the original pre-experiment blob-of-atoms. It doesn’t care at all how other blobs-of-atoms fare. It rejects the bet.
To the extent that there’s an inconsistency between the pre-experiment agent and in-room agents, I think it’s due to them being different agents with different utility functions. So it doesn’t seem accurate to say that either agent’s individual preferences are inconsistent?
A likely objection I anticipate to the above is something like:
“{The total number of utilons (summed over agents) gained by agents using the above probability-update- and decision-algorithm} is less than {the total number of utilons gained by agents that would update their probability for coin=1 to 90% upon seeing green}.”
To which I think Blob-of-atoms #N would respond:
“Yes, but Blob #N does not care that Other-blobs-of-atoms-using-the-same-reasoning-algorithm tend to fare poorly. Blob #N isn’t trying to cooperate with near-copies of itself. So what if other blobs-of-atoms don’t gain utility? That argument doesn’t really weigh against Blob #N’s reasoning-algorithm, does it?”
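For concreteness, the numbers behind this anticipated exchange, under the same assumed 18/2 room split as in the sketch above:

```python
# The numbers behind the anticipated objection and Blob #N's reply, under the
# same assumed setup (heads -> 18 green / 2 red, tails -> 2 green / 18 red).

def ev_accept(p_heads: float) -> float:
    """One agent's expected utilons from accepting, at its own credence."""
    return p_heads * 1 + (1 - p_heads) * (-3)

# Blob #N in a green room, credence still 50/50:
print(f"{ev_accept(0.5):+.1f}")   # -1.0, so it declines and gets 0 instead

# The objection compares populations. Summed over copies, averaged over the coin:
decline_population = 0.0                                         # nobody ever bets
accept_in_green_population = 0.5 * (18 * 1) + 0.5 * (2 * (-3))   # e.g. 90%-updaters
print(f"{decline_population:+.1f} vs {accept_in_green_population:+.1f}")  # +0.0 vs +6.0

# So yes, the accept-in-green population does better in total (+6 vs 0); and
# Blob #N's reply is that this total is simply not the quantity it is maximizing.
```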
Does this make sense to you? If not, what seems (most) wrong/confused?
A couple of things.
If you’re ok with time-inconsistent probabilities then you can be Dutch-booked.
I think of identity in terms of expectations. Right before you go to sleep, you have a rational subjective expectation of “waking up” with any number from 1-20 with a 5% probability.
It’s not clear how the utility function in your first case says to accept the bet given that you have the probability as 50⁄50. You can’t be maximizing utility, have that probability, and accept the bet—that’s just not what maximizes utility under those assumptions.
My version of the bet shouldn’t depend on if you care about other agents or not, because the bet doesn’t affect other agents.
If you’re ok with time-inconsistent probabilities then you can be Dutch-booked.
Sure. Has some part of what I’ve written given the impression that I think time-inconsistent probabilities (or preferences) are OK?
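For reference, the generic money pump behind “time-inconsistent probabilities can be Dutch-booked” is easy to sketch (a textbook illustration with made-up prices, not a construction specific to this anthropic setup):

```python
# A generic sketch of the money pump: an agent whose credence in H predictably
# moves from 0.5 to 0.9 with no new information, and who trades $1-if-H tickets
# at its current credence, can be bought low and sold high.

def agent_net(outcome_H: bool) -> float:
    cash, tickets = 0.0, 0   # the agent's cash and net $1-if-H tickets held
    # t0: at credence 0.5 the agent values a ticket at $0.50, so it happily sells one for $0.55.
    cash += 0.55
    tickets -= 1
    # t1: at credence 0.9 it values the ticket at $0.90, so it happily buys it back for $0.85.
    cash -= 0.85
    tickets += 1
    # Settlement: each ticket held pays $1 if H; the agent ends up holding none.
    cash += tickets * (1.0 if outcome_H else 0.0)
    return cash

print(f"H: {agent_net(True):+.2f}   not-H: {agent_net(False):+.2f}")   # -0.30 either way
```

Each trade looks favorable to the agent at its then-current credence (selling above $0.50, buying back below $0.90), yet the $0.30 loss is guaranteed whatever the coin does.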
I think of identity in terms of expectations. [...]
I want to give a thumbs-up to the policy of sharing ways-of-thinking-about-stuff. (Albeit that I think I see how that particular way of thinking about this stuff is probably confused. I’m still suggesting Tabooing “I”, “me”, “you”, “[me] waking up in …”, etc.) Thanks.
It’s not clear how the utility function in your first case says to accept the bet given that [...]
True, that part of what I wrote glossed over a large bunch of details (which may well be hiding confusion on my part). To try to quickly unpack that a bit:
In the given scenario, each agent cares about all similar agents.
Pretending to be a Solomonoff inductor, and updating on all available information/observations—without mapping low-level observations into confused nonsense like “I/me is observing X”—an agent in a green room ends up with p(coin=1) = 0.5.
The agent’s model of reality includes a model of {the agent itself, minus the agent’s model of itself (to avoid infinite recursion)}.
Looking at that model from a bird’s-eye view, the agent searches for an action a that would maximize ∑_{w∈W} (utility received by xeroxed agents in the version of w where this agent outputs a), where W is the set of “possible” worlds. (I.e., W is the set of worlds that are consistent with what has been observed thus far.) (We’re not bothering to weight the summed terms by p(w), because here all w are equiprobable.)
According to the agent’s model, all in-room-agents are running the same decision-algorithm, and thus all agents observing the same color output the same decision. This constrains what W can contain. In particular, it only contains worlds w where if this agent is outputting a, then also all other agents (in rooms of the same color) are also outputting a.
The agent’s available actions are “accept bet” and “decline bet”. When the agent considers those worlds where it (and thus all other agents-in-green) outputs “accept bet”, it calculates the total utility gained by xeroxed agents to be higher than in those worlds where it outputs “decline bet”.
The agent outputs “accept bet”.
If the above is not “maximizing utility”, then I’m confused about what (you mean by) “maximizing utility”. Did this clarify anything?
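Here is a rough sketch of that procedure in code, as run from inside a green room. The 18/2 room counts are again an assumption carried over from EY’s setup; the red-room agents’ (correlated) choice is left out because it doesn’t depend on what the green-room agents output.

```python
# A rough sketch of the procedure above, run from inside a green room.
# Assumptions as before: 20 exact copies; heads -> 18 green / 2 red,
# tails -> 2 green / 18 red; an accepting agent gains +1 if heads, -3 if tails.
# Both remaining "possible" worlds are consistent with seeing green and are
# treated as equiprobable, so the sum over W is left unweighted.

GREEN_COUNT = {"heads": 18, "tails": 2}
PAYOFF = {"heads": 1, "tails": -3}
POSSIBLE_WORLDS = ("heads", "tails")   # W, after observing green

def total_utility_to_copies(world: str, green_action: str) -> float:
    """Utilons summed over copies in `world`, given that every green-room agent
    (all running this same algorithm) outputs `green_action`. The red-room
    agents' output doesn't depend on this choice, so their (constant)
    contribution is omitted from the comparison."""
    if green_action == "accept":
        return GREEN_COUNT[world] * PAYOFF[world]
    return 0.0

def choose_action() -> str:
    scores = {a: sum(total_utility_to_copies(w, a) for w in POSSIBLE_WORLDS)
              for a in ("accept", "decline")}
    return max(scores, key=scores.get)   # accept: 18*1 + 2*(-3) = +12 beats 0

print(choose_action())   # "accept"
```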
My version of the bet shouldn’t depend on if you care about other agents or not, because the bet doesn’t affect other agents.
It’s true that (if the rooms are appropriately sealed off from each other) the blobs-of-atoms in different rooms cannot causally affect each other. But given knowledge that all agents are exact copies of each other, the set of “possible” worlds is constrained to contain only {worlds where all agents (in rooms of the same color) output the same decision}. (I’m thinking very loosely in terms of something like Solomonoff induction here.) Thus it seems to me that {operating/deciding as if agents in other rooms “could” decide something different from each other} is like operating with the wrong set of “possible” worlds; i.e. like doing something wrong relative to Solomonoff induction, and/or having an incorrect model of reality.
Maybe: try Tabooing the word “affect”?
I’ve spent a lot of time and written a handful of posts (including one on the interaction between Solomonoff and SIA) building my ontology. Parts may be mistaken but I don’t believe it’s “confused”. Tabooing core concepts will just make it more tedious to explain, probably with no real benefit.
In particular, the only actual observations anyone has are of the form “I have observed X”, and that needs to be the input into Solomonoff. You can’t input a bird’s eye view because you don’t have one.
Anyway, it seems weird that being altruistic affects the agent’s decision as to a purely local bet. You end up with the same answer as me on that question, acting “as if” the probability was 90%, but in a convoluted manner.
Maybe you should taboo probability. What does it mean to say that the probability is 50%, if not that you’ll accept purely local bets with better odds and not worse odds? The only purpose of probability in my ontology is for predictions for betting purposes (or decision making purposes that map onto that). Maybe it is your notion of probability that is confused.
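For concreteness, here is the betting-odds reading of probability applied to this particular bet (a minimal sketch; the 0.75 threshold is just a consequence of the 1:3 payoffs):

```python
# The betting-odds reading of "probability", applied to this particular 1:3 bet
# (win 1 utilon if heads, lose 3 if tails).

def accepts(p_heads: float, win: float = 1.0, loss: float = 3.0) -> bool:
    """Accept a purely local bet iff it has positive expected value at credence p_heads."""
    return p_heads * win - (1 - p_heads) * loss > 0

print(accepts(0.5))   # False: a 50% agent declines a 1:3 bet on heads
print(accepts(0.9))   # True:  a 90% agent accepts it
# Break-even credence for these payoffs: p*1 = (1-p)*3  =>  p = 0.75. So on this
# one bet, "acting as if the probability were 90%" is indistinguishable from
# acting with any credence above 0.75.
```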
Thanks for the suggestions. Clearly there’s still a lot of potentially fruitful disagreement here, some of it possibly mineable for insights; but I’m going to put this stuff on the shelf for now. Anyway, thanks.
Update: Upon considering the situation where
each blob-of-atoms cares only about itself (not about similar/copied agents),
and the original blob-of-atoms is inserted (e.g. uniformly at random) into one of the rooms (along with 19 copy-agents),
it seems that there is in fact a temporal inconsistency: Before the experiment, original-blob would want all agents (including original-blob) in green rooms to accept the bet, but upon waking up and observing green, original-blob would reject the bet. Will update the post to reflect this.
In general it’s not necessary for each blob-of-atoms to care only about itself. It’s enough to have any distinction at all in utility of outcomes between itself and other similar blobs-of-atoms. Caring only about itself is just one of the more extreme examples.
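A quick sketch of the inconsistency noted in this update, assuming as before the 18/2 room split, the +1/-3 bet, and an original blob that is placed uniformly at random among the 20 rooms and counts only its own utilons:

```python
# A sketch of the temporal inconsistency for the selfish original blob,
# assuming heads -> 18 green / 2 red, tails -> 2 green / 18 red.

P_HEADS = 0.5
P_GREEN = {"heads": 18 / 20, "tails": 2 / 20}   # chance the original lands in a green room
PAYOFF = {"heads": 1, "tails": -3}

# Before the experiment: value (to the original blob alone) of committing to
# the policy "accept iff I wake up in a green room".
ev_commit = sum(p * P_GREEN[w] * PAYOFF[w]
                for w, p in (("heads", P_HEADS), ("tails", 1 - P_HEADS)))
print(f"{ev_commit:+.2f}")       # +0.30 > 0: the pre-experiment blob endorses accepting in green

# After waking in green, if its credence in heads is still 0.5 and it counts
# only its own utilons:
ev_accept_now = 0.5 * PAYOFF["heads"] + 0.5 * PAYOFF["tails"]
print(f"{ev_accept_now:+.1f}")   # -1.0 < 0: it now rejects -- the temporal inconsistency
```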
You can start with Bostrom’s book on anthropic bias. https://www.anthropic-principle.com/q=book/table_of_contents/
The bet is just that each agent is independently offered a 1:3 deal. There’s no dependence as in EY’s post.