Anthropic undeath by definition begins when your sensory experience ends. If you end up in an afterlife, the anthropic undeath doesn’t begin until the real afterlife ends. That’s because anthropic undeath is a theoretical construct I defined, and that’s how I defined it.
So why does this count as anything other than simple death? Are you assuming it is a kind of cut-off state of existence in which you still have self-awareness, but only that? That would still leave room for utility from internal states.
I probably should’ve expanded on this more in the post, so let me explain.
“Anthropic shadow”, if it were to exist, seems like it should be a general principle of how agents should reason, separate from how they are “implemented”.
Abstractly, an agent is just a tree of decisions; it's basically game theory. We might borrow the word "death" for the end of the game, but this is just an analogy. For example, a reinforcement learning agent "dies" when the training episode is over, even though its source code and parameters still exist. It is "dead" in the sense that it isn't planning its actions past that horizon. This is where the anthropic shadow would apply, if it really were this abstract.
But the idea of the "anthropically undead" shows that the actual point of "death" is arbitrary: we can construct a game with identical utility in which the agent never "dies". So if the only thing the agent cares about is utility, the agent should reason as if there were no anthropic shadow. And this further suggests that the anthropic shadow must have been flawed in the first place; good reasoning principles should hold up under reflection.
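To illustrate that construction, here is a minimal sketch with made-up numbers (a single risky action with survival probability 0.8 and reward 10, chosen purely for illustration): the same gamble is modelled once with a terminal "death" node and once with a non-terminal, zero-reward "undead" state, and the expected utilities coincide.

```python
# Toy check of the "identical utility, no death" construction; the survival
# probability and reward below are made-up numbers for illustration.

p_survive = 0.8   # chance the risky action succeeds
reward = 10.0     # payoff on success

# Version A: losing ends the game (a terminal "death" node with payoff 0).
eu_mortal = p_survive * reward + (1 - p_survive) * 0.0

# Version B: losing sends the agent to an absorbing "anthropically undead"
# state that never terminates but pays 0 forever; the infinite tail is
# truncated here just for the numeric check, since it contributes nothing.
undead_tail = sum(0.0 for _ in range(1000))
eu_undead = p_survive * reward + (1 - p_survive) * undead_tail

assert eu_mortal == eu_undead   # same utility, so the point of "death" does no work
```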
I'm still not convinced. Suppose an agent has to play a game: two rounds of Russian roulette with a revolver that has one bullet and N chambers. The agent doesn't know N, and will get a reward R if they survive both rounds; if they give up after one round, they get a consolation prize C1 < R. If they give up immediately, they get a smaller prize C0, with C0 < C1 < R.
The optimal strategy depends on N, and on the agent's belief about it. The question is: given that the agent has already played the first round and survived, does this give them any information about the distribution P(N)? Anthropic shadow says "no", because, conditioned on their having any use for the information, they must also have survived the first round. So I suppose you could reframe the anthropic shadow as a sort of theorem about how much useful information you can extract from the outcomes of past rounds of a game in which loss limits your future actions (death being the extreme case). I need to think about formalizing this.
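To make the disputed update concrete, here is a minimal sketch; the uniform prior over N ∈ {1,…,6} and the values R, C1, C0 are assumptions chosen purely for illustration. It compares the ordinary Bayesian posterior after surviving the first round (which shifts weight toward larger N) with the position defended in this thread, under which survival only rules out N = 1.

```python
# Minimal sketch of the two-round roulette game; all numbers are illustrative
# assumptions (uniform prior over N in 1..6, R = 10, C1 = 6, C0 = 3), not part
# of the original discussion.

R, C1, C0 = 10.0, 6.0, 3.0
prior = {n: 1 / 6 for n in range(1, 7)}          # uniform prior over N = 1..6

def p_survive(n):
    # Probability of surviving one trigger pull with one bullet in n chambers.
    return 1 - 1 / n

# Ordinary Bayesian update on "I survived round one":
# P(N | survive) is proportional to P(N) * (1 - 1/N).
unnorm = {n: p * p_survive(n) for n, p in prior.items()}
z = sum(unnorm.values())
posterior = {n: w / z for n, w in unnorm.items()}

def ev_continue(belief):
    # Expected value of pulling the trigger again instead of taking C1.
    return sum(p * p_survive(n) * R for n, p in belief.items())

print("posterior:", {n: round(p, 3) for n, p in posterior.items()})  # note P(N = 1) = 0
print("EV(continue) under prior:    ", round(ev_continue(prior), 3), "vs C1 =", C1)
print("EV(continue) under posterior:", round(ev_continue(posterior), 3), "vs C1 =", C1)
# The anthropic-shadow position amounts to keeping the prior after surviving
# (except for ruling out N = 1); the ordinary Bayesian position uses the posterior.
```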
And it is wrong because the anthropic principle is true: we learned that N ≠ 1.
There is the related idea of Anthropic decision theory, but my guess is that it still has no shadow.
Fair, but no more than that. Any additional survival doesn't teach us anything else; all the meaningful information only reaches us once we no longer have any use for it.