As for 4 - even just remembering anything is a self modification of memory.
That’s for humans, not abstract agents? Don’t think it matters, we talk about other self-modifications anyway.
From your problem description
Not mine :)
utility on other branches
Maybe this interpretation is what repels you? Here are two more:
You choose in advance, before you get into (1) or (3), whether to behave like an EDT agent or like an FDT agent in the situations where it matters. And you can’t legibly (to predictors like the one in this game) decide to behave like an FDT agent and then, in the future, when you end up in (1) because you’re unlucky, just change your mind. It’s just not an option. And between the options “legibly choose to behave like an EDT agent” and “legibly choose to behave like an FDT agent”, the second one is clearly better in expectation. You just don’t make another choice in (1) or (2); it’s already decided.
If you find yourself in (1) or (2), you can’t differentiate between the cases “I am the real me” and “I am the model of myself inside the predictor” (because if you could, you could behave differently in these two cases, and it would be a bad model and a bad predictor). So you decide for both at once. (This interpretation doesn’t work well for agents with explicitly self-indicated values (or whatever it is called? I hope it’s clear what I mean).)
The earlier decision to precommit (whether actually made or later simulated/hallucinated) sacrifices utility of some future selves in exchange for greater utility to other future selves.
Yes. It’s like choosing to win on a 1-5 on a die roll rather than win on a 6. You sacrifice utility of some future selves (in the worlds where the die rolls 6) in exchange for greater utility to other future selves, and it’s perfectly rational.
We can also construct more specific variants of 5 where FDT loses, such as environments where the message at step B is from an anti-Omega which punishes FDT-like agents.
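To make the die analogy concrete, here is a minimal sketch. The unit prize and the `expected_utility` helper are my own illustrative assumptions, not part of the original problem; it only compares the two commitments in expectation over a fair six-sided die.

```python
from fractions import Fraction

def expected_utility(winning_faces, prize=1, sides=6):
    """Expected payoff of committing in advance to win on the given faces.

    Assumes a fair die and a fixed prize (both hypothetical choices here).
    """
    return Fraction(len(set(winning_faces)), sides) * prize

# Commitment A: win on 1-5 (the future self who sees a 6 loses).
commit_1_to_5 = expected_utility({1, 2, 3, 4, 5})
# Commitment B: win only on 6 (the future selves who see 1-5 lose).
commit_6 = expected_utility({6})

print(commit_1_to_5, commit_6)
```

Either commitment sacrifices some future selves; the first one just sacrifices fewer of them, which is the whole point of the analogy.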
Ok, yes. You can do it with all other types of agents too.
But naturally a powerful EDT agent will simply adopt that universal precommitment if/when it believes it is in a universe distribution where doing so is optimal!
I think the ability to legibly adopt such precommitment and willingness to do so kinda turns EDT-agent into FDT-agent.
I think the ability to legibly adopt such precommitment and willingness to do so kinda turns EDT-agent into FDT-agent.
Yes. I think we are mostly in agreement then. FDT seems to be defined by adopting a form of universal precommitment, which you can only do once and can’t really undo. It seems that EDT can clearly do that (to the extent any agent can adopt FDT), so EDT can always go EDT->FDT, but FDT->EDT is not allowed (or it breaks the universal precommitment or cooperation across instances). That does not resolve the question of whether or not adopting FDT is optimal.
My main point from earlier is this:
In principle it seems wrong to measure utility at the moment in time right before A on the basis of our knowledge; it seems we should only measure it based on the agent’s knowledge. This means we need to sum our expectation over all possible universes consistent with those facts. The set of universes that proceed to B/C is infinitesimal and probably counterbalanced by opposites, so the very claim that FDT is optimal for 5 is perhaps a form of Pascal’s mugging.
The agent in scenario 5, before observing the box and the rules, is a superposition of all agents in similar scenarios, and it is only correct for us to judge their performance across that entire set, i.e. according to the agent’s knowledge, not our knowledge. So it’s optimal to take the FDT precommitment in this specific scenario only if it’s optimal to do so over all similar environments, which in this case is nearly all environments, as the agent hasn’t observed anything at all at the start of your scenario 5!
So I think this reduces down to the conclusion that FDT and its universal precommitment can’t provide any specific advantage on a specific problem over the regular problem-specific precommitments EDT can make, unless it provides a net advantage everywhere across the multiverse, in which case EDT uses that and becomes FDT.