Reflective modification flow: Suppose we have an EDT agent that can take an action to modify its own decision theory. It will choose by comparing the average outcome conditional on each option: keeping its current decision theory or changing it. In some circumstances EDT agents do well, so it will expect to do well by not changing; in others, it may expect to do better conditional on self-modifying to use the Counterfactual Perspective more.
Evolutionary flow: If you put a mixture of EDT and FDT agents in an evolutionary competition where they’re playing some iterated game and high scorers get to reproduce, what does the population look like in the long run, for different games and starting populations?
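One way to make the evolutionary question concrete is a discrete replicator-dynamics toy model. This is only a sketch under strong assumptions: the population is large and well mixed, each type's reproduction is proportional to its average score, and the whole iterated game is summarized by a 2x2 table of average scores. The names `replicator_step`, `long_run_share`, and `PAYOFF` are made up for illustration, and the payoff numbers are placeholders, not a claim about how EDT or FDT agents actually score in any particular game.

```python
# Hypothetical average scores: PAYOFF[i][j] is what type i earns against type j.
# Index 0 = EDT, index 1 = FDT. These numbers are invented for illustration only;
# all entries must be positive for the update rule below to make sense.
PAYOFF = [
    [2.0, 3.0],  # EDT vs EDT, EDT vs FDT
    [1.0, 4.0],  # FDT vs EDT, FDT vs FDT
]


def replicator_step(edt_share, payoff):
    """Advance one generation; return the new fraction of EDT agents."""
    fdt_share = 1.0 - edt_share
    # Expected score of each type against a randomly drawn opponent.
    edt_fitness = edt_share * payoff[0][0] + fdt_share * payoff[0][1]
    fdt_fitness = edt_share * payoff[1][0] + fdt_share * payoff[1][1]
    mean_fitness = edt_share * edt_fitness + fdt_share * fdt_fitness
    # Each type's share grows in proportion to its fitness relative to the mean.
    return edt_share * edt_fitness / mean_fitness


def long_run_share(initial_edt_share, payoff, generations=1000):
    """Iterate the replicator update and return the final EDT fraction."""
    share = initial_edt_share
    for _ in range(generations):
        share = replicator_step(share, payoff)
    return share


if __name__ == "__main__":
    for start in (0.1, 0.4, 0.6, 0.9):
        final = long_run_share(start, PAYOFF)
        print(f"start EDT share {start:.1f} -> long-run EDT share {final:.3f}")
```

With these placeholder numbers the two types have equal fitness at an EDT share of exactly one half, and that mixed point is unstable: populations starting above it drift toward all EDT, and populations starting below it drift toward all FDT. Swapping in a different payoff table (a different game) can change this picture entirely, which is the sense in which the long-run population depends on both the game and the starting mixture.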