Making a decision at the policy level may be useful for forming habits, but I don’t think that considering others with a similar policy, or those who can discern my policy, is useful in this example. Those latter two are the ones I associate more closely with FDT, and such considerations seem very difficult to carry out in practice and perhaps sensitive to assumptions about the world. Honestly, I don’t see the motivation for worrying about the decisions of “other agents sufficiently similar to oneself” at all. It doesn’t seem useful to me right now, whether I’m making individual decisions or adopting a policy, and it doesn’t seem useful to build into an A.I. either, except in very specific cases where many copies of the A.I. are likely to interact. The heuristic arguments that this is important don’t convince me, because they are elaborate enough that different assumptions about how the environment accesses or includes one’s policy could easily lead to completely different conclusions.
The underlying flaw I see in many pro-FDT style arguments is that they tend to uncritically accept that if adopting FDT (or policy X) does better than adopting policy Y in one example, then policy X must be better than policy Y, or at least policy Y cannot be the best policy. But I strongly suspect there are no-free-lunch conditions here: even in the purely decision-theoretic context of AIXI there are serious issues with the choice of prior being subjective, so I’d expect things to be even worse once the environment is allowed read/write access to the whole policy. I haven’t seen any convincing argument that there is some kind of “master policy.” I suppose if you pinned down a mixture rigorously defining how the environment is able to read/write the policy, then there would be some Bayes optimal policy, but I’m willing to bet it would be deviously hard to find or even approximate.
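The no-free-lunch worry can be made concrete with a toy sketch (my own construction for illustration, not anything from the FDT literature): once an “environment” is a function of the agent’s entire policy, then for any policy you can build an adversarial environment that punishes exactly that policy’s behavior, so no single policy dominates across all policy-reading environments.

```python
# Toy illustration: environments with read access to the policy.
# A "policy" maps an observation to an action; an "environment" maps
# a whole policy to a payoff. Names here are hypothetical.

def make_adversary(target_policy):
    """Environment that punishes exactly the behavior target_policy exhibits."""
    def env(policy):
        # Pay 1 unless the policy acts like the target on observation 0.
        return 0 if policy(0) == target_policy(0) else 1
    return env

policy_x = lambda obs: "one-box"
policy_y = lambda obs: "two-box"

env_anti_x = make_adversary(policy_x)  # rewards anything except policy_x's behavior
env_anti_y = make_adversary(policy_y)  # rewards anything except policy_y's behavior

# policy_x wins in env_anti_y, policy_y wins in env_anti_x,
# so neither policy dominates the other across both environments.
assert env_anti_y(policy_x) == 1 and env_anti_y(policy_y) == 0
assert env_anti_x(policy_y) == 1 and env_anti_x(policy_x) == 0
```

Which policy is "better" then depends entirely on the prior over such environments, which is the subjectivity complaint above in miniature.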