Ouch, sorry! Too much time spent thinking about these things by myself without having to explain what I mean, probably. I think I still haven’t gotten across the simple point I was trying to make; let me try again.
Given an optimization problem like the TSP, there is a well-defined notion of “what utility/score do I get if I choose to do X,” independent of the algorithm you use to compute a solution—so there is a theoretical optimum solution independent of the algorithm you use to make a choice, and you can compare different heuristic algorithms on how close they come to the theoretical optimum. It seems useful to have this sort of description of the problem when trying to find actual practical algorithms.
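To make that concrete, here is roughly what I mean, sketched in Python (the instance, the function names, and the particular heuristic are all mine, purely for illustration): the scoring function and the brute-force optimum are defined without any reference to the heuristic being evaluated, so any heuristic can be scored against the same optimum.

```python
import itertools

# The problem description: the score of a candidate tour is defined
# independently of whatever algorithm produced it.
def tour_length(dist, tour):
    return sum(dist[tour[i]][tour[(i + 1) % len(tour)]] for i in range(len(tour)))

# The theoretical optimum (exponential brute force, fine for tiny instances).
def optimal_tour(dist):
    return min(itertools.permutations(range(len(dist))),
               key=lambda t: tour_length(dist, t))

# One heuristic among many; any other heuristic can be scored the same way.
def nearest_neighbour_tour(dist):
    tour, unvisited = [0], set(range(1, len(dist)))
    while unvisited:
        nxt = min(unvisited, key=lambda c: dist[tour[-1]][c])
        tour.append(nxt)
        unvisited.remove(nxt)
    return tour

dist = [[0, 2, 9, 10],
        [1, 0, 6, 4],
        [15, 7, 0, 8],
        [6, 3, 12, 0]]
print(tour_length(dist, nearest_neighbour_tour(dist)) /
      tour_length(dist, optimal_tour(dist)))  # how close the heuristic comes to the optimum
```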
I used to hope that we could find an equivalent for timeless decision theories, an abstract description of the optimum solution that we could then try to approximate through heuristic algorithms, but in your work, the definition of “what would happen if I did X” is so intimately connected to the algorithm making the decision that it’s at least not obvious to me how this could work.
Most (all?) proofs of statements like “if I did X, utility would be U” don’t actually need to look at the agent’s implementation, only to find its call sites in world(). So we can compare different heuristics without things blowing up. Or did you mean something else?
Do you have an example of a decision theory problem where the optimal solution is hard to find, but heuristics help? Or should I use the TSP as an example?
(Sorry for the delayed reply.) I’m a little confused: it’s true that we can easily compare heuristics for world()s that depend on the agent only through easily identifiable call sites, but if we’re only considering this class of problems, what does your algorithm add over the simpler algorithm of replacing all call sites to agent() by a constant, running the result, and choosing the constant that gives the highest value?
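(Concretely, by that simpler algorithm I mean something like the following sketch, where I assume the call sites have already been rewritten so that world() takes the candidate action as an argument:)

```python
def world(action):
    # stands in for any world() whose only dependence on the agent is
    # through easily identifiable call sites, here already rewritten to
    # take the candidate action as an argument
    return -(action - 3) ** 2

def substitution_agent(world, actions):
    # plug in each constant, run world(), and return the constant that
    # gives the highest utility
    return max(actions, key=world)

print(substitution_agent(world, range(10)))  # -> 3
```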
(I suppose I was hoping for more direct game-theoretic applications than you were, so I had in mind situations where (the rest of) the world does depend on the agent’s source code.)
Regarding examples, what I had in mind was: Surely your literal algorithm of enumerating all proofs is not efficient enough to run even on toy problems like your implementation of Newcomb’s problem! Also, surely for any real-world problem, it won’t be feasible to find the optimal solution. So at some point, we will want to use the theory as a background for looking for practical (and thus necessarily heuristic) algorithms.
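(For reference, the kind of toy problem I mean is something like this Newcomb-style world(), with my own guesses for the payoffs and the action encoding, and a trivial stand-in for the proof-searching agent():)

```python
def agent():
    # stand-in for the proof-enumerating agent(); the real one searches for
    # proofs of statements "agent() == a implies world() == u" and returns
    # the action with the best provable u, which is what looks infeasible to run
    return "one-box"

def world():
    # Newcomb-style payoff (numbers and action encoding are mine)
    box1 = 1000
    box2 = 1_000_000 if agent() == "one-box" else 0
    return box2 if agent() == "one-box" else box1 + box2

print(world())  # 1000000 with this stand-in
```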
what does your algorithm add over the simpler algorithm of replacing all call sites to agent() by a constant, running the result, and choosing the constant that gives the highest value?
Admittedly, not much. If world() calls an agent2() that provably returns the same value as agent(), agent() will notice that. But that wasn’t really the point of the post. The point was to show that having perfect knowledge of yourself doesn’t necessarily stop you from considering counterfactuals about yourself. Of course it’s all completely impractical.
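(A toy illustration of what I mean, again with a trivial stand-in for agent():)

```python
def agent():
    # stand-in for the proof-searching agent() of the post
    return "one-box"

def agent2():
    # trivially (and hence provably) returns the same value as agent()
    return agent()

def world():
    # depends on the agent only through agent2(): substituting constants at
    # agent()'s call sites finds nothing to substitute, but a proof search
    # can establish agent2() == agent() and still evaluate the counterfactuals
    return 1_000_000 if agent2() == "one-box" else 1_000

print(world())
```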
Perhaps a closer approximation to reality would be a “listener” algorithm that accepts proofs (e.g. by reading game theory books), instead of trying to find them.
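(Very roughly, such a listener might be organized like the sketch below. This is entirely my own guess, with a hypothetical check_proof standing in for an actual proof verifier; the point is that checking a supplied proof can be much cheaper than searching for one.)

```python
def listener_agent(claims, check_proof):
    # claims: externally supplied triples (action, utility, proof), e.g.
    # read out of a game theory book; check_proof(action, utility, proof)
    # verifies a proof of "agent() == action implies world() == utility"
    verified = [(u, a) for (a, u, p) in claims if check_proof(a, u, p)]
    return max(verified)[1] if verified else None
```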