Give a quick soundbite without context?
I’m approaching decision theory the way compilers approach optimizations: no approach is guaranteed to work always, but each one comes with a list of preconditions that you can check. I’m also summarizing some of the relevant work from compilers: automatic provably correct simplification, translation between forms, and a handy collection of normal forms to translate into.
For CDT, the precondition is that there is a partial ordering over the observation sets passed to the strategy, such that the world program calls the strategy with observation sets only in increasing order, and that there are finitely many possible observation sets. Then you can translate the program into continuation-passing style and enumerate the possible invocations of the strategy function and their ordering. The last one in the order is guaranteed to have a continuation with no further invocations of the strategy function, which means you can try each possibility, simulate the results, and use that to determine the best answer. Then you can look at the second-to-last invocation, substitute the best answer to the last invocation into the continuation, and repeat, and so on for the whole set of invocations of the strategy function. This works because you have a guarantee: when you compute your current position within the world program, come up with a probability distribution over states to determine where you are, and then look at future continuations, changing the result of any invocation of the strategy in those continuations does not affect the probability distribution over states.
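As a rough illustration of that backward pass, here is a minimal Python sketch. It is not the formalization from the draft. It assumes a deterministic world program already written in continuation-passing style, where each observation string encodes the full history, so that observations only ever grow along a run. The names `Done`, `Ask`, `solve`, `world`, and `PAYOFF` are all hypothetical, and the probability distribution over states is left out entirely.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple, Union


@dataclass
class Done:
    """The world program has finished with this utility."""
    utility: float


@dataclass
class Ask:
    """The world program invokes the strategy and waits for an action."""
    observation: str                 # assumed to encode the whole history
    actions: Tuple[str, ...]         # finitely many possible answers
    k: Callable[[str], "Node"]       # continuation: action -> rest of the world


Node = Union[Done, Ask]


def solve(node: Node, policy: Dict[str, str]) -> float:
    """Fill in `policy` from the last invocation backwards; return the best utility."""
    if isinstance(node, Done):
        return node.utility
    best_action, best_value = None, float("-inf")
    for action in node.actions:
        # Recursing into the continuation solves every later invocation first,
        # which amounts to substituting their best answers into it.
        value = solve(node.k(action), policy)
        if value > best_value:
            best_action, best_value = action, value
    policy[node.observation] = best_action
    return best_value


# Toy two-step world (hypothetical payoffs, for illustration only).
PAYOFF = {("a", "x"): 1.0, ("a", "y"): 3.0, ("b", "x"): 4.0, ("b", "y"): 0.0}


def world() -> Node:
    return Ask("start", ("a", "b"),
               lambda first: Ask("start," + first, ("x", "y"),
                                 lambda second: Done(PAYOFF[(first, second)])))


policy: Dict[str, str] = {}
best = solve(world(), policy)
```

In this toy world the second invocation is resolved for both possible first answers before the first invocation is decided, which is the ordering the precondition is meant to make possible.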
I also have an example of a formalized decision-theory problem for which no optimal answer exists: name a number and that number is your utility. A corollary is that no decision theory can always give optimal answers, even given infinite computing power. This can be worked around by applying size bounds in various places.
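A small Python sketch of why that problem has no optimum, and of one way a size bound restores one. Bounding the named number itself is only an assumption about where a bound could go, and `utility`, `improve`, and `best_under_bound` are hypothetical names.

```python
def utility(n: int) -> int:
    """The problem from the text: the number you name is your utility."""
    return n


def improve(n: int) -> int:
    """Whatever answer is proposed, naming the next number is strictly better."""
    return n + 1


def best_under_bound(bound: int) -> int:
    """With answers restricted to 0..bound, an optimal answer exists again."""
    return max(range(bound + 1), key=utility)
```

Since `improve` beats every candidate, no answer can be optimal in the unbounded problem. Once the answer set is finite, `best_under_bound` simply returns the bound.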
I’m also drawing distinctions between strategies and decision theories (a strategy is an answer to one problem, a decision theory is an approach to generating strategies from problems); and between preference and utility (a preference is a partial order over outcomes; a utility function is a total order over outcomes where the outcomes are complete probability distributions, plus a linearity requirement).
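The strategy versus decision theory distinction can be read as a difference in types. The Python sketch below is only one possible rendering. It assumes a problem is a world program that maps a strategy to a utility. The `brute_force` decision theory is a hypothetical stand-in that only works for finitely many observations and actions, not a method proposed in the draft.

```python
from itertools import product
from typing import Callable, Dict, Sequence

Strategy = Dict[str, str]                     # observation -> action: an answer to one problem
Problem = Callable[[Strategy], float]         # world program: strategy -> utility
DecisionTheory = Callable[[Problem], Strategy]  # a way of producing strategies from problems


def brute_force(observations: Sequence[str], actions: Sequence[str]) -> DecisionTheory:
    """Hypothetical decision theory: try every finite strategy and keep the best one."""
    def decide(problem: Problem) -> Strategy:
        candidates = (dict(zip(observations, choice))
                      for choice in product(actions, repeat=len(observations)))
        return max(candidates, key=problem)
    return decide


def toy_problem(strategy: Strategy) -> float:
    """Illustrative problem: one point for each sensible response to a light."""
    return float(strategy["red"] == "stop") + float(strategy["green"] == "go")


theory = brute_force(["red", "green"], ["stop", "go"])
strategy = theory(toy_problem)
```

Here `theory` is reusable across problems with the same observations and actions, while `strategy` is tied to `toy_problem` alone.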
So far, doesn’t sound good.
By that, do you mean that it sounds wrong, or that it sounds confused? If the former, I may need to reconsider; if the latter, I’m unsurprised because it’s much too short and doesn’t include any of the actual formalization. (That was not an excerpt from the draft I’m writing, but an attempt to summarize it briefly. I don’t think I did it justice.)
Doesn’t seem to address relevant questions or give interesting answers.
OK, in that case I’m inclined to think that impression is just an artifact of how I summarized it. My summary didn’t address the questions, but the longer paper I’m working on does, albeit only after building up proof and formalization techniques, which are the main focus.
Would something like UDT fit into your framework?
As far as I know, there are no cases where UDT suggests a decision that disagrees with mine. The differences are all in cases where UDT alone can’t be used to reach a decision.