Let T be a tree of height w and T’ a subset of T regarded as the set of choice nodes in T. Now given a function f on T’ assume there is a function P giving a probability measure P_f on the set of paths through T’ coherent with f for each f. Further assume there is a function u taking paths through T to utilities and define U_P(f) to be the integral of u with respect to P_f over all paths coherent with f and U(f|\sigma) to be the integral of u with respect to P_f over all paths extending \sigma coherent with f.
Now either assume that each choice node has finitely many successors and U(f|\sigma) is always finite. By compactness we can always achieve a upper bound. We now define the value of information of some set S of non-choice nodes in T (i.e. the benefit of getting to look at those nodes in making future decisions).
Let E(T) (the informed maximum) be the max of U(f) for f a function on T’ and let E(T/S) be the max of U(f) over those functions f satisfying f(\sigma)=f(\tau) if {x|\sigma(x) \neq \tau(x) } is a subset of S, i.e. those functions unaware of the result of nodes in S.
The value of getting to look at the results of nodes in S is thus E(T) - E(T/S). Note that to match the concept in the article S should be an initial segment of T (i.e. you get all that info immediately while our model includes the value of buying info that will only be revealed later).
So the original tree T models a tree of “states of affairs”,
and the original partition or subset T’ models each node being either under the decision-maker’s control or not.
Then the function f would go from elements of T’ to successors of those elements—nodes of T, to be sure, but there’s a side condition there.
Then the probability measure P is a somewhat more powerful way of attaching probabilities to the non-choice nodes—that is, if you have a distribution over the successors of each non-choice node, then you can obtain a probability measure, but a probability measure over paths would allow some additional correlations.
The function u can be understood (in a finite tree) as labelling leaves with utilities, because paths of maximum length in a finite tree are isomorphic with leaves—but by describing it the way you did, you leave the door open to applying this formalism to an infinite tree.
UP(f) would be the utility of a particular strategy (f), and U(f|\sigma) would be… the utility given a certain initial sequence of events? So \sigma is a finite path segment?
I don’t understand the grammar of “Now either …. and …”—should it be “Now either … or …”? Or is it really “Now assume … and …”?
When you use U(f) later, I am guessing that’s either UP(f) with the P elided, or U(f | the empty path segment) - regardless, we’re going to have to fix a P in order to get a utility, right?
Then the phrase “f(sigma)=f(tau) if the set of x such that \sigma(x) is not equal to tau(x) is a subset of S”—If I understand correctly, sigma and tau are finite path segments, which are isomorphic to nodes in the tree—but are they functions from nodes? If they are functions, wouldn’t they be from, say, initial sections of the integers 0...n TO nodes? If I understand correctly, they’re going to diverge at at most one point—once diverged, since it’s a tree, they’re not going to be able to rejoin. Or were you saying tree and thinking ‘dag’?
I worry about the types of these things; coding it up in something like Haskell or Ocaml might make everything sharper and perhaps suggest simplifications. I’m sure that you can carry through the basic intuition.
Here is a nice formal model.
Let T be a tree of height w and T’ a subset of T regarded as the set of choice nodes in T. Now given a function f on T’ assume there is a function P giving a probability measure P_f on the set of paths through T’ coherent with f for each f. Further assume there is a function u taking paths through T to utilities and define U_P(f) to be the integral of u with respect to P_f over all paths coherent with f and U(f|\sigma) to be the integral of u with respect to P_f over all paths extending \sigma coherent with f. Now either assume that each choice node has finitely many successors and U(f|\sigma) is always finite. By compactness we can always achieve a upper bound. We now define the value of information of some set S of non-choice nodes in T (i.e. the benefit of getting to look at those nodes in making future decisions).
Let E(T) (the informed maximum) be the max of U(f) for f a function on T’ and let E(T/S) be the max of U(f) over those functions f satisfying f(\sigma)=f(\tau) if {x|\sigma(x) \neq \tau(x) } is a subset of S, i.e. those functions unaware of the result of nodes in S.
The value of getting to look at the results of nodes in S is thus E(T) - E(T/S). Note that to match the concept in the article S should be an initial segment of T (i.e. you get all that info immediately while our model includes the value of buying info that will only be revealed later).
Interesting..
So the original tree T models a tree of “states of affairs”, and the original partition or subset T’ models each node being either under the decision-maker’s control or not. Then the function f would go from elements of T’ to successors of those elements—nodes of T, to be sure, but there’s a side condition there. Then the probability measure P is a somewhat more powerful way of attaching probabilities to the non-choice nodes—that is, if you have a distribution over the successors of each non-choice node, then you can obtain a probability measure, but a probability measure over paths would allow some additional correlations. The function u can be understood (in a finite tree) as labelling leaves with utilities, because paths of maximum length in a finite tree are isomorphic with leaves—but by describing it the way you did, you leave the door open to applying this formalism to an infinite tree. UP(f) would be the utility of a particular strategy (f), and U(f|\sigma) would be… the utility given a certain initial sequence of events? So \sigma is a finite path segment?
I don’t understand the grammar of “Now either …. and …”—should it be “Now either … or …”? Or is it really “Now assume … and …”?
When you use U(f) later, I am guessing that’s either UP(f) with the P elided, or U(f | the empty path segment) - regardless, we’re going to have to fix a P in order to get a utility, right?
Then the phrase “f(sigma)=f(tau) if the set of x such that \sigma(x) is not equal to tau(x) is a subset of S”—If I understand correctly, sigma and tau are finite path segments, which are isomorphic to nodes in the tree—but are they functions from nodes? If they are functions, wouldn’t they be from, say, initial sections of the integers 0...n TO nodes? If I understand correctly, they’re going to diverge at at most one point—once diverged, since it’s a tree, they’re not going to be able to rejoin. Or were you saying tree and thinking ‘dag’?
I worry about the types of these things; coding it up in something like Haskell or Ocaml might make everything sharper and perhaps suggest simplifications. I’m sure that you can carry through the basic intuition.
Thanks for thinking about this.