Proofs Section 2.2 (Isomorphism to Expectations)

Previous proof post is here.

Theorem 2: Isomorphism Theorem: For (causal, pseudocausal, acausal, surcausal) or which fulfill finitary or infinitary analogues of all the defining conditions, and are (causal, pseudocausal, acausal, surcausal) hypotheses. Also, and define an isomorphism between and , and and define an isomorphism between and .

Proof sketch: The reason this proof is so horrendously long is that we’ve got almost a dozen conditions to verify, and some of them are quite nontrivial to show and will require sub-proof-sketches of their own! Our first order of business is verifying all the conditions for a full belief function for . Then, we have to do it all over again for . That comprises the bulk of the proof. Then, we have to show that taking a full belief function and restricting it to the infinite/​finite levels fulfills the infinite/​finite analogues of all the defining conditions for a belief function on policies or policy-stubs, which isn’t quite as bad. Once we’re done with all the legwork showing we can derive all the conditions from each other, showing the actual isomorphism is pretty immediate from the Consistency condition of a belief function.

Part 1: Let’s consider . This is defined as:

We’ll show that all 9+2 defining conditions for a belief function are fulfilled for . The analogue of the 9+2 conditions for a is:

1: Stub Nirvana-free Nonemptiness:

2: Stub Closure:

3: Stub Convexity.

4: Stub Nirvana-free Upper-Completion.

5: Stub Restricted Minimals:

6: Stub Normalization: and

7: Weak Consistency:

8: Stub Extreme Point Condition: for all :

9: Stub Uniform Continuity: The function is uniformly continuous.

C: Stub Causality:

where the outcome function is defined over all stubs.

P: Stub Pseudocausality:

Let’s begin showing the conditions. But first, note that since we have weak consistency, we can invoke Lemma 6 to reexpress as Where is the n’th member of the fundamental sequence of .

Also note that, for all stubs, . We’ll be casually invoking this all over the place and won’t mention it further.

Proof: By Lemma 6 with weak consistency, Now, m can be anything we like, as long as it’s finite. Set m to be larger than the maximum timestep that the stub is defined for. Then no matter what n is (since it’s above m) and projection from a stub to itself is identity, so the preimage is exactly our original set .

We’ll also be using another quick result. For all stubs , given stub causality,

Proof: Fix an arbitrary point . By causality, we get an outcome function which includes , giving us that there’s something in that projects down onto . Use weak consistency to get the other subset direction.

Condition 1: Nirvana-free Nonemptiness.

Invoking Stub Nirvana-free Nonemptiness, so we get nirvana-free nonemptiness for .

Now, assume is not a stub. By stub-bounded-minimals, there is some bound on the set of minimal points, regardless of stub. Let be

This contains all the minimal nirvana-free points for . This set is nonempty because we have stub nirvana-free nonemptiness, so a nirvana-free point exists. We have stub-closure and stub minimal-boundedness, so we can step down to a minimal nirvana-free point below , and it obeys the bound.

Further, by weak consistency and projection preserving and and nirvana-freeness,

Invoking Lemma 9, the intersection of preimages of these is nonempty. It’s also nirvana-free, because if there’s nirvana somewhere, it occurs after finite time, so projecting down to some sufficiently large finite stage preserves the presence of Nirvana, but then we’d have a nirvana-containing point in a nirvana-free set , which is impossible. This is also a subset of the typical intersection of preimages used to define . Pick an arbitrary point in said intersection of preimages of clipped subsets.

Bam, we found a nirvana-free point in and we’re done.

Time for conditions 2 and 3, Closure and Convexity. These are easy.

The preimage of a closed set (stub-closure) is a closed set, and the intersection of closed sets is closed, so we have closure.

Also, is linear, so the preimage of a convex set (stub-convexity) is convex, and we intersect a bunch of convex sets so it’s convex as well.
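Spelled out, the two facts being used here are standard. In the sketch below, $f_n$ stands for the projection down to the n'th stub and $C_n$ for the corresponding stub-level set. Both names are placeholders assumed for this illustration, not notation taken from the post.

```latex
% Assumed placeholders: f_n is continuous and linear, C_n is closed and convex.
\begin{align*}
&\text{Closure: } f_n^{-1}(C_n) \text{ is closed because } f_n \text{ is continuous, so } \bigcap_n f_n^{-1}(C_n) \text{ is closed.} \\
&\text{Convexity: if } x, y \in f_n^{-1}(C_n) \text{ and } p \in [0,1], \text{ then} \\
&\qquad f_n\big(p x + (1-p) y\big) = p f_n(x) + (1-p) f_n(y) \in C_n, \\
&\text{so } p x + (1-p) y \in f_n^{-1}(C_n) \text{ for every } n, \text{ and hence lies in } \bigcap_n f_n^{-1}(C_n).
\end{align*}
```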

Condition 4: Nirvana-free upper completion.

Let . Let’s check whether (assuming that’s an a-measure and is nirvana-free) also lies in the set. A sufficient condition on this given how we defined things is that for all , , as that would certify that is in all the preimages.

is linear, so

The first component is in , obviously. And then, by stub nirvana-free-upper-completion, we have a nirvana-free a-measure plus a nirvana-free sa-measure (projection preserves nirvana-freeness), making a nirvana-free a-measure (projection preserves a-measures), so is in , and we’re done.

Condition 5: Bounded-Minimals

So, there is a critical value by restricted-minimals for

Fix a , and assume that there is a minimal point in with a value that exceeds the bound. Project down into each . Projection preserves and so each of these projected points lie above some .

Now, invoke Lemma 7 to construct a (or the nirvana-free variant) that lies below , and projects down to . Repeat this for all n. All these points are a-measures and have the standard bound so they all lie in a compact set and we can extract a convergent subsequence, that converges to , which still obeys the bound.

is below because (or the nirvana-free variant) is a closed set. Further, by Lemma 10, is in the defining sequence of intersections for . This witnesses that isn’t minimal, because we found a point below it that actually obeys the bounds. Thus, we can conclude minimal-point-boundedness for .

Condition 6: Normalization. We’ll have to go out of order here, this can’t be shown at our current stage. We’re going to have to address Hausdorff continuity first, then consistency, and solve normalization at the very end. Let’s put that off until later, and just get extreme points.

Condition 8: Extreme point condition:

The argument for this one isn’t super-complicated, but the definitions are, so let’s recap what condition we have and what condition we’re trying to get.

Condition we have: for all :

Condition we want: for all ,

Ok, so By the stub extreme point condition, there’s a , where, for all that fulfill , there’s a , where .

Lock in the we have. We must somehow go from this to a that projects down to our point of interest. To begin with, let be the n’th member of the fundamental sequence for . Past a certain point m, these start being greater than . The which projects down to that we get by the stub-extreme-point condition will be called . Pick some random-ass point in and call it .

all obey the and values of , because it projects down to . We get a limit point of them, , and invoking Lemma 10, it’s also in . It also must be nirvana-free, because it’s a limit of points that are nirvana-free for increasingly late times. It also projects down to because the sequence was wandering around in the preimage of , which is closed.

Condition 9: Hausdorff Continuity:

Ok, this one is going to be fairly complicated. Remember, our original form is:

“The function is uniformly continuous”

And the form we want is:

“The function is uniformly continuous”

Uniform continuity means that if we want an Hausdorff-distance between two preimages, there’s a distance between partial policies that suffices to produce that. To that end, fix our . We’ll show that the we get from uniform continuity on stubs suffices to tell us how close two partial policies must be.

So, we have an . For uniform continuity, we need to find a where, regardless of which two partial policies and we select, as long as they’re or less apart, the sets (and likewise for ) are only apart. So, every point in the first preimage must have a point in the second preimage only distance away, and vice-versa. However, we can swap and (our argument will be order-agnostic) to establish the other direction, so all we really need to do is to show that every point in the preimage associated with is within of a point in the preimage associated with .
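To make the "every point in one set has a nearby point in the other, and vice-versa" reading concrete, here is a minimal sketch of Hausdorff distance for finite point sets in the plane. It is only a toy: the sets in the proof are sets of a-measures rather than points in the plane, and the function names are made up for this illustration.

```python
from math import dist

def directed_distance(A, B):
    """Largest distance from a point of A to its nearest point in B."""
    return max(min(dist(a, b) for b in B) for a in A)

def hausdorff_distance(A, B):
    """Both directions must be small, so take the larger of the two."""
    return max(directed_distance(A, B), directed_distance(B, A))

A = [(0.0, 0.0), (1.0, 0.0)]
B = [(0.0, 0.5), (1.0, 0.0), (3.0, 0.0)]

# Every point of A has a point of B within 0.5 ...
print(directed_distance(A, B))
# ... but (3, 0) in B is far from everything in A, so the sets are not close.
print(hausdorff_distance(A, B))
```

The swap trick in the argument corresponds to the fact that `hausdorff_distance` is just the same one-directional check run twice with the arguments exchanged.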

First, is the time at which two partial policies apart may start differing. Conversely, any two partial policies which only disagree at-or-after m are apart or less. Let be the policy stub defined as follows: take the inf of and (the partial policy which is everything they agree on, which is going to perfectly mimic both of them up till time m), and clip things off at time m to make a stub. This is only apart from and , because it perfectly mimics both of them up till time m, and then becomes undefined (so there’s a difference at time m). Both and are .

Let be some totally arbitrary point in . is also in , because projects down to some point in that’s nirvana-free.

Let , where , be the n’th stub in the fundamental sequence for . These form a chain starting at and ascending up to , and are all distance from .

Anyways, in , we can make a closed ball of size around . This restricts and to a small range of values, so we can use the usual arguments to conclude that is compact.

Further, because is or less away from , the two sets and are within of each other, so there’s some point of the latter set that lies within our closed -ball.

Consider the set

the inner intersection is an intersection of closed and compact sets, so it’s compact. Thus, this is an intersection of an infinite family of nonempty compact sets. To check the finite intersection property, just observe that since preimages of the sets get smaller and smaller as n increases due to weak-consistency but always exist.
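The compactness step is the usual nested-sets argument. A short sketch, assuming (as the text indicates) that the family is nested, and writing $K_n$ as a placeholder name for the n'th compact set:

```latex
% Assumed placeholder: K_n is the n'th nonempty compact set in the family.
\begin{align*}
&K_1 \supseteq K_2 \supseteq K_3 \supseteq \cdots, \qquad K_n \neq \emptyset \text{ compact for all } n \\
&\Longrightarrow\ K_{n_1} \cap \cdots \cap K_{n_k} = K_{\max_i n_i} \neq \emptyset \text{ for any finite subfamily} \\
&\Longrightarrow\ \bigcap_{n} K_n \neq \emptyset \quad \text{(finite intersection property inside the compact set } K_1\text{)}.
\end{align*}
```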

Pick some arbitrary point from the intersection. It’s away from since it’s in the -ball. However, we still have to show that is in to get Hausdorff-continuity to go through.

To begin with, since lies in our big intersection, we can project it down to any . Projecting it down to stage n makes . Let be the point in defined by:

Well, we still have to show that this set is nonempty, contains only one point, and that it’s in , and is nirvana-free, to sensibly identify it with a single point.

Nonemptiness is easy, just invoke Lemma 9. It lies in the usual intersections that define , so we’re good there. If it had nirvana, it’d manifest at some finite point, but all finite projections are nirvana-free, so it’s nirvana-free. If it had more than one point in it, they differ at some finite stage, so we can project to a finite to get two different points, but they both project to , so this is impossible. Thus, is a legit point in the appropriate set. If the projection of didn’t equal , then we’d get two different points, which differ at some finite stage, so we could project down to separate them, but they both project to for all n so this is impossible.

So, as a recap, we started with an arbitrary point in , and got another point that’s only or less away and lies in This argument also works if we flip and , so the two preimages are only or less apart in Hausdorff-distance.

So, given some , there’s some where any two partial policies which are only apart have preimages only apart from each other in Hausdorff-distance. And thus, we have uniform continuity for the function mapping to the set of a-measures over infinite histories which project down to Hausdorff-continuity is done.

Condition 7: Consistency.

Ok, we have two relevant things to check here. The first, very easy one, is that

From earlier, we know that , and from how is defined, this is a tautology.

The other, much more difficult direction, is that

We’ll split this into four stages. First, we’ll show one subset direction holds in full generality. Second, we’ll get the reverse subset direction for causal/​surcausal. Third, we’ll show it for policy stubs for pseudocausal/​acausal, and finally we’ll use that to show it for all partial policies for pseudocausal/​acausal.

First, the easy direction.

If we pick an arbitrary , it projects down to for all stubs below . Since , it projects down to all stubs beneath . Since projections commute, projected down into makes a point that lies in the preimage of all the where , so it projects down into .

This holds for all points in , so . This works for all , so it holds for the union, and then due to closure and convexity which we’ve already shown, we get that the closed convex hull of the projections lies in too, establishing one subset direction in full generality.
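The facts about projection that keep getting used (it commutes, it is linear, and it preserves total mass) can be checked on a toy model. The sketch below treats a measure as a dict from finite histories (tuples) to mass and treats "projection" as truncating histories to length n. This ignores policies, nirvana, and the constant term of an a-measure entirely, and all names are hypothetical.

```python
from collections import defaultdict

def project(measure, n):
    """Push a measure on histories forward along truncation to length n."""
    out = defaultdict(float)
    for history, mass in measure.items():
        out[history[:n]] += mass
    return dict(out)

def mix(p, m1, m2):
    """The mixture p*m1 + (1-p)*m2 of two measures."""
    keys = set(m1) | set(m2)
    return {h: p * m1.get(h, 0.0) + (1 - p) * m2.get(h, 0.0) for h in keys}

m1 = {("a", "x", "b"): 0.25, ("a", "x", "c"): 0.25, ("a", "y", "b"): 0.5}
m2 = {("a", "x", "b"): 0.5, ("b", "y", "c"): 0.25}

# Projections commute: going down in two steps matches going down in one.
print(project(project(m1, 2), 1) == project(m1, 1))
# Projection is linear: mixing then projecting matches projecting then mixing.
print(project(mix(0.5, m1, m2), 2) == mix(0.5, project(m1, 2), project(m2, 2)))
# Projection preserves total mass.
print(sum(project(m2, 1).values()) == sum(m2.values()))
```

The masses are chosen to be dyadic fractions so that the floating-point comparisons are exact; with arbitrary masses a tolerance would be needed.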

Now, for phase 2, deriving in the causal/​surcausal case.

First, observe that if , then . Fix some and arbitrary . We’ll establish the existence of a that projects down to .

To begin with, projects down to in . Lock in a value for n, and consider the sequence that starts off , and then, by causality for stubs and , you can find something in that projects down onto , and something in that projects down onto that, and complete your sequence that way, making a sequence of points that all project down onto each other that climb up to . By Lemma 10, we get a . You can unlock n now. All these have the same and value because projection preserves them, so we can isolate a convergent subsequence converging to some .

Assume . Then we’ve got two different points. They differ at some finite stage, so there’s some n where can project down onto to witness the difference, but from our construction process for , both and project down to , and we get a contradiction.

So, since , this establishes the other direction, showing equality, and thus consistency, for causal/​surcausal hypotheses.

For part 3, we’ll solve the reverse direction for pseudocausal/​acausal hypotheses in the case of stubs, getting

Since we’re working in the Nirvana-free case and are working with stubs, we can wield Lemma 3

So, if we could just show that the union of the projections includes all the extreme minimal points, then when we take convex hull, we’d get the convex hull of the extreme minimal points, which by Lemma 3, would also nab all the minimal points as well. By Lemmas 11 and 12, our resulting convex hull of a union of projections from above would be upper-complete. It would also get all the minimal points, so it’d nab the entire within it and this would show the other set inclusion direction for pseudocausal/​acausal stubs. Also, we’ve shown enough to invoke Lemma 20 to conclude that said convex hull is closed. Having fleshed out that argument, all we need is that all extreme minimal points are captured by the union of the projections.

By our previously proved extreme minimal point condition, for every extreme minimal point in , there’s some and in that projects down to , which shows that all extreme points are included, and we’re good.

For part 4, we’ll show that in the nirvana-free pseudocausal/​acausal setting, we have

Fix some arbitrary . Our task is to express it as a limit of some sequence of points that are mixtures of stuff projected from above.

For this, we can run through the same exact proof path that was used in the part of the Lemma 21 proof about how the nirvana-free part of is a subset of closed convex hull of the projections of the nirvana-free parts of , . Check back to it. Since we’re working in the Nirvana-free case, we can apply it very straightforwardly. The stuff used in that proof path is the ability to project down and land in (we have that by how we defined ), Hausdorff-continuity (which we have), and stubs being the convex hull, not the closed convex hull, of projections of stuff above them (which we mentioned recently in part 3 of our consistency proof).

Thus, consistency is shown.

Condition 6: Normalization.

Ok, now that we have all this, we can tackle our one remaining condition, normalization. Then move on to the two optional conditions, causality and pseudocausality.

A key thing to remember is, in this setting, when you’re doing , it’s actually , because if Murphy picked a thing with nirvana in it, you’d get infinite value, which is not ok, so Murphy always picks a nirvana-free point.

Let’s show the first part, that . This unpacks as: and we have that

Projections preserve values, so we can take some nirvana-free point with a value of nearly 0, and project it down to (belief function of the empty policy-stub) where there’s no nirvana possible because no events happen.

So, we’ve got values of nearly zero in there. Do we have a point with a value of exactly zero? Yes. It’s closed, and has bounded minimals, so we can go “all positive functionals are minimized by a minimal point”, to get a point with a value of exactly zero. Then, we can invoke Lemma 21 (we showed consistency, extreme points, and all else required to invoke it) to decompose our point into the projections of nirvana-free stuff from above, all of which must have a value of 0. So, there’s a nirvana-free point in some policy with 0 value.

Now for the other direction. Let’s show that .

This unpacks as:

We’ll show this by disproving that the sup is <1, and disproving that the sup is >1.

First, assume that, regardless of , Then, regardless of we can pick some totally arbitrary , and there’s a nirvana-free point with a value of or less. By consistency, we can project it down into , to get a nirvana-free point with a value of or less. Thus, regardless of the stub we pick, there’s nirvana-free points where Murphy can force a value of or less, which contradicts

What if it’s above 1? Assume there’s some where

From uniform-continuity-Hausdorff, pick some to get a Hausdorff-distance or lower (for stuff obeying the bound, which all minimal points of do for all ). This specifies some extremely large n, consider . Now, consider the set of every policy above . All of these are or less away from . Also, remember that the particular sort of preimage-to-infinity that we used for Hausdorff-continuity slices away all the nirvana.

So, Murphy, acting on , can only force a value of or higher. Now, there can be no nirvana-free point in with . The reason for this is that, since is or less away from , there’s a nirvana-free point in that’s away, and thus has , which is impossible.

Ok, so all the nirvana-free points in where have .

Now, since we have Lemma 21, we can go “hm, equals the convex hull of the projections of . Thus, any minimal point with is a finite mix of nirvana-free stuff from above, one of which must have . But we get a contradiction with the fact that there’s no nirvana-free point from above with a value that low, they’re all

So, since we’ve disproved both cases, . And we’re done with normalization! On to causality and pseudocausality.

Condition C: Causality.

An “outcome function” for is a function that maps a to a point in , s.t. for all .

Causality is, if you have a , you can always find an outcome function where . Sadly, all we have is causality over stubs. We’ll be using the usual identification between and .

Anyways, fix a and a point in . Project down to get a sequence . By causality for stubs, we can find an where, for all , . Observe that there are countably many stubs, and no matter the n, all the and values are the same because projection preserves those. We can view as a sequence in

By stub closure, and a and bound, this is a product of compact sets, and thus compact by Tychonoff (no axiom of choice needed, it’s just a countable product of compact metric spaces), so we can get a limiting (because it’s only defined over stubs).

An outcome function for stubs fixes an outcome function for all partial policies, by

We’ve got several things to show now. We need to show that is an outcome function, that is well-defined, that and that it’s actually an outcome function.

For showing that is an outcome function, observe that projection is continuous, and, letting n index our convergent subsequence of interest, regardless of stub , . With this,

Now, let’s show that is well-defined. Since is an outcome function, all the points project down onto each other, so we can invoke Lemma 9 to show that the preimage is nonempty. If the preimage had multiple points, we could project down to some finite stage to observe their difference, but nope, they always project to the same point. So it does pick out a single well-defined point, and it lies in ) by being a subset of the defining sequence of intersection of preimages.

Does ? Well, projected down to all the . If , then So, the limit specification has for all n. The only thing that projects down to make all the is itself, so .

Last thing to check: Is an outcome function over partial policies? Well, if , then for all n, . Assume . Then, in that case, we can project down to some and they’ll still be unequal. However, since projections commute, it doesn’t matter whether you project down to and then to , or whether you project down to (making ), and then project down to (making ). Wait, hang on, this is the exact point that projects down to, contradiction. Therefore it’s an outcome function.

And we’re done, we took an arbitrary and , and got an outcome function with , showing causality.

Condition P: Pseudocausality: If , and ’s support is on , then .

But all we have is, if and ’s support is on , then .

There’s a subtlety here. Our exact formulation of pseudocausality we want is the condition , so if the measure is 0, then support is the empty set, which is trivially a subset of everything, then pseudocausality transfers it to all partial policies.

Ok, so let’s assume that , and the measure part has its support being a subset of but yet is not in . Then, since this is an intersection of preimages from below, there should be some finite level that you can project down to (it’s present in , just maybe not in ) where the projection of (call it ) lies outside (lying outside the intersection of preimages)

This is basically “take , chop it off at height n”. However, since , you can project it down to . Which does the exact same thing of chopping off at height n, getting you exactly. We can invoke stub-pseudocausality (because with full measure, the history will land in , then with full measure, the truncated history will land in as the latter is prefixes of the former, or maybe the full measure is 0 in which case pseudocausality transfer still works) to conclude that actually lies inside , getting a contradiction. This establishes pseudocausality in full generality.

Ok, so we have one direction. is a hypothesis, if fulfills analogues of the hypothesis conditions for the finitary stub case. Our proof of everything doesn’t distinguish between causal and surcausal, and the arguments work for all types of hypotheses, whether causal, surcausal, pseudocausal, or acausal. Ok, we’re 1/4 of the way through. Now we do the same thing, but for building everything from infinitary hypotheses.

PART 2: Infinitary hypotheses. We now consider , defined as

We’ll show that with 8+2 defining conditions, the 9+2 defining conditions for a hypothesis hold for . The 8+2 conditions for a are:

1: Infinitary Nirvana-Free Nonemptiness:

2: Infinitary Closure:

3: Infinitary Convexity.

4: Infinitary Nirvana-free Upper-Completeness

5: Infinitary Bounded Minimals:

6: Normalization: and

7: Nirvana-free consistency.

8: Infinitary Uniform Hausdorff Continuity:

The function is uniformly continuous.

C: Infinitary Causality: Regardless of and , there’s an outcome function over full policies s.t. , and for all and ,

P: Infinitary Pseudocausality:

Let’s begin showing the conditions.

Condition 1: Nirvana-free Nonemptiness.

This one is trivial. Pick some . There’s a nirvana-free point. Project it down. You get a nirvana-free point and you’re done.

Conditions 2 and 3, Closure and convexity. We explicitly took the closed convex hull when defining everything, these are tautological.

Condition 4: Nirvana-free upper completion.

For the pseudo/​acausal case, it’s doable by Lemmas 10, 11, and 12. The projection of an upper-complete set (by infinitary nirvana-free upper-completion) is upper-complete, so the union of projections is upper-complete, and then the convex hull is upper-complete, and then the closure is upper-complete and we’re done.

We’ll have to loop back to the causal case of Nirvana-free Upper Completion later, because we need Lemma 21 to make it go through and that requires consistency and the extreme point condition to make it work.

Condition 5: Bounded Minimals.

We can break this down into three phases. First is showing that all points in the projection set have something under them that respects the bound. Second is showing that all points in the convex hull of the union of projection sets have something under them that respects the bound. Third is showing that all points in the closure have something under them that respects the usual bound. The reason we have to phrase it this way is that we don’t necessarily know that our sets of interest are closed until the end, so we can’t find a minimal point, just a bounded one that is lower, but that suffices to show that a “minimal point” that violates the restricted minimal condition isn’t actually minimal.

For part 1, let . Then, is the projection of some point . By infinitary bounded-minimals, we can find a minimal point below that obeys the bound, so . Projecting down is linear, so we get , and is below and fulfills the bound.

For part 2, let We can rewrite as , and then, by part 1, decompose the into (not actually minimal, just a point obeying the bound) and . Then we can decompose further into . The former is an a-measure (mix of a-measures) and obeys the bound since all its components do, and it’s in the relevant convex hull, witnessing that has a point below it in the convex hull that obeys the bounds.
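The regrouping in part 2 is just linearity of mixing. As a sketch, with notation that is assumed for this illustration rather than recovered from the post: $M$ is the point in the convex hull, the $M_i$ are the projected points it mixes, $M_i^{lo}$ is the bound-respecting point below $M_i$ from part 1, and $M_i^{*}$ is the leftover sa-measure.

```latex
% Assumed notation: M = mixture, M_i = mixed points,
% M_i^{lo} = bounded point below M_i (from part 1), M_i^* = the remainder.
\begin{align*}
M &= \sum_i p_i M_i, \qquad p_i \ge 0,\ \sum_i p_i = 1 \\
  &= \sum_i p_i \big(M_i^{lo} + M_i^{*}\big) \\
  &= \Big(\sum_i p_i M_i^{lo}\Big) + \Big(\sum_i p_i M_i^{*}\Big).
\end{align*}
```

The first bracket is the mix of a-measures that obeys the bound, and the second bracket is what sits on top of it, so the first bracket is the point below $M$ that the argument needs.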

For part 3, let be in the closure of the convex hull. There’s some sequence in the convex hull that limits to . Below each we can find a (again, not actually minimal) that obeys the bound. Invoke Lemma 16 to get a point below that respects the bounds, and we’re done.

Condition 6: Normalization.

We literally have the exact phrasing of normalization we need already, this is a tautology.

Condition 7: Consistency.

Ok, one direction is trivial because , so we can just use the definition of .

The other direction, that everything equals the intersection of preimages of stuff below it, is trickier. One subset direction isn’t too bad, the one that

If we take a that wasn’t added in the final closure step, it’s expressible as , and all the come from points in where . Projecting the down to a instead makes , which mix together in the same way to make . Because projections are linear and commute, is the projection of . So, any point in (without the closure step) projects down to lie in for any .

Then, for the closure step, we just fix a sequence limiting to . The can project down to whichever you wish, and by continuity of projection, the comes along for the ride as a limit point. However, is closed, so projects down to land in that set as well. Bam, any old projects down to land in any set you wish with , certifying that lies in the intersection of preimages of stubs below.

Now, we just have to establish which splits into two cases. The causal/​surcausal case, and the pseudocausal/​acausal case where you don’t have to worry about nirvana.

For the nirvana-free case… We can use the same proof strategy as the last part of Lemma 21, where we were showing the result for partial policies. It may be a bit nonobvious why it works. We do need to swap things around a bit, and will mention important changes without fleshing out the fiddly details, which are already given in the last part of Lemma 21.

Start with a in the intersection of preimages of stubs below. To show it’s in , we need a sequence limiting to it, where each member of the sequence is a mix of finitely many points projected down from policies above . The end part of Lemma 21 gives how to construct such a sequence. The fact that we’re working in a nirvana-free setting means you can ignore all fiddly details about points being nirvana-free and preimages of only the nirvana-free parts, because everything fulfills that. The key steps in that proof path are:

1: being able to project down to make a sequence . We trivially have this by being defined as “in the intersection of preimages of stubs below it”.

2: Having uniform Hausdorff-continuity for the policies. This is our condition 8 we’re assuming, so we’re good there.

3: The ability to shatter our into finitely many which are the projections of various points from above. This is the key difference. The proof of Lemma 21 had to set up that fact beforehand. However, in our case, we have the Nirvana-free consistency condition, which says

But, since we’re working in the nirvana-free setting, this turns into:

And that right-hand term… is just the definition of ! So, swapping that out, and specializing our arbitrary stub to , we have:

So, since our lie in , they can be written as a finite mix of nirvana-free things from above projected down, and the Lemma 21 argument goes through.

Now, for the nirvana cases where we can assume infinitary causality. We’ll do this by showing a little sublemma, that, if , then

First, we’ll show that if , then

Fix an arbitrary in the projection of , we can get a preimage point . Then, by infinitary causality, we can make a point that projects down to . Just make an outcome function where , feed in , that gets you your point , the two agree when you project them down to , and is further down than that and projections commute so they both hit the same point if you project down. Flipping and shows our equality.

Alright, so now can be written as where is arbitrary above .

Projection is linear, so the projection of a convex set is convex. To get the closure points, just take a sequence in the projection limiting to some . Take preimage points . There’s a bound on the and values of this sequence because projections preserve and and our sequence converges, so we can apply the Compactness Lemma and get a convergent subsequence limiting to a point , which must be in because closure. Projection is continuous, so projects down to . And we have proved! Wow, that was a sublemma of case 2 of part 3 of the proof of condition 7 in part 2 of the proof of the Isomorphism theorem, we’re really in the weeds at this point.

Moving on, how can we use this to show for the causal case, which is the last bit we need to show consistency?

Well, fix a in the intersection of preimages, and an arbitrary . projects down to make some . Since , and we have our sublemma, there’s a point in that projects down to some , and further down to .

This sequence all has the same and value since projection preserves those, so by the Compactness Lemma and closure, there’s a convergent subsequence and limit point . Does project down onto (witnessing that )?

Well, let’s say it didn’t and projecting down gets you a distinct point. Then there’s some n where projecting down further to would keep the points distinct, since they have to differ at some finite time. But… after time n, our sequence is roaming around entirely in the preimage of , so the limit point is in there too, and it projects down to and we have a contradiction. Therefore, projects down onto , witnessing that , and was arbitrary in the intersection of preimages.

So, we have for the nirvana-containing causal/​surcausal case, which is the last piece we needed to show consistency.

Condition 8: Extreme point condition.

The thing we want is that an extreme nirvana-free minimal point in is the projection of a nirvana-free point from a policy above it. By the nirvana-free consistency property, lies in

is extreme, so it lies in

So, there’s some point in some that projects down to and we’re done.

Condition 9: Hausdorff Continuity:

Ok, this one is going to be complicated. We’ll work with the Lemma 15 version of Hausdorff-continuity, where a difference between two policies means that if you start off in one preimage, you’ve gotta travel distance or less to get to the preimage associated with the other policy, and vice-versa.

We split into two parts. Part 1 is showing that if , and the distance between and is or less, the distance between their respective preimages is low. Part 2 is showing that if and are apart, then we can exploit part 1 to get that the distance between the preimages is low, and will be pretty easy after we get part 1.

Our Hausdorff-continuity condition links and . So, when we fix an and are like “how close do the policies have to be to guarantee the preimages are apart”, pick the that gets you distance w.r.t our original Hausdorff-continuity condition, and also have .

Our first part uses and for points in , for a point in , and for points in (that are expressible as a finite mix of points from for varying ), and for two more general points in .

So, one half of showing the two preimages are close to each other is trivial. Everything in projects down into by consistency, and projection preserving nirvana-freeness, so the preimage associated with is a superset of the preimage associated with , so there’s distance 0 from a point in the preimage to a point in the preimage.

The other half is trickier. Pick an arbitrary point in and is the value of this point. projects down to some point in . From nirvana-free consistency, , can then be produced by (keeping in mind that it doesn’t matter whether we mix before or after projecting) finitely many points in varying sets that are mixed to make a point , and then projected down.

Important note: is not necessarily equal to, or even close to, .

Because is within of , every policy above has a corresponding policy within that lies above . Thus, we can perturb the component points (indexed by i) that mix to make by (infinitary Hausdorff-uniform-continuity Lemma 15 variant, was assumed to be small enough for that to be the case), and mix them, to get a in

projects down to to make a , and further projects down to to make a . Because is within of , projecting down (and projecting being nonexpansive) means that is within of .
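The "projecting is nonexpansive" step can be illustrated on a toy example. This sketch assumes measures are finite dictionaries over histories and uses plain L1 distance as a stand-in for the metric actually used in the post, so it only illustrates the shape of the claim. `project` and `l1_distance` are hypothetical helper names.

```python
from fractions import Fraction as F

def project(measure, n):
    """Marginalize a measure over histories (tuples) onto their length-n prefixes."""
    out = {}
    for history, mass in measure.items():
        prefix = history[:n]
        out[prefix] = out.get(prefix, F(0)) + mass
    return out

def l1_distance(m1, m2):
    """Sum of absolute mass differences (a stand-in metric, assumed for illustration)."""
    keys = set(m1) | set(m2)
    return sum((abs(m1.get(k, F(0)) - m2.get(k, F(0))) for k in keys), F(0))

# Two toy measures that disagree only on what happens after the prefix "a".
m1 = {("a", "x"): F(1, 2), ("a", "y"): F(1, 2)}
m2 = {("a", "x"): F(1, 4), ("a", "y"): F(3, 4)}

before = l1_distance(m1, m2)
after = l1_distance(project(m1, 1), project(m2, 1))
print(after <= before)
```

Marginalizing can only merge differences (possibly cancelling them), never create new ones, which is the reason the distance cannot grow.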

Now, we can take and fill in all the missing measure data to get a that projects down onto (certifying that it’s in the preimage of ), as follows. Our most important constraint is that, when extending , it should perfectly mimic so it can project down onto it. Our second constraint is that, if doesn’t specify what comes after a finite history and it doesn’t conflict with the first constraint, it should exactly mimic the conditional probabilities of . Also, our fixes a first time n (i.e., ) at which is defined where isn’t, so all conflicts of the second constraint with the first constraint must happen after then. This does the following: We can slice the histories assigned measures by into three parts.

Part 1 is prefixes of histories in . There’s only difference in these between and (after all, projecting down to leaves these unchanged, and /​ project down to and which are only apart).

Part 2 is histories which have as a prefix something in less than length n. In that case, we’re mimicking the conditional probabilities of .

Part 3 is histories which have as a prefix something in of length n or higher. Because this is the threshold where and start differing, we’ve got to obey the probabilities. But this only occurs after time n.

Let’s analyze the difference between and , shall we? Our two relevant results are Vanessa’s folk result that two distributions that differ by an amount will differ by the same amount if we extend them with the same conditional probabilities, and the result from the proof of Lemma 15 that arbitrarily reshuffling the measure/​amount of dirt after time n takes effort, where is the value of the measure you’re reshuffling.
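The folk result about extending with the same conditional probabilities can be checked on a small example. The sketch below assumes finite distributions over prefixes, a shared one-step conditional kernel, and L1 distance as a stand-in for the post's metric. `extend`, `l1_distance`, and `kernel` are hypothetical names used only for illustration.

```python
from fractions import Fraction as F

def extend(dist, kernel):
    """Extend a distribution over prefixes by one step.

    kernel[prefix][nxt] is the conditional probability of `nxt` given `prefix`.
    """
    return {prefix + (nxt,): mass * prob
            for prefix, mass in dist.items()
            for nxt, prob in kernel[prefix].items()}

def l1_distance(m1, m2):
    """Sum of absolute mass differences (a stand-in metric, assumed for illustration)."""
    keys = set(m1) | set(m2)
    return sum((abs(m1.get(k, F(0)) - m2.get(k, F(0))) for k in keys), F(0))

# Two distributions over length-1 prefixes that differ slightly.
p = {("a",): F(1, 2), ("b",): F(1, 2)}
q = {("a",): F(3, 5), ("b",): F(2, 5)}

# The same conditional probabilities are used to extend both.
kernel = {("a",): {"x": F(1, 3), "y": F(2, 3)}, ("b",): {"x": F(1, 1)}}

before = l1_distance(p, q)
after = l1_distance(extend(p, kernel), extend(q, kernel))
print(before == after)
```

In this L1 setting the equality is exact: each prefix's difference gets split across its continuations in proportions that sum to 1, so the total difference is unchanged.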

So, we start off with a distance (includes the term) between and . Then, extending up further to fill in everything up till time n, and mimic the conditional probabilities of each other. Still a distance between them at this stage. Finally, after time n, may go its own arbitrary way because it’s gotta be compliant with , and to reshuffle this around, it takes effort. So, the net distance between (an arbitrary point in the preimage of ) and (a specially crafted point in the preimage of ) is below .

Wait, n (time of first difference) was since and are only apart, and can be at most because values are preserved by projections, and and are only distance apart, so no more than that amount of dirt is the difference between the two. Finally, we assumed . So, we get:

And we have our appropriate distance bound between preimages! Now to use this in part 2, which should go a lot faster.

Time for part 2, to get full generality. Pick two partial policies and and assume the distance between them is . Then, the stub given by ” but cut it off so it’s undefined after time n (where n is )” is within distance of both and . Further, . Then, take some point in the preimage of . It’s also in the preimage of . Because is at a distance of from , we only have to go distance to get a point in the preimage of , and then reverse and and we’re done!

By Lemma 15, this establishes uniform continuity for the function mapping partial policies to the preimage of their nirvana-free part in the space of all nirvana-free measures over infinite histories.

Condition 4: Nirvana-free upper completeness (causal case)

Now that we’ve nabbed every nice condition other than this one, we can invoke Lemma 21 (we only require upper completion on the infinite levels, which we have) to get that the nirvana-free part is the (closed) convex hull of the projections of nirvana-free stuff from above. Then, just appeal to lemmas 11, 12, and 13, that the closed convex hull of projections of nirvana-free upper-complete sets is nirvana-free upper-complete.

Condition C: Causality.

We showed part of this all the way back in our consistency argument. For causal/​surcausal, regardless of which we picked. We’ll be using this.

Pick some arbitrary and . has a preimage point where . We get an outcome function mapping policies to points in their associated sets s.t. . Extend this to all points by defining

Ok, we need to show that: This actually singles out a unique point and isn’t an invalid definition, said point is in , that , and that it’s an outcome function.

Assuming this is actually well-defined, is in trivially because it’s a projection of a point from above. Also, which cleans up that part. Now for showing that it’s an outcome function.

So, we got everything assuming the extension is well-defined, let’s show that. Pick any two above any . We’ll show that they project to the same point.

And we’re done with causality! Now for pseudocausality.

Condition P: Pseudocausality.

We’ll do this in two steps. One is showing that for stubs, points which meet the appropriate conditions are also present in all the requisite other stubs. Step 2 is generalizing this to all points in .

Let’s say you have some . By Nirvana-free consistency, so we can shatter it into finitely many that are projections of stuff from above, . The support of the measure component of is a subset of , so the same must apply to all the .

Now, what we can do is make a that mimics the behavior of for all prefixes and extensions of strings in , but otherwise mimics , and extends if needed in some random-ass way, and is above .

The reason we can do this is because, if there’s a contradiction in this construction, it would be from and behaving differently on some prefix or extension of a string in . But, can’t specify what to do for any nodes in or later (because is basically a coat of leaf observation nodes around the extent of ), and if and differ on a strict prefix of something in , then that means that and branch different ways so there’s no node in both and after the branch point, so again we get a contradiction.

Anyways, we’ve crafted our finitely many which lie above , and mimic going forward. Our is an extension of whose measure component is only supported on . Also, before and past that, mimics perfectly, so we can transfer to by infinitary pseudocausality. Do this for all the i. Then, projecting all those down to , we get that all the lie in , and mixing them together, we get that itself lies in .

Now for part 2, where we show it for partial policies in general. Let be arbitrary in . Project down to all the to make a sequence of . Since the support of the measure component of is a subset of (pseudocausality assumption) the support of the measure component of is a subset of , so by pseudocausality for stubs which we’ve shown, is also present in . Then, take the preimage in of all those points. By consistency and Lemma 9 and the usual argument about “there can only be one preimage point for a series of points”, we get that itself (the only thing that could project down on for all n) lies in and we’re done with pseudocausality.

Alright, that’s most of the proof out of the way, all that’s left is showing that the full belief function conditions imply the finitary and infinitary versions, respectively, and getting isomorphism. Let’s begin.

Let’s check whether makes a stub-hypothesis, and whether makes an infinitary-hypothesis, if is a hypothesis/​fulfills all the conditions. is just “restrict to only reporting sets for stubs”, and is just “restrict to only reporting sets for full policies”

The variants of nonemptiness, closure, convexity, nirvana-free upper-completion, bounded minimals, Hausdorff-continuity, and pseudocausality for the finite and infinite cases are trivially implied by the corresponding conditions for hypotheses, leaving the four moderately nontrivial cases of the analogues of normalization, consistency, the extreme point condition, and causality.

Extreme point condition: The infinitary case doesn’t have an analogue of the extreme point condition. So that leaves the finitary case. What we can do is take a nirvana-free extreme minimal point in some , apply the general extreme point condition to get a nirvana-free for some suitable that projects down to , and, clipping away the infinite parts by , the projections of fill the role of the points in all below some policy that project down to .

Causality. The finite case is that we can take a point associated with some stub, and craft an outcome function for stubs that matches up with our point. This is trivially implied by the general case of causality, where you can take any partial policy and point and get an outcome function that matches up with it. The infinite case is that we can take a point in , and get points for all the other that project down appropriately. For this, again, we just take an outcome function for and clip it off to the infinite levels.

Consistency: The finite case of weak consistency is pretty easy. We get

Where the subset came from full consistency because everything is the closed convex hull of projections from above, so projecting down gets you a subset. For the Nirvana-free consistency condition for the infinite case, it’s a simple consequence of Lemma 21.

Normalization: and

To begin, the normalization condition for infinitary hypotheses and general hypotheses is the exact same, so we can ignore that and work on the stub hypothesis case. The inf one is pretty easy. From general normalization, at the infinite level, there are and nirvana-free points in with a value at-or-near zero, and you can just project them down to any stub you want.

The sup one is a bit trickier. It’s obviously not above 1, because no matter what policy you pick, you’ve got a nirvana-free point with in , which you can project down to whichever stub you’re looking at, to certify that the expectation of 1 is 1 or less. Showing that it isn’t below 1 is a bit harder.

Let’s say there’s some where (or arbitrarily close to 1, doesn’t really matter, although we’ll show later that there is indeed a maximizing policy where Murphy can only force a value of 1)

From Hausdorff-continuity, pick some to get an Hausdorff-distance or lower. This specifies some extremely large n, consider . Now, consider the set of every policy above . All of these are or less away from .

By Hausdorff-continuity, there can’t be a nirvana-free point in any with , because we could do an perturbation to get a point in with , because small changes in induce small changes in and . Or, we can add a little bit of wiggle room if the minimizing value of in is slightly less than 1

However, any nirvana-free point in must originate as a mix of finitely many points from (varying as long as it’s above ) that have been projected down. This is because, by our earlier proof of nirvana-free consistency from consistency in general,

All of these projected points have , so the mix point has , so Murphy can only force a value of or higher. And we can make as small as we wish to get a stub below (n extremely large) where is as small as we wish, so the sup of the values Murphy can force over all stubs can’t be below 1. So it must be 1.

Isomorphism! Let’s go! As a quick recap,

And /​ is just “clip down your hypothesis to full policies/​stubs”.

So, two parts of this are trivially easy. From earlier in the proof (the start of the first section for the stub one, and an obvious corollary of definitions for the full policy one), we established that and . Using this, and

So, and

Let’s get fancier and show the other two.

The first two equalities are unpacking definitions, the third is consistency for .

Again, first two equalities are unpacking definitions, the third is consistency for . So, and

Putting it together, and make an isomorphism between and , and and make an isomorphism between and . We’re finally done!

Proposition 1: If fulfills the causality condition, nonemptiness, closure, and convexity, then is a nonempty, closed, convex set of a-environments or a-survironments. . Also, .

Ok, what is, is the set of a-environments where, regardless of , lies in . For nonemptiness, pick some arbitrary point in one of your , use causality to get an outcome function, and then you fill in the conditional probabilities for an action-observation sequence with your outcome function points. This never produces a contradiction anywhere because if there was a contradiction, you’d be able to project two specified points down and have them disagree somewhere, which is impossible because we have an outcome function.

For closure, if you take a limit of a-environments, this makes a limiting sequence in all the , which are all closed, so the limit point environment has all its induced distributions lying in the usual , and is in

For convexity, if you take a mix of a-environments, this makes the same mix in all the which are all convex, so the mixed environment has all its induced distributions lying in the usual , and is in .

For equality, if , then it originated from some a-environment made from an outcome function for , which… just gets your original point so . In the other direction, if , by causality, we can project down and extend the specification and make an a-environment that acts like on , and then going back gets you .

In the other direction, if , then it induces an outcome function and you can go back from that to , so

Theorem 3.1: Pseudocausal Translation: For all pseudocausal hypotheses defined only on policy stubs, is a causal hypothesis only defined on policy stubs. . For all causal hypotheses defined only on policy stubs, is a pseudocausal hypothesis only defined on policy stubs.

Theorem 3.2: Acausal Translation: For all acausal hypotheses defined only on policy stubs, is a surcausal hypothesis only defined on policy stubs. . For all surcausal hypotheses defined only on policy stubs, is an acausal hypothesis only defined on policy stubs.

Both these theorems have highly similar proofs, so let’s group them together. First, we’ll need to set up how and work, and then knock out two lemmas we’ll need before we can proceed to the main result. is defined by is defined identically, just with instead of , and closed convex hull permitting us to mix with probability.

where (this is like the inverse of projection, it’s going up instead of down) is a function defined by: If , then . If and isn’t in , then

from is just pushing through the mapping . You keep the term the same, and push the measure terms up. is defined identically on the measure part, except that it has the rule that all nirvana events in and not in with 0 measure get measure instead.

Intuitively, what and are doing, is capping off whatever they need to (in order to extend appropriately) with Nirvana. is capping off positive-probability histories with guaranteed Nirvana immediately afterwards, where is more paranoid and caps off every 0-probability Nirvana history that got added with “it is possible that Nirvana occurs here”.

Let’s go over some properties that and fulfill. is an injective continuous map , and is an injective continuous map . and are undone by projecting back down, . Both and are linear, the latter in the stronger sense that it’s linear when you mix stuff with probability, it doesn’t matter whether you mix before or after injecting up. Further, injections up commute, , and the same for .

In order to make progress, we want to get two important lemmas. The first one, Lemma 22, is that slicing away the nirvana from this thing recovers the original pseudocausal hypothesis. The second one I call the “Diamond Lemma”, and it says that injecting up and projecting down is the same as projecting down and then injecting up, and if you sketch it out, it looks like a diamond.

Lemma 22: , and the same holds for .

Proof sketch: One direction is trivial, the other direction that doesn’t add any new nirvana-free points is trickier. Working in the pseudocausal-to-causal setting, we can take some that’s nirvana-free in the closed convex hull, and get a sequence limiting to it where each is in the convex hull. Now, indexing stubs below by i, the can all be viewed as a mix of points projected up from below. The problem is, the mix varies as n does. What we can do is separate into “good” i where we can get a suitable limit point and limit probability, and “bad” i that we have to treat as a special chunk, and reexpress as a sum of a probabilistic mix of “good” injected up, and an additional “bad” chunk. We can show that the “good” can all be transferred up to itself by pseudocausality and mixed in there, and the “bad” chunk is a nirvana-free a-measure. So, is the sum of a point in , plus a nirvana-free a-measure, so lies in by nirvana-free upper completion.

Working in the surcausal-to-acausal setting, we take our in the closed convex hull and a sequence , but injection up in this setting is much more effective at adding Nirvana, and the surmetric is much more sensitive than the usual metric for noticing the presence or absence of Nirvana. So, only an initial segment of is “contaminated” with Nirvana since the limit point is Nirvana-free, and we can clip that part off, and the “uncontaminated tail” can only have come from itself because injection up is very aggressive with adding Nirvana, so we get it from just closure on .

Proof: For , just observe that the identity injection leaves completely unchanged and adds no nirvana, so any point in also lies in the closed convex hull of the injections up, and is nirvana-free because the original point that we mapped through identity was nirvana-free. This works with the surcausal case too.

Now for the considerably more difficult reverse direction, for the pseudocausal-to-causal case first.

If , then unpacking that, is nirvana-free, and lies in the closed convex hull of the injections up. So, we can fix a sequence in the convex hull of injections up that limits to . Index the stubs below by i, there’s only finitely many of them.

The can be written as where . may be 0. This is because, if there’s multiple points in the injection of a particular stub that are mixed, you can mix them before injecting up to get a single that’s injected up, because injections are linear and we’re injecting a convex set.

Blessed by the gift of finitely many i to worry about, use repeated picking of subsequences to get a subsequence of n where:

For all i, converges. Call the limiting values . Now split the i into good i where , and bad i where . The will sum up to 1.

For all good i, always. The fact that they all limit to above 0 helps you out because you only have to trim off an initial segment.

For all good i, converges, call the limit point . This is because injection up preserves and , and is bounded above 0, so the value of the is upper-bounded by , which is a finite nonnegative number divided by a finite positive number, and we can apply the Compactness Lemma to establish that a convergent subsequence exists. In this case, is the bound as a whole for the sequence , which converges so it must have a bound of that form, and not the bound on minimal points.
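The "finite nonnegative number divided by a finite positive number" bound is just the following arithmetic: if a weighted sum of nonnegative sizes is at most some total bound, and a given weight stays above a positive floor, then that component's size is at most the total bound divided by the floor. A minimal sketch with made-up numbers (the names `component_bound`, `weights`, and `sizes` are hypothetical; `sizes` stands in for the bounded quantity of each component):

```python
from fractions import Fraction as F

def component_bound(total_bound, weight_floor):
    """Upper bound on any component whose mixing weight is at least weight_floor > 0."""
    if weight_floor <= 0:
        raise ValueError("the weight floor must be strictly positive")
    return total_bound / weight_floor

# Illustrative mixing weights (all bounded away from 0) and nonnegative component sizes.
weights = [F(1, 2), F(3, 10), F(1, 5)]
sizes = [F(2), F(5), F(1)]

# The size of the mixture is the weighted sum of the component sizes.
total = sum((w * s for w, s in zip(weights, sizes)), F(0))
bound = component_bound(total, min(weights))

print(all(s <= bound for s in sizes))
```

This is why the argument needs the good weights to be bounded above 0: for bad i the floor is 0 and the division gives no bound, which is the reason the bad chunk is handled separately as a single partial sum.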

Finally, converges. This is doable because the sequence has bounded and because it converges to something, so the partial sum of bad i has the same bound, so we can invoke the Compactness Lemma to get our convergent subsequences.

Putting all this together (we kept selecting from compact sets so that is what let us build a subsequence with all these great properties at once) we have a decomposition of itself into:

Now, since (all the bad i had their probability components limit to 0), that first sum part looks like an actual mixture of points injected up! Since is nirvana-free, both parts must be nirvana-free, and the sums are also a-measures.

First, by closure of all our original , all the components (where i is good) do lie in . And when we inject the up, since the mix of them is nirvana-free, this means that each individual must be nirvana-free after injection.

Now, what injection does, is it caps Nirvana on everything that is in and not in that has positive probability. So, if is nirvana-free after injection, this must mean that its measure component is only supported on . Via pseudocausality, this means that lies in itself! Also, .

So, our sum over good i components (by convexity), is actually a probabilistic mixture of stuff in itself! Abbreviating as , which lies in by convexity, and rewriting the sum, we can reexpress as:

This is a nirvana-free a-measure in , plus a nirvana-free a-measure, so, by nirvana-free upper-completion, lies in and we’re done. Now, let’s hit up the surcausal case.

Assume . is nirvana-free, and lies in the closed convex hull of the injections up. So, we can fix a sequence in the convex hull of injections up that limits to . Index the stubs below by i, there’s only finitely many of them, reserve i=0 for itself.

The can be written as where . may be 0 or . This is because, if there’s multiple points in the injection of a particular stub, you can mix them before injecting up to get your single point, one for each , because injections are linear and we’re injecting a convex set.

Note that is nirvana-free, and there’s only finitely many spots where nirvana could be since we’re working in a stub, so past a certain point all the will be nirvana-free due to the surmetric we’re using. Let’s clip off that initial segment that’s contaminated with Nirvana. Now, we can get something very interesting. If , then injecting up anything at all is going to stick nirvana (maybe with measure) somewhere. Having be doesn’t help you, because mixing with a nirvana-containing thing with probability means the mixture contains the nirvana-spots of that thing you mixed in. So, past a certain point, all the can only be written as (the identity injection, anything else either has exactly 0 probability so it gets clipped out of the sum, or it has Nirvana somewhere and can’t be present).

Therefore, in the tail, the sequence of limiting to is the same as limiting to some , so and it’s actually an a-measure, not an a-surmeasure. This establishes Lemma 22 for the sur-case.

Lemma 23/​Diamond Lemma: For any , and any , and any , then: (and same for and the sur variants)

It’s called the Diamond Lemma because if you sketch out the injections as going diagonally up and the projections as going diagonally down, the commutative diagram looks like a diamond.

To begin with, note that there’s an upper bound on and . For every finite history in , there’s an extension of that history in , which has a prefix in , and vice-versa. This establishes that for all the finitely many histories in , either a prefix of that history lies in , or an extension of that history lies in , and vice-versa for .

Now, we can split into three possible cases and show that up-then-down equals down-then-up in terms of what measure is assigned to a history in by mapping through the injections and projections, which shows the diamond lemma in full generality.

In the first case, our history in is also in (the equality case). In this case, also lies in . Projecting down to inf does nothing to the measure on , and embedding up also does nothing to the measure on . Embedding up to also does nothing to the measure on , and projecting down doesn’t affect it either.

In the second case, our history in isn’t in , but there are strict extensions that lie in (this requires to be nirvana-free). is still assigned a measure by , though, being a prefix of stuff with measure. In this case, also lies in . The same analysis from our first case works, doesn’t have its measure disrupted.

In the third case, our history in isn’t in , but a strict prefix lies in . We can distinguish three subcases. In the first subcase, is of the form . In the second subcase, still ends with Nirvana, but it isn’t immediately after happens, some stuff happens in the meantime first. In the third subcase, doesn’t end with Nirvana. Also, lies in .

For the first subcase where is of the form , injecting up means now has the measure originally associated with and nirvana is marked as “possible” there (if we’re using the sur-injection). Projecting down leaves this alone. Projecting down leaves the measure on alone, and injecting up means now has the measure originally associated with and nirvana is marked as “possible” there (if we’re using the sur-injection). In both paths, ends up with the measure that started with, and nirvana marked as “possible” in the sur-case.

For the second subcase where is of the form “, then some stuff happens, then Nirvana occurs”, then in the causal case, the injection up would assign 0 measure (all the measure of got channeled into instead of ), and then projecting down, it stays the same. Similarly, projecting down means has some measure, then it’s all channeled into on the injection up, so itself gets 0 measure. For the surcausal case, the injection up assigns measure (by the same argument and sur-injections tagging every freshly-added nirvana outcome with measure), and projecting down, it remains with measure. Projecting down leaves alone and then injecting up tags with measure.

For the third subcase where is an extension of that doesn’t add any Nirvana, we can run through the same argument as the second subcase to conclude that we get 0 measure for both the causal case and the surcausal one.

Thus, no matter whether we inject up and project down, or project down and inject up, the measure assigned to by the measure component of the p-(sur)measure will agree.

An important thing to note with this is that we can use any stub above and for the injection up, but we must use for the projection down.

Now, we can finally embark on the proof of the two translation theorems! There’s enough similarities between the proofs that we can just do one big proof and remark on any differences we come across. The things we must show are that slicing off the Nirvana from a causal/​surcausal hypothesis makes a pseudocausal/​acausal hypothesis, and that adding in those injections up can turn a pseudocausal/​acausal hypothesis to a causal/​surcausal one, and that going nirvana-free to nirvana-containing back to nirvana-free is identity.

Proof sketch: While at first this may look like the proof will be almost as long as the Isomorphism theorem because we’re verifying a list of 9 conditions twice over, it’ll actually be considerably shorter. The only nontrivial part of the first part where we check that slicing off the Nirvana makes a pseudocausal/​acausal hypothesis is deriving pseudocausality from causality, and even that is fairly easy.

Going from pseudocausal/acausal to causal/surcausal is trickier, though thankfully most conditions are trivially true, there’s only three notable ones. There’s the bound on minimal points, which is done by taking a sequence limiting to a that violates the bound, using the definition of the causal translation to get a point below each which obeys the bound (fairly nontrivial), and appealing to Lemma 16 to construct a limit point below that obeys the bound. Showing weak consistency (projecting down makes a subset) requires the Diamond Lemma to write the projection of your point of interest as a mix of injections up from below. The last tricky one is causality, which requires first showing that injecting up to a higher stub won’t add any new points, then coming up with a clever way of building our outcome function, and finally using the Diamond Lemma to show that it is indeed an outcome function.

Finally, nirvana-free to causal to nirvana-free is instant by Lemma 22.

Proof: Referring back to the conditions for a hypothesis on policy stubs, we’ll show that they’re fulfilled when you slice away the Nirvana, and that pseudocausality can be derived from causality if we’re just dealing with a-measures and not a-surmeasures.

Stub Nirvana-free Nonemptiness was a property already possessed by the causal hypothesis, so it’s preserved when we clip away the Nirvana. Stub closure and convexity also hold because we’re intersecting with the closed convex set (of nirvana-free a-measures). Nirvana-free upper completion also holds. Bounded minimals holds because a minimal point in the Nirvana-free part must also be a minimal point in the original set, because adding anything to a nirvana-containing a-measure makes a nirvana-containing a-measure, so there can be no nirvana below our minimal point in the nirvana-free part, so it’s minimal in general. Normalization holds because the expectation values only depend on the nirvana-free part. Nirvana-free stuff projects down to nirvana-free stuff, getting stub-consistency. The stub extreme point condition carries over due to the preexisting intersection with nirvana-free used to define it, and the same applies to uniform continuity. This wraps up surcausal-to-acausal, and just leaves deriving pseudocausality from causality for causal-to-pseudocausal.

Let’s say we have a nirvana-free , where the measure part of is supported over , and we want to show that it’s also present in . Then is present in , because it’s supported entirely over histories that both the different policies produce, so it’s supported over histories that the intersection of the policies produces, just project it down to the inf. Now, by causality, we can find something in that projects down onto , which must be itself because the measure part of is supported entirely on histories in . This gets pseudocausality.

Now, let’s show that and fulfill the defining conditions for finitary causal.

1: Stub Nirvana-Free Nonemptiness: This one is trivial, because is present as a subset via the identity injection , and is nirvana-free.

2,3: Stub Closure/​Convexity: We took a closed convex hull, these are tautological.

4: Stub Nirvana-Free Upper-Completeness: Just apply Lemma 22 to get that the nirvana-free part of (and same with surcausal) is just the original , which is nirvana-free and upper-complete by stub nirvana-free upper-completeness, so we’re good there.

Condition 5: Stub Bounded Minimals:

By stub bounded minimals on the we have a bound on for minimal points in for all stubs. Pick a (or the surcausal analogue) with . There’s a sequence of points limiting to that lie in the convex hull. All these (or ) can be written as where . Decompose into a minimal point and something else, getting .

Then, do a further rewrite as

Note that the value of the sum of the first two terms is bounded above by , because obeys that bound, and for the second term, it deletes exactly as much from the term as it adds to the term. Also, since is an a-measure, adding in just the negative component to doesn’t make it go negative anywhere, so the sum of the first two terms is an a-measure, and by nirvana-free upper completeness, it lies in . The third component of the sum is also an a-measure.

By linearity of or , injecting up the first two terms and the last term, and adding them afterwards, is the same as injecting up the bulk of them (we can only inject up a-measures). Let’s abbreviate as and abbreviate as Now, we can rewrite as:

The first component is in and has the bound (injection up preserves and ) and lies below the bound, and mixing stuff below the bound produces a point below the bound. Abbreviate the first component as .

So, isn’t minimal, it lies above . Because the have a bound, and there’s only finitely many places where nirvana could be, we can extract a convergent subsequence, limiting to some which obeys the bound, and by Lemma 16, lies below .

Therefore, isn’t minimal, and it was an arbitrary point that violated the bound, so all minimal points in any obey the same bound, and we get bounded minimals.

6: Stub Normalization. By Lemma 22, we didn’t introduce any new nirvana-free points, so stub normalization of carries over.

Condition 7: Weak Consistency.

This is “projecting down makes a subset”. All the following arguments work with sur-stuff. Fix some . It’s a limit of points in the convex hull. We can decompose the as . Then,

Which, by the Diamond Lemma, can be rewritten as:

The projections of the lie in by weak consistency for . So, actually, the projection of down to can be written as a mix of injections up from stubs below , so the projection of lies in . Then, just use continuity of projection, and closure, to get itself projecting down into , so we’re good on weak consistency.

8: Stub Extreme Point Condition: By Lemma 22, we didn’t introduce any new nirvana-free points, so any nirvana-free extreme minimal point was present (and nirvana-free extreme minimal) in already, so the extreme point condition carries over from there.

9: Stub Hausdorff Continuity: By Lemma 22, we didn’t introduce any new nirvana-free points, and the preimages for Hausdorff continuity are of the nirvana-free parts, so this is completely unaffected and carries over.

Condition C: Causality: As a warmup to this result, we’ll show that if , then

Pick a . It’s a limit of which are finite mixtures of injections of stuff from below, and can be written as Then,

And then, by commutativity of injections, the injection of rewrites as All the are below so they’re also under , witnessing that the injection of lies in . Then, just appeal to closure and continuity of or to get that injects up into Again, all this stuff works for the sur-situation as well.

With this out of the way, fix some . Let’s try to make an outcome function from this, shall we? Let’s do

Yup, that does indeed specify one point for everything. It obviously spits out when you feed in because both the injection and projection turn into identity. Further, by weak-consistency, the projection of down lies in , and by our freshly-proved result, injecting up lands you in .

So, all that’s left is showing that it’s an outcome function! That, for any two and where , that Let’s begin.

And then, by the Diamond Lemma, this equals

And then, because . Rewriting a bit, and grouping the two projections together because they commute, we have:

And we’re finally done with everything, we showed causality.

Lemma 24: Given a (maybe just defined on stubs or full policies) that fulfills all hypothesis conditions except normalization, if it’s normalizable, then all belief-function conditions are preserved (works in the sur-case too)

Nirvana-free nonemptiness, closure, convexity, nirvana-free upper-completion, and bounded-minimals are all obviously preserved by scale-and-shift (see Proposition 7 in Section 1 of the proofs). For consistency, due to projections preserving and , the scale-and-shift in the stubs (or full policies) is perfectly reflected in whichever partial policy you’re evaluating, so consistency holds too. For the extreme point condition, any nirvana-free minimal extreme point post-renormalization is also a nirvana-free minimal extreme point pre-renormalization, so we can undo the renormalization, get a point in the nirvana-free component of a full policy that projects down accordingly, and scale-shift that point to get something that projects down to the scale-shifted extreme point. For Hausdorff-continuity, the scaling just scales the distance between two sets by the scale term, so Hausdorff-continuity carries over. Pseudocausality is preserved by normalization (un-normalize, transfer over to the partial policy of interest, then normalize back again), and so is causality (un-normalize the point and outcome function, complete it, then normalize your batch of points back again).
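The Hausdorff-continuity step in this argument (scaling multiplies the distance between two sets by the scale term, and shifting leaves it alone) can be sanity-checked numerically. This is only a minimal sketch under loud assumptions: toy a-measures are represented as pairs `(m, b)`, the Euclidean metric stands in for the actual metric on a-measures, and the helper names `hausdorff` and `scale_and_shift` are hypothetical, not from the post.

```python
import math

def hausdorff(xs, ys):
    """Hausdorff distance between two finite point sets (Euclidean metric, an assumption)."""
    def directed(ps, qs):
        # Largest distance from a point of ps to its nearest neighbour in qs.
        return max(min(math.dist(p, q) for q in qs) for p in ps)
    return max(directed(xs, ys), directed(ys, xs))

def scale_and_shift(points, shift, scale):
    """Toy renormalization: subtract `shift` from the b term, then scale both coordinates."""
    return [(scale * m, scale * (b - shift)) for (m, b) in points]

# Two toy "sets of a-measures", each point being (measure mass, b term).
set_a = [(1.0, 0.5), (0.8, 0.9), (1.2, 0.4)]
set_b = [(1.1, 0.5), (0.7, 1.0)]
shift, scale = 0.4, 2.5  # hypothetical constants, scale > 0

before = hausdorff(set_a, set_b)
after = hausdorff(scale_and_shift(set_a, shift, scale),
                  scale_and_shift(set_b, shift, scale))

# The distance after renormalizing is the old distance times the scale term.
print(before, after, scale * before)
```

The same check passes for any positive `scale`, which is the only feature of the renormalization this part of the lemma relies on.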

Proposition 2: Given a nirvana-free , the minimal constraints we must check of to turn it into an acausal hypothesis are: Nonemptiness, Restricted Minimals, Hausdorff-Continuity, and non-failure of renormalization. Every other constraint to make a hypothesis can be freely introduced.

Ok, we have our , and we want to produce an acausal hypothesis. The obvious way to do it is: We’ll use to refer to the set before renormalization.

Proof sketch: We basically just run through the infinitary hypothesis conditions, and show that they’re fulfilled by , and then appeal to Lemma 24 that we didn’t destroy our hard work when we normalize. As for the hypothesis conditions themselves, they’re all pretty simple to show except for bounded-minimals and Hausdorff-continuity, which is where the bulk of the work is.

1: Nirvana-free nonemptiness. Trivial, because all the are nonempty.

2: Closure: Appeal to Lemma 2, not in this document, but of section 1 in basic inframeasure theory, that the upper completion of a closed set is closed. Then we just intersect with the cone of a-measures (closed) to get our set of interest, so it’s closed.

3: Convexity: If you have and which decompose into and ( and lie in the upper completion and / lie in the closed convex hull), then

The first component lies in the closed convex hull because it’s a mix of two points from the closed convex hull, the second component is an sa-measure, and by our upper completion, then lies in our

4: Nirvana-free Upper-completeness: Trivial, we took the upper completion.

Condition 5: Bounded Minimals:

This can be shown by demonstrating that, if is our bound for (ie, every point in , regardless of , either respects the bound or lies strictly above a point in that respects the bound), then every point in lies above a point that obeys the bound.

Take a point . We don’t have to worry about points in the upper completion that weren’t part of the original closed convex hull, because they’re above something in the closed convex hull, so we just have to show that everything in the closed convex hull lies above something that respects the bounds.

can be written as a limit of points , which split into a mixture of finitely many . We can then split the into , where respects the appropriate bounds (everything in either obeys the bounds or is above a point which obeys the bounds). Now, we can rewrite as:

That first sum term is a mixture of stuff that respects the bound, so it respects the same property and lies below by the addition of the second term making . All these lower points lie in the closed convex hull, and obey the bound, so there’s a convergent subsequence that limits to some limit point that’s also in the closed convex hull, respects the bound, and by Lemma 16, is below .

So, any , regardless of , which violates the bound on minimal points, has a point lower than it which does respect the bound, showing Minimal Boundedness.

We normalize at the end, and need uniform Hausdorff Continuity to show nirvana-free consistency, so let’s skip to that one, which is hard.

Condition 8: Uniform Hausdorff Continuity:

We’ll be working with the Lemma 15 variant of Hausdorff-continuity, that given any , there’s a where two policies being apart or less guarantees that if is in , then there’s a point in that’s only away, where is the value associated with , and establish that variant for .

Fix an . How close do two policies have to be to guarantee that for any , there’s a point in within ? Well, for our original Hausdorff-continuity condition, pick a that forces a “distance”, and .

Since we’ve got closure and bounded-minimals, write as where respects the bound, and it lies in the closed convex hull and is a limit of points, which decompose into a mixture of finitely many points.

Now, each of these points in , by Hausdorff-continuity of , have a point in , that’s only away, by and being or less apart.

We can mix the in the same way as usual to make a that’s only or less away from

Because the sequence converges, there’s some bound on the and values, and the (at most) change to make still keeps the and values of our new sequence bounded, so by the Compactness Lemma, the sequence has a convergent subsequence, with a limit point , that lies in by closure. Also, for all n,

The two distances on either side limit to 0, and the middle distance limits to or less, because eventually the value of gets really close to the value of , which is subject to the constraint that it can’t be bigger than due to being picked to have its value below , so

Ok, so those two things are pretty close to each other. But what we really want is to find a point in that’s close to , ie, . We can invoke the proof path from direction 2 of Lemma 15 (we have enough tools to do it, most notably upper completion) to craft a where

Further, . So, we get and we’re done.
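The three-distance estimate used in this condition is an instance of the triangle inequality. As a hedged sketch, with placeholder names chosen only for this illustration ($M$ for the original limit point, $M_n$ for its approximating sequence, $M'_n$ for the nearby sequence in the other policy's set, and $M'$ for the limit of that subsequence):

```latex
d(M, M') \;\le\; d(M, M_n) \;+\; d(M_n, M'_n) \;+\; d(M'_n, M')
```

As $n$ grows along the convergent subsequence, the two outer terms go to 0, so the bound on $d(M, M')$ is whatever the middle term is bounded by in the limit.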

7: Nirvana-free Consistency: We’re working in a nirvana-free setting, so we can simplify things. Our formulation that we’re going to show is, regardless of stub , is closed. Just invoke Lemma 20 and we have it and that’s the last one we needed besides renormalization. Now all we have to do is to show that every property is preserved when we do the necessary rescaling. Invoke Lemma 24.

Proposition 3: Given some arbitrary which can be turned into an acausal hypothesis, turning it into has for all and .

The steps to make your full are convex hull, closure, upper completion, and renormalization. For convex hull, because induces a positive functional, which is linear, convex hull doesn’t affect the worst-case value (mixing a-measures mixes the score you get w.r.t the function), closure just swaps inf out for min, and upper completion doesn’t add any new minimal points so it preserves the same minimal values for everything. Let’s use for the upper completion of the closed convex hull (no renormalization) so, unpacking definitions, for all :

Continuing onwards, let’s use for our shift constant and for our rescaling constant.

So, regardless of your and ,

and we’re done. In particular, since this scale and shift is completely uniform across everything, it keeps the set of optimal policies unchanged.
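Two facts carry this proof: taking a convex hull does not change the worst-case value of a linear score, and a uniform positive scale-and-shift does not change which policy is optimal. The sketch below illustrates both on toy data. Everything in it is an assumption made for illustration: a-measures are modelled as `(weights, b)` pairs on a three-point space, `score` is a hypothetical stand-in for the expectation functional, and the policy names and constants are made up.

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible

def score(point, f):
    """Linear score of a toy a-measure (weights, b) against f: sum_i m_i * f_i + b."""
    weights, b = point
    return sum(m * fi for m, fi in zip(weights, f)) + b

def mix(points, probs):
    """Convex combination of toy a-measures."""
    dim = len(points[0][0])
    weights = [sum(p * pt[0][i] for p, pt in zip(probs, points)) for i in range(dim)]
    b = sum(p * pt[1] for p, pt in zip(probs, points))
    return (weights, b)

def random_probs(n):
    """A random probability vector of length n."""
    raw = [random.random() + 1e-9 for _ in range(n)]
    total = sum(raw)
    return [r / total for r in raw]

# Part 1: worst case over (samples from) the convex hull equals worst case over generators.
generators = [([random.random() for _ in range(3)], random.random()) for _ in range(5)]
f = [0.2, 0.9, 0.5]
worst_generator = min(score(p, f) for p in generators)
hull_samples = generators + [mix(generators, random_probs(5)) for _ in range(1000)]
worst_hull = min(score(p, f) for p in hull_samples)

# Part 2: a uniform shift followed by a positive scale keeps the optimal policy.
values = {"pi_1": 0.3, "pi_2": 0.7, "pi_3": 0.55}   # hypothetical worst-case values
shift, scale = 0.1, 1.25                            # hypothetical constants, scale > 0
rescaled = {pi: scale * (v - shift) for pi, v in values.items()}
best_before = max(values, key=values.get)
best_after = max(rescaled, key=rescaled.get)

print(worst_generator, worst_hull, best_before, best_after)
```

Part 1 only samples the hull rather than enumerating it, so it is evidence for the linearity argument and not a proof of it.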

Proposition 4: For all hypotheses and ,

We can use that Murphy never picks something with Nirvana in it, and to rewrite our desired property as:

One direction of this is pretty easy: if the belief functions are identical when you slice off the nirvana, then regardless of and , Murphy forces the same value. The other direction of this can be done by Theorem 3 from Section 1. Fixing a , the property implies , but this holds for all , so we get

Proposition 5: For all hypotheses , and all continuous functions from policies to functions , then exists and is closed.

Proof sketch: We’ll prove this in four phases, where is some arbitrary sequence of policies limiting to the policy .

Our first phase will be establishing that

Our second phase will be establishing that

Our third phase will be establishing that

Putting phase 2 and 3 together, and then, in conjunction with phase 1, This establishes that the function is continuous. Phase 4 is then deriving our desired result from the continuity of that function.

Begin phase 1. To begin with, because is a function from a compact metric space to a metric space, by the Heine-Cantor theorem, it’s uniformly continuous. So, there’s some difference between policies that guarantees an difference between the functions produced, and our distance metric on functions in this case is , the distance metric associated with uniform convergence.
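Spelled out, the uniform continuity that Heine-Cantor gives here has the following shape. This is a sketch with assumed notation: $d$ is the distance on policies, $f_\pi$ is the function assigned to policy $\pi$, and $h$ ranges over the domain of those functions.

```latex
\forall \varepsilon > 0 \;\; \exists \delta > 0 \;\; \forall \pi, \pi' : \quad
d(\pi, \pi') < \delta \;\implies\; \sup_{h} \left| f_{\pi}(h) - f_{\pi'}(h) \right| < \varepsilon
```

The key point is that a single $\delta$ works for every pair of policies at once, which is what lets the later bound be stated independently of which policy is being approached.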

By Proposition 3 (Section 1), every positive functional (and, by Proposition 1 (Section 1), continuous functions induce positive functionals) is minimized within the set of minimal points, so we can fix an and within (specifically, the nirvana-free component) which minimize the positive functionals associated with and , respectively. Being able to get an actual minimizing point follows from minimal-boundedness: the set of minimal points is bounded, so its closure is compact, and a continuous function from a compact set to ℝ has an actual minimizing point, so we can pick such a point and then step down to a minimal point if needed.

Note that, by the way we picked these, and also

First, we’ll bound the following two terms. and The same arguments work for both, so we’ll just show one of them.

The argument for this is that the first inequality holds because is a measure (never negative), so an upper-bound on the absolute value of the expectation is given by the expectation of the absolute value of the distance between the two functions. For the second inequality, if n is large enough to make and be only apart, then and are only apart, so that absolute value is upper bounded by , getting us an upper bound of Then, because the total amount of measure for minimal points is upper bounded by some regardless of which policy we picked (minimal-point boundedness for ), we can finally impose an upper bound of on the distance. The same sort of argument works for the second term, and gets us the exact same upper bound.
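The chain of bounds just described (difference of expectations, then expectation of the absolute difference, then sup-distance times total mass) can be checked on a small finite example. This is a minimal sketch under assumptions: the measure is a nonnegative vector on four points, and `f`, `g`, and `expectation` are hypothetical stand-ins for the two functions and the expectation operator.

```python
def expectation(measure, f):
    """Expectation of f under a finite nonnegative measure."""
    return sum(m * fi for m, fi in zip(measure, f))

measure = [0.3, 0.0, 0.5, 0.1]   # nonnegative, total mass 0.9
f = [0.9, 0.2, 0.4, 0.7]
g = [0.85, 0.3, 0.5, 0.6]

# |m(f) - m(g)|
gap = abs(expectation(measure, f) - expectation(measure, g))
# m(|f - g|): valid upper bound because the measure is never negative
abs_diff_expectation = expectation(measure, [abs(a - b) for a, b in zip(f, g)])
# sup-distance between the functions, times the total mass of the measure
sup_distance = max(abs(a - b) for a, b in zip(f, g))
total_mass = sum(measure)
mass_bound = sup_distance * total_mass

print(gap, abs_diff_expectation, mass_bound)
```

If the measure were allowed to take negative values, the first inequality could fail, which is why the argument leans on the measure part being an actual measure.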

Further, and . This is because is specialized to minimize and is specialized to minimize . Therefore, in one direction:

so

In the other direction,

so

Thus, putting the two parts together,

We can make n go to infinity, which makes (distance between policies) go to 0, which makes (distance between functions) go to 0, and is a constant, so we get that the distance between the two expectations limits to 0 and we’re done with the first phase.

Time for phase 2, showing that

Fix some in that minimizes the positive functional associated with . By Hausdorff-Continuity, we can find a sequence of points that limit to . By continuity of , this means that limits to , which is . However,

Thus,

Now for phase 3, showing that

Assume it’s false; we’ll derive a contradiction. That is,

Then, we can get some subsequence where the expectations converge to the liminf. For each n in that subsequence, fix a that minimizes the positive functional associated with within . Ie,

By bounded minimals, there’s some bound on all of these, so we can isolate another convergent subsequence (the expectation values still limit to the liminf), where the limit to some . For the following arguments, we’ll use n to denote numbers from our original sequence (ranges over all natural numbers) and j to denote numbers from our convergent subsequence of interest (where the expectations converge to liminf and our sequence of minimizing points converges to a limit point).

First, this limit point lies in , because it’s arbitrarily close to points that are arbitrarily close to (Hausdorff-continuity), so the distance to that set shrinks to 0, and is closed, so the limit point must lie in it. Now, we can go

By , and the j’s making a subsequence where we attain the liminf value in the limit, and then the second equality is a convergent sequence of a-measures having their expectation value limit to the expectation value of the limit point.

But then we get something impossible. , and yet somehow (by our original assumption that the liminf undershot the expectation value of in ),

Which cannot be. This shows that the liminf is .

Now for phase 4. Again, from the proof sketch, phases 1, 2, and 3 establish that is a continuous function Let’s abbreviate this function as . Since we’re mapping (which is compact) through a continuous function, the image is compact. Thus, it has a maximum value, which is attained by some policy. Take that maximum value (it’s a single point so it’s closed), take the preimage (which is a nonempty closed set of policies), and that’s your set. And unpacks as Thus showing our result.
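The phase 4 argument can be summarized in one line. This is a sketch with assumed notation: $g$ is a placeholder name for the continuous function from policies to expectation values, and $\Pi$ for the compact space of policies.

```latex
M := \max_{\pi \in \Pi} g(\pi) \ \text{exists since } g(\Pi) \subseteq \mathbb{R} \text{ is compact}, \qquad
g^{-1}(\{M\}) = \{\pi \in \Pi : g(\pi) = M\} \ \text{is nonempty and closed.}
```

Nonemptiness comes from the maximum being attained, and closedness from $\{M\}$ being closed and $g$ being continuous.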

A quick corollary of it is, if just returns the constant 1 function, you can find a policy where is 1, by normalization, so we can use in normalization instead of .

Lemma 25: In the nirvana-free setting, with all the being nonempty and upper complete, then is upper-complete.

The proof of this is nearly identical to the proof of Lemma 12. Except in this case, our aren’t finitely many points selected from a nonconvex , they’re countably many points selected from the various . Apart from that difference, the proof path works as it usually does.

Lemma 26: A belief function fulfilling all conditions except normalization, which is renormalized by subtracting and scaling by , only has the renormalization fail if, for all , has a single minimal point of the form , with the same for all . (works in the sur-case too)

First, fixing an arbitrary , then, if there’s a divide-by-zero when scaling,

However, always, so the two terms are equal.

Now, just invoke Proposition 6 from Section 1, which says that if , then there’s a single minimal point of the form , and the is , which is the same for all policies . The converse is, if all are of this form, then renormalization fails.

Let’s define “nontriviality” for a belief function . A is nontrivial if there exists some where

In other words, there’s some policy you can feed in where the minimal points of aren’t just a single point. This is a very weak condition. Also, for the upcoming Proposition 6, mixing just doesn’t interact well with nirvana-free consistency, so we have to do it just for pseudocausal and acausal hypotheses.

Proposition 6: For pseudocausal and acausal hypotheses where and there exists a nontrivial , then mixing them and renormalizing produces a pseudocausal or acausal hypothesis.

Mixing is defined on the infinite levels by , and is then extended down to the finite levels by the usual process of projecting down and taking the closed convex hull. Then, we can renormalize if we wish. We’ll distinguish these by “raw” or “renormalized” mix. Thus, the only conditions we need to check are the infinite conditions. For everything else, we can just go “the infinite conditions work, so we can extend to a full belief function” by the Isomorphism theorem.

If you want to know what mixing does at finite levels, it’s So it isn’t “mix all the finite levels”, it’s “mix all the projections individually and then take convex hull”.

Proof sketch: Neglecting normalization (because Lemma 24 shows we can just renormalize and all nice conditions are preserved), we just need to verify all the relevant infinitary conditions, and then we can extend to lower levels by isomorphism, and get our result. We also need to show that nontriviality implies that the renormalization doesn’t fail, but that’s easy. As for the conditions, our lemmas let us get most of them with little trouble. Bounded-minimals from just the bound is slightly more difficult and relies on showing that is in all the sets regardless of i and by normalization to eliminate the term, and Hausdorff-continuity is also fairly nontrivial (we have to split our mix into three pieces and bound each one individually via a different argument) and relies on the same is in all the result. For causality, we’ll knock it out with Tychonoff in a slightly more complicated way than usual so we just use a countable product and don’t have to invoke the full Axiom of Choice.

We’ll take a detour and show that for all , we need this in a few places. First, pick an arbitrary i and and look at . Find a minimal point that minimizes . Now, consider . This is . However...

(by minimizing , and normalization, respectively)

So, by nirvana-free upper-completion for , we get the point for all i and . Then, mixing these gets that lies in .

1: Infinitary Nirvana-Free Nonemptiness: This is easy, since all our contain the a-measure , which is nirvana-free.

2,3: Closure, convexity: Closure follows from the proof of closure in Proposition 11 of section 1, and we mixed convex sets so the mixture is convex.

4: Nirvana-free Upper-Completion: Just invoke Lemma 25.

5: Bounded minimals: Since for all i, any a-measure with isn’t minimal (add whatever you want to ), so we have a bound on the values of minimals. What about a bound on the values?

Well, just take a in with , and split it into a mixture of points from the . Then, by bounded minimality for the , we can take each and find a minimal point below it that fulfills the bound on the values. Mixing those minimals produces a point that’s below , and has a value below , which, by assumption, is finite. So, every minimal point in any has a value below and we have bounded-minimals.

Nirvana-free consistency is something we’ll have to loop back to after Hausdorff-continuity.

Condition 8: Hausdorff-continuity:

Pick an arbitrary . It shatters into . We’ll be showing the Lemma 15 variant, which is that for all , there’s a where if , then there’s a point in that’s away.

First, we’ll shuffle around what our are supposed to be, we need a certain decomposition to make it work. Reindex your probability distribution so the highest-probability thing is assigned to . All the can be decomposed as a . Now, let our new for be defined as: ,

and our for is defined as:

These new still lie in (nirvana-free upper-completion), and they’re all a-measures (the negative part isn’t enough to cancel out the positive measure on otherwise our old wouldn’t be an a-measure). Further, mixing them together still makes , and if , then (because we start off at a minimal point with , and then add something to it that saps some measure from it).

Fixing some … well, , so find a j where . For … well, using the Lemma 15 variant of Hausdorff-continuity, note that fixing a gets you a different value for Hausdorff-continuity of each . We only have to worry about the , though, and there’s finitely many. So, pick a where the induced is below , and for all , .

So… we take our in , and go to nearby in . We should break down exactly how this is done. For , the value relative to is at most (in the degenerate case where it contributes all the measure to the mixture ), so, by Hausdorff-continuity for , the gap between and is at most because we picked low enough to get that scale factor on the front.

For where , the gap between and is at most Because we picked low enough to guarantee that scale factor on the front, and is made by adding a minimal point and an sa-measure where the measure component is entirely negative, so is a bound on the value for .

And finally, for , we can specially craft a where the gap between and is at most . This is because has measure below due to being a minimal point that lost some measure. So, we can expend effort to completely reshuffle it however we wish, and then add to our reshuffled a-measure to make an a-measure that lies above , which must be in , so our reshuffled a-measure plus lies in by nirvana-free upper-completion, and we only had to spend effort to travel to it (first term is the reshuffling, second term is adding 1 to the term)

Now, let’s analyze the distance between and the point which lies in . equals...

And bam, we’ve got Hausdorff-continuity!

7: Nirvana-free consistency: Invoke Lemma 20, notice that we’re in the nirvana-free setting, and we’re done.

Pseudocausality. Fix some , whose support is a subset of . shatters into . All of them have their support being a subset of (otherwise there’d be measure-mass outside of there), so all the transfer over to by pseudocausality for the , and then we can mix them back together to get .

And we’re done, mixing works just fine, as long as we can show that the renormalization preserves everything and the renormalization doesn’t fail. Renormalization fails iff (By Lemma 26), for all , then

We can then go and similar for 0, (by the definition of the mix of belief functions and Proposition 10 in Section 1) and then we can use that to go

So, then, for all and i, However, this is incompatible with the existence of a nontrivial , because nontriviality just says that there’s a where .

So, nontriviality for some means that the renormalization of your mix can be done. It’s a very weak condition, just saying “there’s some possibility of starting with a hypothesis () which has a policy , where Murphy actually has to pay attention to what function you’re maximizing”.

Now, we just need to show that renormalization preserves all nice conditions. Just invoke isomorphism to complete to a full pseudo/acausal belief function lacking normalization, apply Lemma 24 and renormalize everything, and we’re done.

Proposition 7: For pseudocausal and acausal hypotheses,

Proof: by Proposition 10 of Section 1.

Proposition 8: For pseudocausal and acausal hypotheses,

This is an easy one. If , then there’s a preimage point , which then decomposes into . These project down to , which then mix to make (project then mix equals mix then project because of linearity) witnessing that

Now for the reverse direction. If , then it shatters into , which have preimage points . The mix to make a , which projects down to , by project-mix equaling mix-project, witnessing that .
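Both directions of this proof rest on "project then mix equals mix then project", which is just linearity of projection. Here is a minimal sketch of that identity on toy data. The assumptions are loud: measures are finite dictionaries from histories (tuples) to mass, projection is modelled as pushing mass onto history prefixes, the b term of an a-measure is omitted, and the names `project`, `mix`, and `same_measure` are hypothetical.

```python
import math

def project(measure, prefix_len):
    """Push a measure on histories forward onto their length-`prefix_len` prefixes."""
    out = {}
    for history, mass in measure.items():
        key = history[:prefix_len]
        out[key] = out.get(key, 0.0) + mass
    return out

def mix(measures, probs):
    """Probability-weighted mixture of finite measures."""
    out = {}
    for measure, p in zip(measures, probs):
        for history, mass in measure.items():
            out[history] = out.get(history, 0.0) + p * mass
    return out

def same_measure(a, b):
    """Equality of finite measures up to floating-point tolerance."""
    return a.keys() == b.keys() and all(
        math.isclose(a[k], b[k], abs_tol=1e-12) for k in a
    )

m1 = {("a", "x"): 0.5, ("a", "y"): 0.25, ("b", "x"): 0.25}
m2 = {("a", "x"): 0.1, ("b", "x"): 0.3, ("b", "y"): 0.6}
probs = [0.25, 0.75]

project_then_mix = mix([project(m, 1) for m in (m1, m2)], probs)
mix_then_project = project(mix([m1, m2], probs), 1)

print(project_then_mix)
print(mix_then_project)
```

The proposition in the text involves countable mixtures of sets of a-measures, so this finite check only illustrates the pointwise identity that the set-level argument applies to each decomposed point.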

Next proof post!
