Proofs Theorem 4
post by Diffractor · 2021-12-03T18:48:07.952Z · LW · GW · 0 commentsContents
No comments
Theorem 4: Pseudocausal IPOMDP's: If an infra-POMDP has a transition kernel which fulfills the niceness conditions, and there's a starting infradistribution , unrolling it into an infrakernel via has the infrakernel being pseudocausal and fulfilling all belief function conditions except for normalization.
This takes a whole lot of work to show. The proof breaks down into two phases. The first is showing that the fulfill the niceness conditions (which splits into four subphases of showing 1-Lipschitzness, lower-semicontinuity, compact-shared CAS, and moving constants up/mapping 1 to 1). This is the easy part to do. The second phase is much longer. It splits into five pieces, where we show our list of conditions for . Our list of conditions to show is:
1:The have a uniform upper bound on their Lipschitz constants.
2: is lower-semicontinuous for all
3: is supported entirely on histories compatible with .
4: All the agree on the value of 1 (or infinity).
5: is pseudocausal.
Parts 1 and 4 can be done with little trouble, but parts 2, 3, and 5 split into additional pieces. To show pseudocausality, ie, condition 5, we have to start with our proof goal and keep rewriting it into an easier form, until we get something we can knock out with an induction proof, which splits into a base case and an induction step. To show sensible supports, ie, part 3, we have to split it into three parts, there's one induction proof for the finite , showing sensible supports for , and then finally extending this to the result we actually want, which is fairly easy.
And then there's lower-semicontinuity, part 2, which is an absolute nightmare. This splits into three parts. One is explicitly constructing a sequence of compact sets to use in the Almost-Monotone Lemma for any policy whatsoever, and then there's trying to show that is lower-semimicontinuous which requires a massive amount of messing around with integrals and limits, and then it's pretty easy to show lower-semicontinuity for from there.
T4.1 Our first order of business is showing the five niceness conditions for all the infrakernels, so we can take the infinite semidirect product and show that inherits the five niceness conditions (from the proof of Theorem 1).
Remember that is defined as
Obviously it produces inframeasures, since does.
T4.1.1 For lower-semicontinuity in the input, we have
and then, as limits to , we have forever after some finite , because is a finite space. So, it turns into
and then, by lower-semicontinuity for K (one of the niceness conditions)
and then we pack this back up as
and lower-semicontinuity has been shown.
T4.1.2 For 1-Lipschitzness, observe that is 1-Lipschitz because is 1-Lipschitz, and this is the semidirect product of a 1-Lipschitz (all probability distributions are 1-Lipschitz, and this is a dirac-delta distribution) infradistribution with a 1-Lipschitz infrakernel.
T4.1.3 For compact-shared compact-almost-support, observe that is a continuous function, so any compact subset of (input to ) maps to a compact subset of (corresponding inputs for ). Then, we can apply compact-shared compact-almost-support for (one of the niceness conditions) to get a compact almost-support on , and take the product of that subset with all of itself, to get a shared-compact almost-support for .
T4.1.4 For mapping constants to above constants, we have
By increasing constants for (one of the niceness conditions). This same proof works with equality when , because of the "map 1 to 1" condition on the type signature. So all the have the niceness properties, and we can form the infinite semidirect product at all, and the infinite semidirect product inherits these niceness conditions.
T4.2 Now we can get working on showing the belief function conditions.
T4.2.1 For showing a bounded Lipschitz constant, we have
and then since is an infradistribution, it must have a finite Lipschitz constant, . So we then get
and then, since all the are 1-Lipschitz (infinite semidirect product has niceness conditions), we get
and we're done, we showed a Lipschitz bound on the that is uniform in .
T4.2.2 Now for lower-semicontinuity for , which is the extremely hard one.
The way this proof will work is that we'll first show a variant of the Almost-Monotone Lemma that works for all the infrakernels simultaneously. Then, using that, we'll show that
is lower-semicontinuous. And then, it's pretty easy to wrap up and get that
is lower-semicontinuous, from that.
T4.2.2.1 Anyways, our first goal is a variant of the Almost-Monotone Lemma. The particular variant we're searching for, specialized to our particular case, is:
We let our be an arbitrary compact subset of states, and be arbitrary.
Now, let's assume there's a sequence of compact sets C where, regardless of n or , is a shared -almost-support for all the inframeasures (where ).
Then our proof goal becomes
The proof of the Almost-Monotone Lemma goes through perfectly fine, the key part where our choice of compact sets matters is in the ability to make the assumption that is an -almost-support for all the with , as our is. And we assumed our sequence of compact sets fulfilled that property.
So, where we're at is that our variant of the Almost-Monotone-Lemma works, as long as our assumption works out, that there's a sequence of compact sets where, regardless of n or or , is a -almost-support for .
In order to get our variant of the AML to go through, we really need to prove that there is such a sequence, for any and . For this, we'll be looking at a function
What this does is that is defined to return a compact subset of which is an -almost-support for all the inframeasures, when . There are always compact subsets like this for any input because of shared almost-compact-support for the infrakernel (one of the assumed niceness conditions), so there does indeed exist a function with these properties.
Then, we recursively define the as:
and
All these sets are compact, by induction. For the first one, since always returns compact sets, we're taking a finite union of compact sets (is compact), and taking the product with a finite set (is compact). For induction up, since the product of compact sets is compact, is compact, and then projections of compact sets are compact, so then returns a compact set, and again, we take a finite union of compact sets and then the product with a compact set to make another compact set.
Now, let's get started on showing that C^{\top}_{1:n+1}[C^{S},\eps] is an \eps\left(1-\frac{1}{2^{n+1}}\right)-almost-support for K^{\pi}_{:n}(s_0), for arbitrary \pi and s_0\in C^{S}. This proof will proceed by induction.
For the base-case, we want that is a -almost-support for with . Well, assume are equal on , and .
Now, is a -almost-support for because , so we can apply Lemma 2 from LBIT, and 1-Lipschitzness of to split this up as
That second term can be upper bounded by , so we get an upper bound of
and then we observe that since on , and that set is defined as , we have that they agree on
So that term turns into 0, the two functions are equal on our set of interest, so our net upper bound is
Showing that is a -almost support for all the , where is arbitrary and s_0\in C^{S}.
Now for induction. We want to show that is a -almost-support for with , assuming that is a -almost-support for with .
To begin, assume are equal on and . Then we can go:
and then unpack to get
and unpack the semidirect product and substitute the dirac-delta value in to get
We apply a Lemma 2 from LBIT decomposition, with as an -almost support for , which works by induction assumption. Since is also 1-Lipschitz (the niceness conditions we already know), the Lemma 2 upper bound breaks down as
By 1-Lipschitzness of , we get an upper-bound of
And then this is upper-boundable by , so we get
Now, for that top term, we know that comes from . We apply another Lemma 2 decomposition on this time, with the compact set of interest being
which is a -almost-support for . Pairing this with 1-Lipschitzness of , we have an upper bound of
That second term is upper-bounded by no matter what, so we have
Pull out the constant and combine, to get
Now, were assumed to agree on , which factors as
and then expands as
which all our we're feeding in lie within. So those functions are identical for those inputs, and our upper-bound just reduces to
Which witnesses that is a -almost-support for any (arbitrary ) as long as . So our induction step goes off without a hitch, and we now know that the sequence is suitable to get our modification of the Almost-Monotone-Lemma to work out. We have the result
In particular, since any defining sequence for has an actual limit, this can be trivially reshuffled to attain the result:
And remember, the sequence of compact sets only depends on and , not on the policy. This works for any policy. Now that we have this result in the bag, we can move on to phase 2.
T4.2.2.2 For phase 2, we'll show that is lower-semicontinuous, and dear lord this is going to be hard and take a lot of integrals and limits to show. We start with
Which unpacks as
If this liminf is infinity, we're just done, infinity beats everything. So we'll be assuming that it settles down to a finite value. We can unpack the as a challenge of picking points in , so we get
And, since we can pick amongst minimal points, we can rewrite this as
In the case, the set of minimal points is precompact. In the case, it may not be precompact, but the only way it fails to be precompact is having b values running off to infinity. Said function can't be unbounded below, because it is above , and infinite semidirect products map constants to greater than their usual value, and fulfills this condition. Thus, if the b value of a selected minimal point is far higher than the finite liminf value plus , the value said minimal point assigns to the function of interest is far higher than the true minimal value. So, when minimizing, we can restrict to a compact subset of to attain the min, call it . With this, we get
And then we unpack this as an integral.
Now, we can pass to two subsequences. First, we can pass to a subsequence of the m that actually limits to the liminf value. On this subsequence, the policies still limit to . Also, for each member of this subsequence, we can find an true minimizing pair for it since we're minimizing the lower-semicontinuous function
over a compact set. Said function is lower-semicontinuous because
is lower-semicontinuous, it's one of the niceness properties that the infinite infrakernel inherits.
Since we're selecting a-measures from the compact set , we can find a subsequence of our subsequence where the a-measures converge. Now we'll be using i as our index, and using it to index the pairs. The limit a-measure is . So, we have a rewrite as
Further, all numbers in this limit are finite, because we're operating in the case where the liminf is finite, and i indexes a subsequence which actually limits to the liminf. For said limit, limits to , so we can pull that out of the limit, to get
At this point, since the limit to , the set of the measure components is compact in the space of measures on . In particular, this implies that there is some compact set where all the only have or less measure outside said set (a necessary condition for precompactness of a set of measures)
Now, we split the measures into the component of measure on , and the component outside of it.
Because is a compact subset of , we can use it as a seed set for our sequence that attains almost-monotonicity. To economize on space for the lines to fit, we'll write
as just
We unpack the defining limit of the infinite semidirect product with our relevant sequence of compact sets to get
We'll now be splitting up the second integral as two more integrals. Fix an arbitrary .
We're really in the weeds here, let's see if we can make any dent whatsoever. What we'll be showing is that, regardless of ,
It is exceptionally important to note here that we can't just throw our variant of the Almost-Monotone Lemma at this, because we don't have a promise that .
An easy way to show this is to show it for all . So let . Then, our proof goal is that
First, observe that the functions
and
can only differ on the same input by at most . Since the finite stages are all 1-Lipschitz or less, we have
And then, by monotonicity of the infrakernel stages for functions which cut off at a finite point (it was from Theorem 1), we have
And we're done, we have shown that, regardless of , we have
Now, applying this fact to our giant list of integrals, we can take
and apply our lower-bound to get
Also, since we picked so only has or less measure outside of it, regardless of i, we can impose a lower bound of:
Now, to break down the integral on the top. Since the behavior of the interior of that integral only depends on , we can apply our good old Almost-Monotone-Lemma variation, and lower-bound the limit by the finite stage, minus (independently of the choice of ) yielding
Now, we pull that constant piece out of the integral, and since both functions in the integral are now identical, we can stitch the measure components back together to get:
Now, is a fragment of , which came from a minimal point in or a limit of such, so the maximal amount of measure present would be . This lets us get a lower bound of
Doing a little bit of regrouping, we have
Recapping what we've done so far, we showed that
For all . Regardless of , this liminf value in i is finite because it's upper-bounded by the liminf in m, which is assumed to be finite (we disposed of the infinite case already). Because the sequence and the compact sets were fixed before we picked , we can take a limit of the without worrying about it affecting any of the stuff on the inside of the integral, to get
This liminf in i of the integral is finite for all n. At this point, we can notice something interesting. For fixed n, when we take the limit as i approaches infinity, eventually will become and never go back, because of convergence of the policies, and the fact that the infrakernel has a cutoff at time n, so the behavior of the policies before that time will stabilize on a partial policy. So, we can rewrite our lower bound as:
Again, for all n, this liminf in i quantity is finite. Now, the function
is lower-semicontinuous in (as all the finite stages of the infinite infrakernel have that niceness property). And is lower-semicontinuous when is. Also, by now, the only thing i is controlling is the measure term. Therefore, that liminf of the integral (in i) is greater than what you'd get if you just passed to . So, we get
Again, for all n, the integral is finite. Now, since the functions
converge pointwise in n to the function
(any sequence of compact sets works to define the infinite semidirect product) and for all n, the integrals are finite (this is why we kept restating it over and over!), we can apply Fatou's Lemma to move the liminf into the integral, to get
And since we have pointwise limiting, this is
and then, packing it up,
And then, pack it up as
and one last step. Since lies in , we can get
which reexpresses as
Our net result is
But because was arbitrary and and are finite, we have our desired net result of
And lower-semicontinuity is shown for
T4.2.2.3 Now for our phase 3, getting lower-semicontinuity of . Let limit to . We have
and then this unpacks as
Reexpressing the projection, we have:
And then, we can apply lower-semicontinuity of which we've shown in phase 2, to get
Which then packs up as
and then
and we have lower-semicontinuity for ! Finally!
T4.2.3 Time to return to easier fare. We need that is supported entirely on histories compatible with . To do this, we proceed in three steps.
First, we do an induction proof that, for any given and , is supported on the subset of where the actions and observations so far are compatible with .
Second, we use our first result to show that for any two continuous bounded utility functions which only depend on the action/observation sequence, and agree on all histories compatible with ,
Finally, we use this result to show that if agree on all histories compatible with , , showing that is supported on histories compatible with .
T4.2.3.1 Let's start with the induction proof that has the subset of where the history of actions and observations is compatible with as a support. The way to do that is to pick two functions that are identical on said subset, and show they have equal expectation values. For the base case of the induction proof, fix an and which are identical on (as that's the subset of which is compatible with ), and observe:
and then by how is defined, we have
Which, by unpacking the semidirect product and subsituting the dirac-delta value in, we have
and then since when the initial action equals , we have
and then this just wraps back up by our sequence of rewrite steps in reverse as
Now that we have the base case out of the way, we'll attempt to show the induction step. Let and be identical on the subset of which is compatible with , and we'll show they have equal expectation value. Our job is to unpack
this unpacks, by how sequences of semidirect products are defined, as
Now, since, by induction assumption, is supported on the subset of where the history is compatible with , we just need to show that the inner functions are identical on that subset to get that the difference is 0, so our proof goal is now
We now unpack as to get our new proof goal of:
We unpack the semidirect product and substitute the dirac-delta value in to get a proof goal of
at this point, we realize that if is compatible with , then regardless of and , is compatible with , so the functions
and
are equal, and we get our desired proof goal. So the induction goes through, and the subset of where the history is compatible with will always be a support of for all n.
T4.2.3.2 Now that's in place, we'll show that any two continuous bounded functions and which only depend on the action/observation sequence and are identical on histories compatible with will be assigned equal value by . Assume and only depend on action/observation sequences and are identical on histories compatible with . We have a desired proof target of
and, unpacking the limit, it turns into
The limits exist, so we'd be able to show that if we were able to show that
Now, we can use the Lemma 2 from LBIT decomposition on , since we know that the subset of where the history is compatible with is a support of , to get an upper bound on
of
For the first part, the 1 is there because all the infrakernels are 1-Lipschitz, and we're assessing distance of the inner functions on the support. For the second part, it's there because we don't care how different the two functions are outside the support, they must be assigned equal value. So anyways, this upper bound reduces to
Now, for all choices of n and , fix an arbitrary extension which is compatible with (which can always be done). Then use the triangle inequality twice to get an upper-bound of
We can observe that, since and are equal on all histories compatible with , and was picked specifically to have , that second absolute value term must always be 0, to get
and then distribute the sups and limsups into the addition to get
Since this quantity is an upper bound on the
quantity we wanted to show equals 0 to hit our proof target, we just need to show that it equals 0.
Here's what we do now. Since and are defined over , a compact space, they must both be uniformly continuous. You can take any , and find some where two histories that only differ by are assigned values or less apart. And for any , there's some n where any two histories that agree on the first n steps only differ by . So, in the limit, both of these terms become 0, and our result that
for any and which agree on all histories compatible with follows since we hit our proof target.
T4.2.3.3 Finally, it's time to apply this result to get that is supported only on histories compatible with . Fix any , and which are identical on histories compatible with . We can start unpacking.
And then, for all ,
as we've already shown, since and agree on histories compatible with . Thus the functions in are identical, and we have
and so we're done.
T4.2.4 The next belief function condition to check is agreement on max value. For the type signature, fixing any and
And for the type signature, any infinite semidirect product fulfilling the niceness conditions (which is the case) must map 1 to 1, so this turns into
And we're done, and in fact get the stronger result that regardless of because because we assumed was an infradistribution. For the type signature, fixing any and , we have
Regardless of type signature, an infinite semidirect product fulfilling the niceness conditions (which we're assuming) must map constants to an equal or greater constant, and infinity can be regarded as the supremum of putting constants in, so we get
Importantly, this does not mean that must be .
T4.2.5 Now for the very last condition, checking that is pseudocausal. This takes a whole lot of work (though less than lower-semicontinuity), and we'll do it by repeatedly showing equivalence of our desired result to some slightly simpler result, until we ground out in an induction proof we clean up in the same way, which splits into a base case and induction step.
T4.2.5.1 We want to show that
We can unpack and , to get the equivalent proof target of:
First, we expand the update, to get the equivalent proof target of:
Then we rewrite the projections to get the equivalent proof target of
Then we rewrite the semidirect products to get the equivalent proof target of
By monotonicity for , we'd be able to prove this if, for any initial state, we had
So we'll now try to prove this, let be arbitrary. We can explicitly unpack the infinite semidirect product, and get a proof target of
We'd be able to prove this if we could prove the inequality for all the n individually, as then the inequality would carry over to the limit. So, our new proof target is, for arbitrary n,
We can observe something interesting here for the update. if is compatible with , then the inf can be modeled as always picking some continuation compatible with , because doing otherwise would make the function equal 1 (or infinity). In such a case, the inner function of the top would turn into . If is incompatible with , then it doesn't matter the extension, any extension will always be incompatible with , and return 1 (or infinity). So, this can be rephrased as
Now, we can pack up the fact that we're updating on "partial history is compatible with " to get an equivalent proof target of
Now, since these inner functions are identical, we'd be able to prove this proof target if we proved the following for arbitrary functions (or ), and arbitrary as before.
So, since we're trying to prove this result for arbitrary n, the obvious thing to do is an induction proof. If we can show this by induction, we hit our proof target and we're done.
T4.2.5.2 For our base case, we'd want to show
We reexpress the update to get a proof target of
Now, the only way for to be incompatible with is if . Doing that reexpression, we get
We recall that for the iterated semidirect product, the thing is just , and . Making these substitutions, we get an equivalent proof target of
Reexpressing the semidirect product and subsituting the dirac-delta value in, we get the equivalent
We have two possible cases. In case 1, , and then our proof target turns into
Which is just true, they're equal. In case 2, , and we then get the proof target of
Which is true because
Because of mapping 1 to 1. Or, for the type signature, we'd have
By Constants Increase for . So, we've finally finished up our proof target, setting up the base case for the induction.
T4.2.5.3 Now for the induction step, which is all we need to finish up the proof. We need to somehow show that
For all functions , assuming that
holds for all functions . Let's begin. Our proof target is
We reexpress the update, to get the equivalent statement
We unpack the definition of to get
And then substitute in the definition of to get
And unpack the semidirect product and substitute in the dirac-delta value to get the proof target of
Now, looking at that gigantic first term... if is incompatible with , then the really big function is just going to turn into 1 (or infinity), which is assigned a value of 1 (or infinity) by , from the usual constants increase/1-normalization argument.
If , then we're back in the usual business except we don't need to worry about incompatibilities of the history with within , we only need to worry about those incompatiblities for the final action, . So, we can rewrite that first term with indicator functions.
And then rewrite this as just an update on , getting the equivalent proof goal
Now, we can define our as
To derive, from our induction assumption, that
So, as long as we're able to show that this quantity still lies above
,we can just put the two inequalities together, and we'll attain our proof target and the induction will go through and we'll be done. So, our new proof target is
Splitting into two cases, it could be that . In this case that big inner function in the attempted upper bound turns into just 1, and then . Or, for the type signature, we'd get .
In the second case where , we'd get
as the value of the function inside , due to the indicator function working out appropriately, and matching up with what does. This ability to break stuff up into an indicator function yields a new equivalent proof target of
We would be able to show this via inframeasure monotonicity (as the starting inframeasures are equal now) if, for all , the first function was larger than the second one. Let be arbitrary, and our proof goal is now
In the first subcase, where , because of 1-normalization (or constants moving up), our goal would be that
(or infinity). Infinity beats everything, and the inframeasures in the type signature can report a maximum value of 1, so this proof goal is just obviously true.
In the second subcase, where , our proof goal would turn into:
But they're equal. We hit our proof goal, the induction goes through, and this hits the proof goal for establishing pseudocausality, which was the last condition we needed to show our result. We're done!
0 comments
Comments sorted by top scores.