Proofs Section 1.2 (Mixtures, Updates, Pushforwards)

post by Diffractor · 2020-08-27T07:57:27.622Z · LW · GW · 0 comments

Contents

No comments

The previous proofs are here. [AF · GW]

Proposition 5: If , then the condition "there is a  where, " is equivalent to "there is a compact  s.t. "

Proof sketch: One direction is immediate from the Compactness Lemma. For showing that just a bound on the  values suffices to be contained in a compact set, instead of a bound on the  and  values to invoke the Compactness Lemma, we use a proof by contradiction where we can get a bound on the  values of the minimal points from just a bound on the  values.

Proof: In one direction, assume there's a compact  s.t. , and yet there's no upper-bounding  on the  values. This is impossible by the Compactness Lemma, since .

In the other direction, assume there's a  bound on  for the minimal points. Fix some arbitrary  for the rest of the proof. Now, we will show that all minimal points  have , and  letting us invoke the Compactness Lemma to get that everything is in a suitable compact set . The first bound is obvious. Since  came from a minimal point, it must have  as an upper bound.

For the other one, by contradiction, let's assume that there's a minimal point  where . Then, we can write  as: 

The first component,  is our fixed minimal point of interest. The second component is an sa-measure, because , due to the  upper bound on the  value of minimal points. The third component is also a nonzero sa-measure, because  is nonnegative (it came from a minimal point), and by assumption, . Hang on, we wrote a minimal point  as another minimal point , plus two sa-measures (one of which is nonzero), so  can't be minimal, and we have a contradiction.

Therefore, all  have . Now that we have bounds on  and  for minimal points, we can invoke the Compactness Lemma to conclude that everything is in a compact set.

 

Proposition 6:  only occurs when there's only one minimal point of the form .

Proof: Unpacking the expectations, and in light of Proposition 3,

 and 

So, take a minimal a-measure  that minimizes . One must exist because we have  and  bounds, so by the Compactness Lemma, we can restrict our attention to an actual compact set, and continuous functions from a compact set to  have a minimum, so there's an actual minimizing minimal point.

 must be 0, because otherwise  which contradicts . Further, since , said  must be the lowest  possible amongst minimal points. 

So, we have a minimal point of the form  where  is the lowest possible  amongst the minimal points. Any other distinct minimal point must be of the form , where . This other minimal point can be written as , where the latter component is an sa-measure, so it's not minimal. Thus, there's only one minimal a-measure and it's of the form .

 

Proposition 7: Renormalizing a bounded inframeasure produces a bounded infradistribution, if renormalization doesn't fail.

Proof sketch: Our first order of business is showing that our renormalization process doesn't map anything outside the cone of sa-measures. A variant of this argument establishes that the preimage of a minimal point in  must be a minimal point in , which quickly establishes positive-minimals and bounded-minimals for . Then, we verify the other conditions of a bounded infradistribution. Nonemptiness, closure, and convexity are very easy, upper-closure is shown by adding appropriately-scaled sa-measures such that, after renormalization, they hit whatever sa-measure you want. Then, finally, we just have to verify that our renormalization procedure is the right one to use, that it makes  and .

Proof: First up, we need to show that after renormalization, nothing gets mapped outside the cone of sa-measures. Observe that the renormalization process is injective. If two points are distinct, after a scale-and-shift, they'll still be distinct.

Let B be our original set and  be our renormalized set. Take a point in , given by . Undoing the renormalization, we get .

By decomposition into a minimal point and something else via Theorem 2, we get that

where . Renormalizing back, we get that

, obviously, because  is the minimal  value amongst the minimal points. So, the first component is an a-measure, the second component is an sa-measure, so adding them is an sa-measure, and then we scale by a nonnegative constant, so  is an sa-measure as well.

This general line of argument also establishes positive-minimals and bounded-minimals, as we'll now show. If the  isn't 0, then we just wrote  as

And the first component lies in , but the latter component is nonzero, witnessing that  isn't minimal. So, if  is minimal in , then , so it must be the image of a single minimal point  by injectivity. Ie, the preimage of a minimal point in  is a minimal point in .

Scale-and-shift maps a-measures to a-measures, showing positive-minimals, and the positive scale constant of  just scales up the  upper bound on the  values of the minimal points in , showing bounded-minimals.

For the remaining conditions, nonemptiness, closure, and convexity are trivial. We're taking a nonempty closed convex set and doing a scale-and-shift so it's nonempty closed convex.

Time for upper-completeness. Letting  be our original set and  be our renormalized set, take a point  in . By injectivity,  has a single preimage point . Undoing the renormalization by multiplying by  (our addition of  is paired with  to undo the renormalization on that one), consider  This lies in  by upper-completeness, and renormalizing it back produces , which is in , so  is upper-complete.

That just leaves showing that after renormalizing, we're normalized. 

For the other part,

And we're done.

 

Lemma 6:  is a continuous linear operator.

Proof sketch: First show linearity, then continuity, for the operator that just maps a signed measure through , using some equation-crunching and characterizations of continuity. Then, since  is just the pair of that and the identity function, it's trivial to show that it's linear and continuous.

We'll use  to refer to the function  defined by , where  is a measurable subset of  and . Ie, this specifies what the measure  is in terms of telling you what value it assigns to all measurable subsets of .

We'll use  to refer to the function  given by .

Our first order of business is establishing the linearity of . Observe that, for all measurable , and  being real numbers, and  being signed measures over ,

So,  and we have linearity of .

Now for continuity of . Let  limit to . The sequence  converging to  in our metric on  is equivalent to:  

So, if  fails to converge to , then there is some continuous function  that witnesses the failure of convergence. But, because  is a continuous function , then , and also , so:

The key step in the middle is that  limits to , so  limits to , by our characterization of continuity. Thus, we get a contradiction, our  that witnesses the failure of convergence actually does converge. Therefore,  limits to  if  limits to , so  is continuous.

To finish up, continuity for  comes from the product of two continuous functions being continuous ( which we showed already, and  because duh), and linearity comes from:

 

Proposition 8: If  and  is a continuous function , then 

 

Proposition 9:  is a (bounded) inframeasure if  is, and it doesn't require upper completion if  is surjective.

Proof sketch: Nonemptiness is obvious, and showing that it maps sa-measures to sa-measures is also pretty easy. Closure takes a rather long argument that the image of any closed subset of sa-measures over , through , is closed, which is fairly tedious. We may or may not invoke upper completion afterwards, but if we do, we can just appeal to the lemma that the upper completion of a closed set is closed. Convexity is immediate from linearity of .

For upper completion, we can just go "we took the upper completion" if  isn't surjective, but we also need to show that we don't need to take the upper completion if  is surjective, which requires crafting a measurable inverse function to g via the Kuratowski-Ryll-Nardzewski selection theorem, in order to craft suitable preimage points.

Then we can use LF-Duality to characterize the induced  function, along with Proposition 8, which lets us get positive-minimals, bounded-minimals, and normalization fairly easily, wrapping up the proof.

Proof: Nonemptiness is obvious. For showing that it takes sa-measures to sa-measures, take an , and map it through to get  is an sa-measure, so . Now, we can use Lemma 5 to get:

So the  term is indeed big enough that the image of  is an sa-measure.

For closure, fix a sequence of  limiting to some , with preimage points . Due to convergence of  there must be some  bound on the  preserves those values, so  is an upper bound on the . Since the  are sa-measures,  is a lower bound on the  values. Since  converges to  converges to , so there's a  upper bound on the  values. Further,

So, for all n, , so we have an upper bound on the  and  values. Now we can invoke the Compactness Lemma to conclude that there's a convergent subsequence of the , with a limit point , which must be in  since  is closed. By continuity of  from Lemma 6,  must equal , witnessing that . So,  is closed. Now, if we take upper completion afterwards, we can just invoke Lemma 2 to conclude that the upper completion of a closed set of sa-measures is closed.

Also,  is linear from Lemma 6, so it maps convex sets to convex sets getting convexity.

Now for upper completion. Upper completion is immediate if  isn't surjective, because we had to take the upper completion there. Showing we don't need upper completion if  is surjective is trickier. We must show that  is a surjection from  to .

First, we'll show that  where  is an open subset of  is a measurable subset of . In metrizable spaces (of which  is one), every open set is a  set, ie, it can be written as a countable union of closed sets. Because our space is compact, all those closed sets are compact. And the continuous image of a compact set is a compact set, ie closed. Therefore,  is a countable union of closed sets, ie, measurable.

 is a Polish space (all compact metric spaces are Polish), it has the Borel -algebra, and we'll use the function . Note that  is closed and nonempty for all  due to  being a continuous surjection. Further, the set  equals  for all open sets . In one direction, if the point  is in the first set, then there's some point  where . In the other direction, if a point  is in , then there's some point  where  so  is nonempty.

Thus,  is weakly measurable, because for all open sets  of  and  is measurable. Now, by the Kuratowski-Ryll-Nardzewski Measurable Selection Theorem, we get a measurable function  from  to  where  so , and  is an injection.

So, we can push any sa-measure of interest  through  (which preserves the amount of negative measure due to being an injection), to get an sa-measure that, when pushed through  recovers  exactly. Thus, if , and you want to show , just consider

So, since  due to upper-completeness, then  And we have shown upper-completeness of  if  is a surjection.

We should specify something about using LF-Duality here. If you look back through the proof of Theorem 5 carefully, the only conditions you really need for isomorphism are (on the set side)  being closed, convex, and upper complete (in order to use Proposition 2 to rewrite  appropriately for the subsequent arguments, we have these properties), and (on the functional side),  being concave (free),  if  (by proof of Theorem 4, comes from upper completeness), and continuous over  (showable by Proposition 8 that , and the latter being continuous since  is an infradistribution)

It's a bit of a pain to run through this argument over and over again, so we just need to remember that if you can show closure, convexity, upper completeness, and the expectations to be continuous, that's enough to invoke LF-Duality and clean up the minimal point conditions. We did that, so we can invoke LF-Duality now.

Time for normalization. From Proposition 8, the  function we get from  is uniquely characterized as: . So,

and normalization is taken care of.

For bounded-minimals/weak-bounded-minimals, since  is the LF-dual of , we can appeal to Theorem 5 and just check whether  is Lipschitz/uniformly continuous. if , then  according to the sup metric on  and , respectively, which (depending on whether we're dealing with Lipschitzness or uniform continuity), implies that , or  for uniform continuity. So, we get:  (or  for uniform continuity), thus establishing that  and  being sufficiently close means that  doesn't change much, which, by Theorem 5, implies bounded-minimals/weak-bounded-minimals in .

For positive-minimals it's another Theorem 5 argument. If , then , so:  And we have monotonicity for , which, by Theorem 5, translates into positive-minimals on .

 

Lemma 7: If , then for all decompositions of  into 

This is easy. Decompose  into . To derive a contradiction, assume there exists a nonminimal  that decomposes into  where . Then,

Thus, we have decomposed our minimal point into another point which is also present in , and a nonzero sa-measure because there's a nonzero  so our original "minimal point" is nonminimal. Therefore, all decompositions of a minimal point in the mixture set must have every component part being minimal as well.

 

Proposition 10: 

Done.

 

Proposition 11: A mixture of infradistributions is an infradistribution. If it's a mixture of bounded infradistributions with Lipschitz constants on their associated  functions of , and , then the mixture is a bounded infradistribution.

Proof sketch: Nonemptiness, convexity, upper completion, and normalization are pretty easy to show. Closure is a nightmare.

The proof sketch of Closure is: Take a sequence  limiting to . Since each approximating point is a mixture of points from the , we can shatter each of these  into countably many  . This defines a sequence in each  (not necessarily convergent). Then, we take some bounds on the  and manage to translate them into (rather weak) i-dependent bounds on the  sequence. This lets us invoke the Compactness Lemma and view everything as wandering around in a compact set, regardless of . Then, we take the product of these compact sets to view everything as a single sequence in the product of compact sets, which is compact by Tychonoff's theorem. This is only a countable product of compact metric spaces, so we don't need full axiom of choice. Anyways, we isolate a convergent subsequence in there, which makes a convergent subsequence in each of the . And then, we can ask "what happens when we mix the limit points in the  according to ?" Well, what we can do is just take a partial sum of the mixture of limit points, like the i from 0 to 1 zillion. We can establish that  gets arbitrarily close to the upper completion of a partial sum of the mixture of limit points, so  lies above all the partial sums of our limit points. We show that the partial sums don't have multiple limits, then, we just do one more invocation of Lemma 3 to conclude that the mixture of limit points lies below . Finally, we appeal to upper completion to conclude that  is in our mixed set of interest. Whew!

Once those first 4 are out of the way, we can then invoke Theorem 5 to translate to the  view, and mop up the remaining minimal-point conditions.

First, nonemptiness. By Theorem 5, we can go "hm, the  are monotone on , and  everywhere else, and , so the affine functional  lies above the graph of ". This translates to the point  being present in all the . Then, we can just go: , so we have a point in our  set.

For normalization, appeal to Proposition 10 and normalization for all the  and .

Convexity is another easy one. Take a . They shatter into . Then, we can just go:

and then, by convexity of the , so we wrote  as a mixture of points in .

Upper completion is another easy one, because, if , then you can go

And  by upper completion.

That leaves the nightmare of closure. Fix a sequence  limiting to . You can think of the  as . We can shatter the  into , where  can be thought of as 

Now, since  converge to something, there must be an upper bound on the  and  terms of the sequence, call those  and . Now, for all n and all  so, for all n and i, .

Also, for all n and  and reshuffling, we get  which then makes  . Further, due to  being  a sa-measure, , so for all n and i, .

Ok, so taking stock of what we've shown so far, it's that for all i, the sequence  is roaming about within  And, by the Compactness Lemma, this set is compact, since it's got bounds (weak bounds, but bounds nonetheless). Defining

where , we can view everything as one single sequence  wandering around in the product of compact sets. By Tychonoff's theorem (we've only got a countable product of compact metric spaces, so we don't need full axiom of choice, dependent choice suffices), we can fix a convergent subsequence of this, and the projections of this subsequence to every  converge.

Ok, so we've got a subsequence of n where, regardless of i,  converge to some  (by closure of ). How does that help us? We don't even know if mixing these limit points converges to something or runs off to infinity. Well... fix any j you like, we'll just look at the partial sum of the first j components. Also fix any  you please. On our subsequence of interest, the  converge to , and in all i, the  converge to . So, let n be large enough (and in our subsequence) that , and , we can always find such an n.

Now,  is a well-defined point (because it's a finite sum of points plus a convergent sequence as witnessed by the well-definedness of  which breaks down as ) It also lies in the upper completion of the single point . We'll show that this point is close to . Since we're working in a space with a norm,

This will come in handy in the later equations.

So,  is less than  away from the upper completion of the point , which is a closed set (Minkowski sum of a closed and compact set is closed).  can be shrank to 0 with increasing n, so  has distance 0 from the upper completion of said partial sum, and thus lies above the partial sum!

Abbreviating  as , we get that all the  lie in , and are all sa-measures. Thus, if the sequence  converges to a unique point, then said limit point is , and all the , so  would lie in . Further, by Lemma 3, , since that set is compact, so  lies above , and would lie in  by upper-completeness.

So, all that's left to wrap up our closure argument is showing that the sequence  has a single limit point. Since it's wandering around in  which is compact by Lemma 3, there are convergent subsequences. All we have to show now is that all convergent subsequences must have the same limit point.

Assume this is false, and there's two distinct limit points of the sequence , call them  and . Because it's impossible for two points to both be above another (in the minimal-point/adding-points sense), without both points being identical, either , or vice-versa. Without loss of generality, assume . Since the latter is a closed set,  must be  away for some . Fix some j from the subsequence that  is a limit point of, where . There must be some strictly greater  from the subsequence that  is a limit point of.

Further, the  are nonzero. Also, no  can be the 0 point, because , and if , then , which is impossible by normalization. So,  lies strictly below . Also,  lies below , because for all the ,

so  for all . The sequence that limits to  is roaming around in this set, which is closed because the sum of a compact set (a single point) and a closed set is closed. So,  lies above  which lies above . Thus, . However,  is  or less distance from , which must be  distance from , and we have a contradiction.

Ok, so the sequence of partial sums  has a single limit point, which is , and all the , so , and by Lemma 3, , since that set is compact, so  lies above , and lies in  by upper-completeness. We're done!

For minimals, by our argument about what it takes to invoke LF-Duality in Proposition 9,  we only need convexity, closure, and upper completion (which we have), and that the  induced by  is continuous. By Proposition 10, . We might as well go for uniform continuity since all the  are infradistributions, and so fulfill weak-bounded-minimals, so their  are uniformly continuous. Then, this continuity lets you invoke LF-Duality, and transfer uniform continuity for the  induced by  to weak-bounded-minimals for  

For uniform continuity/weak-bounded-minimals, given an arbitrary , we can pick a finite j where , and a finite  where, for all  with  implies . Monotonicity and normalization for the  ensures that, no matter what, , so regardless of the . Then, we can go: Ok, if , then

And by our earlier argument, we invoke LF-Duality and pick up weak-bounded-minimals. 

For positive-minimals, we can just observe that, if , then

By monotonicity for the  because  had positive-minimals. Going back to , since its associated  is monotone, it must have positive-minimals as well.

For bounded minimals assuming the Lipschitz constants aren't too big, fix some . We know that , where  is the Lipschitz constant of . So, if , then:

So,  is a finite constant, and is an upper bound on the Lipschitz constant of the mixture of the , so the  corresponding to  has a Lipschitz constant, which, by Theorem 5, translates to bounded-minimals. And we're done.

 

Proposition 12: 

Let's use Theorem 5 to translate this into the concave functional setting. We want to show that  Now, given any function ,

and we're done! The two concave functionals corresponding to those two sets are the same, so the sets themselves are the same.

 

Lemma 8: The "raw update"  defined by  is a continuous linear operator.

For linearity,

Now for continuity.  limits to  if, for all  limits to . Observe that , and  is continuous.

Now, for any  we can go

establishing continuity in the first vector component, by  limiting to . For the second vector component,

So we have continuity in the second vector component as well, and we're done.

 

Lemma 9: 

As a recap, the raw update function  is: 

Take a point . Now there must be a preimage point  that, when we apply , produces . Because  is in an infradistribution, we can decompose it into a minimal point and something else, . Then,

This was done by using linearity of  via Lemma 8.

Note that, since we have written  as a sum of a different point also in  and an sa-measure, but  is minimal in , the sa-measure must be 0, so , and we're done.

 

Proposition 13: When updating a bounded infradistribution over , if the renormalization doesn't fail, you get a bounded infradistribution over the set . (for infradistributions in general, you may have to take the closure)

Proof sketch: It doesn't matter whether you take upper-completion before or after renormalization, so we can appeal to Proposition 7: Renormalizing a bounded inframeasure produces a bounded infradistribution (if the renormalization doesn't fail).

So, we just have to show nonemptiness, convexity, upper-completion (trivial), positive-minimals/bounded minimals (by Lemma 9, the preimage of a minimal point contains a minimal point, so we can transfer over the properties from the minimal point in the preimage), and closure. The set of minimal points in  is contained in a compact set, so we can take a sequence in , split into a component in  and something else, take preimage points, get minimals below all of them, isolate a convergent subsequence, map the limit point back through, and show that the limit point lands under your point of interest. That establishes all conditions for a bounded inframeasure, so then we just have to check that our renormalization is the right one to do.

Proof: Nonemptiness is trivial,  isn't a partial function. Upper-completion is also trivial, because we explicitly took the upper completion. For convexity, observe that  is a linear operator by Lemma 7, so it maps convex sets to convex sets, and the Minkowski sum of two convex sets is convex.  maps sa-measures to sa-measures, because

For positive-minimals and bounded-minimals, we invoke Lemma 9, . All minimal points in  must have a preimage minimal in , which is an a-measure. Chopping down a measure by  keeps it a measure, so we still have no negative components post-update, and all minimal points in  are a-measures. Similarly, chopping down a measure by  reduces the  value, and we had an upper bound of  originally, so the upper bound still works post-update. This gets bounded-minimals.

This just leaves closure. Fix a sequence  in  limiting to . The  break down into , where  further breaks down into , where . By Proposition 5, the  sequence is wandering around in a compact set since we have bounded-minimals on , so there's a convergent subsequence which has a limit point . Map that convergent subsequence and limit point through  which is continuous by Lemma 8 to get a sequence of points  limiting to . Fix some really big n where  and .

Now,  lies in the upper completion of the point . We'll show that this sum of 3 terms is close to . Since we're working in a Banach space, , by norm arguments.

So, is within  of the upper completion of  for all , and it's a closed set, so  lies above , so , and we have closure.

Now that all prerequisite conditions have been established, we just need to show that  and  are the proper renormalization constants to use.

The proper renormalization to use is:  for the scale, and  for the shift. So let's unpack these quantities.

So, our shift constant checks out, it's the proper shift constant to use. In the other direction,

For the scale constant, observe that 

So our scale constant is also the right scale constant to use. Now, we can invoke Proposition 7: Renormalizing a bounded inframeasure produces a bounded infradistribution if the renormalization doesn't fail.

 

Proposition 14: 

Proof: if , then

Now, if , then  so, for any  by monotonicity for the  induced by , and , so . Therefore,

and we get our same result.

 

Proposition 15: 

Proof sketch: First, we do some shuffling around of the stars to get a lemma that will help. Then, we can use the link between updated sets and their associated concave functionals h, getting the identity purely on the concave functional level, where it's much easier to approach.

Proof: First, the star shuffling. For any , we'll show that

.

Let's begin. First, let's deal with points  where , because that gets you a divide-by-zero error.

and we're done with the divide-by-zero case. In the other case, we can safely assume there's no divide-by-zero errors.

Ok, so we've established our crucial  identity. Let's proceed. Updates for concave functionals are: 

Importing Proposition 14,  and rearranging it (and unpacking the definition of ), we get 

So, updating fulfills the positive functional definition of update, because this transfers into  which is exactly our concave functional definition of updating. So, in order to verify that the two updates equal the one big update, we could just show that their concave functional definitions are equivalent.  would, on the concave functional level, turn into:

and now we can use our earlier star identity to rewrite as:

establishing our identity of updating twice, vs one big update of a different form.

 

Corollary 2: Regardless of L and  and , then 

Just use Proposition 15, and notice that:  getting us our result.

 

Corollary 3: If  and  are clopen sets, then, abusing notation by glossing over the difference between indicator functions and sets, 

Invoke Corollary 2, and observe that .

 

Lemma 10: 

Proof: Invoke Proposition 10 to go:

 

Theorem 6:  If the update doesn't fail.

Proof: Let  be defined as  It is a probability distribution, because if all , then , and so by Lemma 10, , which would cause the update to fail.

The left-hand-side corresponds to  on the concave functional level, and the right-hand-side corresponds to  on the concave functional level. Let's begin unpacking. Lemma 10 will be used throughout, as well as the definition of .

So,  as desired, which shows our result.

0 comments

Comments sorted by top scores.