Infra-Bayesian physicalism: proofs part I

post by Vanessa Kosoy (vanessa-kosoy) · 2021-11-30T22:26:33.149Z · LW · GW · 0 comments

Contents

No comments

This post is an appendix to "Infra-Bayesian physicalism: a formal theory of naturalized induction [AF · GW]".

Lemma 1: occurs for all iff, for all and ,

This lemma will be implicitly used all over the place, in order to deal with the "presence in the bridge transform" condition algebraically, in terms of function expectations. The bulk of the conditions for a contribution to lie in the bridge transform is this endofunction condition (the support condition is trivial most of the time), so it's advantageous to reformulate it. First-off, we have that iff, for all , we have By LF-duality for ultradistributions, proved in Less Basic Inframeasure theory. A contribution lies in an ultracontribution set iff its expectations, w.r.t. all functions, are less than or equal to the ultracontribution expectations.

Now, we just need to unpack the left-hand-side into our desired form. Start off with Apply how projections work Now, we can move the indicator function into the function we're taking the expectation again, because there's no difference between deleting all measure outside of an event, or taking the expectation of a function that's 0 outside of that event. So, we get Then we use how pushforwards are defined in probability theory. And that's our desired form. So, we've shown that for any particular , we have iff, for all , And so, we get our desired iff statement by going from the iff for one to the iff for all .

Proposition 2.1: For any , and , exists and satisfies . In particular, if then .

Proof sketch: We'll show that for any particular contribution in , there's a contribution which lies within that projects down to equal . And then, show the other direction, that any contribution in lands within when you project it down. Thus, the projection of must be exactly.

For the first direction, given some , let . Note that since , this means that , so the type signatures line up. Clearly, projecting down to makes again. So that leaves showing that . Applying Lemma 1, an equivalent way of stating the bridge transform is that it consists precisely of all the s.t. for all and , and also, .

Clearly, for our given , everything works out with the support condition, so that leaves the endofunction condition. Let be arbitrary. In order, the equalities were unpacking the semidirect product, substituting the dirac-delta in, reexpressing the condition for the indicator function, using that inside the indicator function, applying monotonicity of , using that , and then packing up the definition of . And that inequality has been fulfilled, so, yes, lies within . Since was arbitrary within , this shows that the projection set is as-big-or-bigger than .

Now to show the reverse direction, that anything in the projection set lies within . For any particular , remember, it must fulfill, for all , that So, in particular, we can let be the identity function, and be anything, and we'd get And, since is in , it's supported on the pairs s.t. , so that indicator function drops out of existence, and we get And then this can be written as a projection And since this holds for all the various , this means that And we're done with the other direction of things, the projection must be exactly equal to since projections of contributions in always land in , and we can surject onto .

Proposition 2.2: Let be a poset, . Then, if and only if there exists s.t.: For all , : .

Vanessa proved this one, with a very nice proof involving the max-flow min-cut theorem.

Make the following directed graph: The nodes are the elements of . Basically, two copies of the finite poset , a source node , and a sink node . will be used to denote a variable from , while a subscript of 0 or 1 denotes that the 0 or 1 version of that , respectively.

As for the edges, there is an edge from to every point in . And an edge from every point in to . And, for some and , there's an edge iff . The capacities of the edges are, for all , , and for all , , and for s.t. , .

To work up to applying min-cut max-flow, we must show that the cut of all the edges between and is a min-cut of the network.

Remember, all cuts are induced by partitions of the nodes into two sets, and , where includes the source and includes the sink . And the value of a cut is Implicitly converting both of the following sets to be subsets of , we have for any non-infinite cut. Why? Well, if it was false, then there'd be some where , and yet . So . Then, the cost of the cut would include , contradiction. So, now, we have that for any non-infinite cut, and holds.

Now, we'll change our cut. Given some cut , let our new cut be defined as (which still respects that superset property listed above since both sets would be the same), and be the complement. Letting some stuff cancel out, we get Now, because by that subset property, it means that there's no , so we get As for that other minus term, we have that by how it was defined, so , so that other minus term is 0, and we get Rearranging this a bit, we have .

And now we'll show that is underscored by . Let's go. We have This simplifies a bit as This can be rewritten a bit as And then, using what is defined as, we can get and using what the costs are, it's This holds because and the indicator function for is monotone. And so, we get . And we previously showed that . So, this means that the cut around is a minimal cut, it underscores all other cuts.

By the max-flow min-cut theorem, there's a way of having stuff flow from to that saturates the capacities of all the edges that are cut. will be used to denote the flow from to according to this max-flow way. Let's finish things up.

Define as follows. For some , if , then If , let be any probability distribution on . Note that is always a probability distribution supported on , by fiat if , and otherwise, This is because, the flow out of must equal the flow into from the source. And is supported on because the only edges out of go to where . Now that we've got this demonstrated, we'll show our desired inequality. Fix an arbitrary . We have And then we use that all the capacities of the edges cut in the min-cut are saturated according to the max-flow plan, so , and we have Now, if , then because the flow in equals the flow out, that means that , and otherwise we can cancel out, so we can rewrite as Where the first equality came from flow in equals flow out, the inequality came from the flow plan respecting the capacity of the paths, and the equality came from how the capacities were defined. So, for all , we have so we have .

For the reverse direction, if there's some s.t is supported on s.t. , then for any monotone function , we have And then we use that, since is supported on , and is monotone, is less than or equal to the expectation of w.r.t (remember, for you have a 100 percent chance of drawing something at-or-above , which guarantees that f of whatever you picked is above ). And so, we get And since this holds for every monotone , we have .

Proposition 2.3: Let be a poset, . Then, if and only if there exists s.t. For all , : .

Proof: Because , we can get a maximum flow in exactly the same way as Proposition 2.2. Then, just flip the direction of all flows, which will be denoted by swapping the order of the subscripts. Now, define And, again, if it's 0, it should be an arbitrary probability distribution supported on . Note that is always a probability distribution supported on , by fiat if , and otherwise, This is because, the flow out of must equal the flow into from the source (the old sink). And is supported on because the only edges out of go to where . Now that we've got this demonstrated, we'll show our desired inequality. Fix an arbitrary . We have Then use that (because even with the flow reversed, the flow through a path must be less than or the same as the capacity. Accordingly, we get And we're done, inequality established, using definitions and the flow saturating all the paths out of .

So, for all , we have so we have .

For the reverse direction, if there's some s.t is supported on s.t. , then for any monotone function , we have And then we use that, since is supported on , and is monotone, is greater than or equal to the expectation of w.r.t (remember, for you have a 100 percent chance of drawing something at-or-below , which guarantees that f of whatever you picked is below ). And so, we get And since this holds for every monotone , we have .

Proposition 2.4: For any , and , is downwards closed w.r.t. the induced order on . That is, if and then .

Remember, the condition for some to lie in is that it be supported on , and that, for all , and , So, given that (lower score for all monotone functions) we'll show that fulfills both conditions. The support thing is taken care of by . As for the other one, we have That inequality occurs because as you make larger, ie, go up in the partial ordering, the value assigned to the relevant point increases since is more likely now, so the function value increases from 0 to something that may be greater than 0. So, it's a monotone function, and we then use that .

Proposition 2.5: Consider a finite set , , , and . Then, . Conversly, consider any . Then, there exist and s.t. for .

Ok, so let be some arbitrary function . We have an alternate characterization of the total variation distance as And then from there we can go And since , we have and are done.

Conversely, let the value of be , and be . It is clear that this is a probability distribution because of how was defined. The is the minimum/common overlap of the two probability distributions. Then, let , and similar for the 1. Well, as long as . If , it can be any probability distribution you want. It's a probability distribution because And since , everything works out. Now we just need to show that these add up to make and that is the same as the total variation distance. And this works symmetrically for , showing that we indeed have equality. As for showing that is the total variation distance, referring back to what we've already proved, we have And now, since and , the supports of these two probability distributions are disjoint, which implies that the total variation distance is 1, so we have .

Proposition 2.6: Consider any and . Denote (the event ``program is unrealized''). Let . Then,

This will take some work to establish. One direction, showing that the bridge transform value exceeds the total variation distance value, is fairly easy. As for the other direction, it'll take some work. Let's begin. We'll make a particular contribution . It is defined as Or, put another way, restricting to , it's , and conditioning on , it's . So, for a given , we have Now, we'll go through two exhaustive possibilities for what could be. If it maps 0 to 0, then it's And our desired inequality is established. If maps 0 to 1, then it's And that is taken care of. So, our lies in .

Now, where were we? Ah right, we were at But now we can continue with Which unpacks as And then, the value of the overlap between two distributions is . So, we've shown one direction, we have In the other direction of the equality we're trying to prove, we need to constrain what might be. Starting off with what we definitely know, we have Now it's time to show that for any , that the measure on can't be too high. Specifically, to proceed further, we'll need to show that Time to establish it. On , either , or vice-versa. Without loss of generality, assume that it's that's lower.

Then, let be the constant-0 function, and be 1, and 0 everywhere else. Then, we have Just abbreviating, using that lies in the bridge transform, deabbreviating with what is, unpacking things a little bit and canceling out. At the end we used that .

Ok, so, now that we've established the key fact that We can resume our work. We were previously at and we can proceed to and from there, using the inequality we proved, proceed to And we're done, we established that it's an upper bound as well a lower bound, so we have equality.

Lemma 2: If there's a function and , then for all , is supported on the set .

Assume that the conclusion is false, that there is some nonzero probability of drawing a pair where . In particular, there must be some special value that witnesses that isn't a subset, by and . Remember that for all and , that since , we have Now, in particular, let , and be the constant function that maps everything to . Then, this turns into which is impossible. The equality as the end was our starting assumption. The middle inequality was just specializing our inequality to a particular pair of functions. And it's greater than 0 because there's a nonzero probability of drawing . And was selected to lie in and outside of , so there's a nonzero probability of getting a nonzero value. Therefore, the result follows.

Lemma 3: For any , and , if , then .

So, the support condition is trivially fulfilled, because we're restricting to the event . That just leaves the endofunction condition. Let be arbitrary, and be arbitrary. Then we have And, since , we can use that to get And we're done, we've established that our modified version of remains in .

Proposition 2.7: If is a poset and , then will denote downward closure of . For any , and if and , then . Moreover, define by . Then, (we slightly abuse notation by treating as a mapping that doesn't depend on the first argument, and also playing loose with the order of factors in the set on which our HUCs live).

This proof splits into two parts. Part 1 will be proving the support statement. Part 2 will be proving the thing with .

Part 1: Here's the basics of how this part will work. There's some tuple in the support of . Since the support of is the union of the supports for the , there's some that assigns nonzero probability to that event. Also, . The key part of the proof is showing there's some that assigns nonzero measure to the event . Once we've got that, since we know by Proposition 2.1 that projects down to equal , that means that the projection of will land in , and it will assign nonzero measure to the event , so the event lies in the support of .

So, that just leaves appropriately defining our and showing that it lies in and assigns nonzero probability to our event of interest. Our particular choice of will be as follows. Let be the constant function mapping everything to . Applying Lemma 3, this is in , so that just leaves showing that it assigns nonzero probability to the event . We have We can simplify this a bit, because if the first three properties hold, that tells you something about what is. unpack the pushforward Then we use that it's a constant function Now, since by the problem setup, and is a tautology, we can remove those two events. and use an inequality Because assigned nonzero probability to that event. So, we know that our assigns nonzero measure to the event of interest. And that's it! It fulfills the appropriate properties to carry the proof through.

Proof part 2: First off, we can reexpress as the equivalent statement that, for all , we have Although, since and are downwards-closed, we actually only need to demonstrate this inequality for monotone functions .

The reason we only need to demonstrate this inequality for monotone is because of Where the equalities follow from Proposition 3.1 and the downward closure of the two ultracontributions, and the inequality is what we're trying to prove (since in the sense of Proposition 3.1 is always a monotone function.

So, let be monotone, and we're trying to prove the desired inequality. We'll unpack it bit by bit. And now we remember that the set is a support for , because it's the same as , ie, the support of . So, by Lemma 2, we can conclude that any is supported on pairs s.t. . In particular, this means that , and since is monotone, swapping out for always produces an increase in expected value, so we get and then, since all are supported on s.t. , we can go And then, since all the fulfill the endofunction property, we can let be identity and be , and go And rewrite that as and then since is monotone This holds because all contributions added when you take the downward closure can only produce lower expectation values than the existing contributions due to monotonicity of , and so they're ignored.

And now we're done! We got the inequality going appropriately to hit our proof target.

Proposition 2.8: For any , and

For notation, we'll use for the set which is being treated as part of the background environment, and as the usual set.

The way to establish this is to show equal expectation values for all functions monotone in the last argument (the relevant set-based one), which is all we need to do as both sets are downwards-closed. We'll do this by establishing the two inequalities separately. Our first order of business is showing that for all monotone in the last coordinate, we have Let's begin. Pick an that's monotone in the last argument. We'll set this to the side for a moment to show that if , then .

The support part is easy. Since , it's supported on pairs where . Taking semidirect product with identity means that always happens, because always happens. So, that leaves showing the endofunction condition. We have And start packing things up a bit And, since , this update of the pushforward of lands in by Lemma 3, so we get Now, since we've established that inequality for all choices of , we have that Resuming where we last left off, the last place we were at in our chain of inequalities was Since these things are always in , we can go And we're done with that inequality direction.

Now to show the reverse direction, which actually will use that is monotone in the last argument. We start with And now, we can use that is supported on pairs where , along with Lemma 2 applied to id, to conclude that all the must be supported on the event . Since big sets go more towards the top and get a higher loss, and is monotone in the last argument, we get Now, we can use the endofunction property of all the w.r.t. to get a uniform upper bound of and then go and we're done, we got the inequality going in the other direction, so we have equality for arbitrary monotone functions, and thus equality.

Proposition 2.9: For any , and , if then .

Proof: This is really easy to show, we just need to take some contribution and show that it lies in . The support condition is easily fulfilled, so that leaaves showing the endofunction condition. Let's begin. And now, since , we have Done.

Proposition 2.10: For any , , , and

As usual, we just need to establish that for all , the expectation value is lower in the first function than the second function, so our proof goal will be Let's begin trying to show this. Now we will show that all the contributions of this form lie in . Clearly the condition is always fulfilled for these pushforwards, so that just leaves the endofunction condition. Let's begin. And we're done, we established our desired result that the pushforward lands in the appropriate set. So, we can proceed by going And we're done!

Proposition 2.11: Consider any , , , , and s.t. . Then, In particular,

Well, let's get started on proving these various things. To begin with, to prove We need to prove that, for all , Let's establish this. Unpacking the left-hand side, we have Now we'll show that this lies in . The support condition is trivial, so that leaves the endofunction condition. Then, by homogenity of , we can pull a constant out, the indicator function, to get Then, by the endofunction condition, we get And bam, we've showed that lies in the appropriate set. We were last at So we can impose an upper bound of And we're done, we've established our desired inequality. Now for proving another inequality. This can be proved by applying Proposition 2.10. Start out with Proposition 2.10. Specialize to , yielding The two pushforwards can be folded into one. Now use that is identity Apply pullback along to both sides

On the other hand, we also have

Combining the last two lines, we get the desired result:

That just leaves showing the equality result, now that we've got both subset inclusions.

We start out with Applying projection yields For any given , we have Unpacking the projection on the two ends, we get And then, for the smallest and largest quantities, the pullback or pushforward only affects the or coordinates, everything else remains the same. In particular, doesn't depend on such coordinates. So, we get The left and right hand sides are equal, so we have Then, packing the projection back up, we have And since there's equality for all functions, the two ultradistributions are equal, so we have And we're done.

Proposition 2.12: For any , , and

We'll do this by showing that all contributions of the form with and lie within . The support property is trivial, so that leaves the endofunction property. Done, the mix lies in .

Proposition 2.13: Consider some , , , , and . Regard as subsets of , so that . Then,

We already established one subset direction in Proposition 2.12. We just need the other direction, to take any , and write it as where and .

We trivially have equality when or 1, so that leaves the cases where . With that, our attempted will be , and our attempted will be .

Now, clearly, these mix to make , because Remember, and are the two components of a space made by disjoint union.

They clearly both fulfill the support condition, because they're made from which fulfills the support condition. This leaves the endofunction condition, which is somewhat nontrivial to show, and will require some setup. Clearly, without loss of generality, we just need to show it for , and follows by identical arguments. For some , will denote the function that mimics on , and is 0 on .

Also, note that , as defined, is supported entirely on , and , as defined, is supported entirely on . Let's get started on showing our endofunction condition. Let and be arbitrary. This is because the expectation of an all-0 function is 0. Then, Why does this happen? Well, is supported entirely on , and mimics perfectly on . And is supported entirely on , and mimics perfectly on . So it doesn't matter that we changed the functions outside of the contribution support. Now we use the endofunction condition on to get And then we use that is supported on and is supported on , and on , , and on , , to get And we're done! The endofunction condition goes through, which is the last piece of the proof we needed.

Proposition 2.14: Consider some , , and . Let be defined by . Then, Moreover, let be an injection and be the injection induced by . Then,

We'll have to prove both directions separately. Notice that for our proofs so far, they involve transforming one side of the equation to a certain form, then there's a critical step in the middle that's tricky to show, then we just transform back down. So our first step will be unpacking the two sides of the equation up to the critical inequality.

Also, intersecting with the top distribution corresponding to a particular set is the same as the raw-update on that set, we'll use that. We'll suppress identity functions and unused coordinates for in the notation, so if some coordinate isn't mentioned, assume it's either the identity function for pushforwards, or that the set for is the specified set times the entire space for unmentioned coordinates. With that notational note, we have Suppressing notation And rewriting the bridge transform, This is our unpacking in one direction.

In the other direction, we have Suppress some notation, Unpack the pushforward and intersection And unpack the bridge transform Pack back up So, our goal to show equality overall is to establish the equality This can be done by establishing the following two things. First, is, if , then . That establishes the inequality direction.

For the reverse direction, we'll need to show that for any , also lies in , and that . This establishes the inequality direction, since any choice of for the left hand side can be duplicated on the right hand side.

Let's switch our proof target to showing these two things. First off, assume . The goal is to show that

For once, the support condition is not entirely trivial. However, notice that always has holding (because we updated) and always holds (because fulfills the support condition due to being in ). So, it's guaranteed that . Then, applying the pushforward that turns into doesn't change that there's a guarantee that lies in the set it's paired with.

Now for the endofunction condition. And we're done, that half of the proof works out.

Now for the reverse direction, establishing that for any , also lies in , and that .

The first part of this is easy. And we're done. For the second part, it's a bit trickier. In short, what happens is that a piece of measure from is deleted if , and then gets mapped to . So, if we knew that was supported on , we'd know that neither of the two transformations do anything whatsoever, and so you just get out again.

So, let's show that is supported on . For this, the way we do it is use Proposition 2.7 to get an upper bound on the bridge transformation, and show that said upper bound is also supported on . By proposition 2.7, The downwards arrow means that we're passing from larger subsets to smaller subsets, so it doesn't matter if we remove that, it won't affect whether the support is a subset of . Clearly, is supported on . And, for sus, we have So clearly it can only produce sets of which are subsets of since that's an upper bound on the support of . So we have support on , and we're done with this half of the theorem.

So now we move onto the second half of the theorem with the injections. Without loss of generality, we can reduce this problem to a slightly simpler form, where we assume that is supported over . This is because any supported over has , and also for any , is always supported over .

Accordingly, assume that is supported over . Suppressing identity functions, our goal is to prove that This is suprisingly aggravating to prove. Again, we'll try to rewrite things until we get to a point where there's just one equality to show, and then put in the work of showing the endofunction condition in both directions. Time to start rewriting. Subscripts will be used to denote when a particular is part of , or of . Similar with and . Rewriting the other direction, we have Now, for pullback, it's defined like this. Maximum over the empty set is 0. And so we now must embark on showing these two things are equal, by showing that every value that one of the maxes is capable of producing, the other is capable of producing too. Our proof goal is Now, for one direction of this, we need to show that if , then .

The support condition is easily fulfilled, because the only time the pullback works is when , and is an injection, so there's a unique preimage point , and since occurs always,then occurs always. That leaves the endofunction condition. Let , . Let be defined as follows. For points in , it maps to . Everywhere else, it's the identity. Also, , for points in , maps to , and is 0 everywhere else. Both of these are uniquely defined because is an injection. We have We can partially write the maximum as an indicator function for checking whether and (because in such a case the maximum is taken over an empty set and 0 is returned in those cases). And also, since there's only one point in that inverse, since is an injection, there's a canonical inverse and we can swap everything out for that, yielding Now, since is an injection, applying it to the point and the set in the indicator function don't affect whether the point is in the relevant set, so we can go And we can also apply and to the point in the function, as that's just identity. Use our and abbreviations since we know that the relevant point is in if it made it past the indicator function. Also do some canceling out of identity functions around the . And use our abbreviation of Remove the indicator function Apply endofunction property Now, is 0 when , and is otherwise, so we have And we're done here, the endofunction condition has been shown, establishing one direction of the inequality.

For the reverse direction, we'll be working on showing that if , then , though it will take a little bit of extra work at the end to show how this implies our desired equality. The support property obviously holds, we're just applying to our point and set, so that leaves the endofunction property. and are as usual. Now, if , it might pass the indicator function, but if , it definitely won't. So let's place that indicator function. Now, since we know this stuff is in the image of (and definitely is), and is an injection, we can safely take the inverse of everything, yielding In order, this was because we could reverse the injection as long as we're past that indicator function, reversing the injection doesn't affect whether the point is an element of the set, and is identity. Cancel out some of the identity.. Now, at this point, let be defined as when , and arbitrary otherwise. And let be defined as Now, using these abbreviations, we can go Now we can safely strip away the indicator function. And apply the endofunction condition Unpacking the pullback, we get Now unpack the definition of Use that is supported on the image of , so that max is always nonempty. Then, things cancel out to yield And we're done there.

Ok, so... we've shown one inequality direction already, all that remained to be shown was How does it help to know that for any , then ? Well, for any particular chosen on the left-hand side, you can let the choice of for the right hand side be , which lies in the appropriate set. Then, the contribution for the right-hand-side would be , which happens to be . So anything the left-hand-side can do, the right-hand-side can do as well, showing our one remaining direction of inequality, and concluding the proof.

Well, actually, I should flesh out that claim that . This happens because Now,that initial indicator condition is always true because is supported on , so we have And then, since is injective, applying it then taking the partial inverse is just identity, so we get And that's why we have equality. Alright, proof finally done.

Proposition 2.15: Consider some , , , , . Let be given by . Then,

This can be established by Proposition 2.14. Taking proposition 2.14, and using as an abbreviation for , and as the , we get We'll need to abbreviate in a bit, so ignore some of the identity functions to get Now, deabbreviating, ie is already supported on , so intersecting with does nothing. Now, since is onl supported on , the bridge transform will only be supported on . So, we can pull back along the function and pushforward and it will be identity, as our ultracontribution is entirely supported on the area that can be pulled back. So, we get Applying the second part of Proposition 2.14, this can be rewritten as Since deabbreviates as , it's supported on , so intersecting with does nothing. Getting rid of that part, and deabbreviating, we get Now, since is an injection, pushforward-then-pullback is identity, so we get Then, since the pushforward through is supported on , the intersection does nothing Then note that any set pushed forward through must be a subset of , so intersecting with does nothing, and we get And the result is proven.

0 comments

Comments sorted by top scores.