Self-Referential Probabilistic Logic Admits the Payor's Lemma

randomwalks

Self-Referential Probabilistic Logic Admits the Payor's Lemma

post by Yudhister Kumar (randomwalks) · 2023-11-28T10:27:29.029Z · LW · GW · 14 comments

  Background
      Proof:
  Setup
  Proof
    Rules of Inference
  Bots
      Proof:
  Acknowledgements
None
14 comments

In summary: A probabilistic version of the Payor's Lemma [AF · GW] holds under the logic proposed in the Definability of Truth in Probabilistic Logic. This gives us modal fixed-point-esque group cooperation even under probabilistic guarantees.

EDIT 10/24: I think the way the way this post is framed is somewhat confused and should not be taken literally. However, I do stand by some of the core intuitions here.

Background

Payor's Lemma: If then $⊢ x .$

We assume two rules of inference:

Necessitation: $⊢ x ⟹ ⊢ □ x,$
Distributivity: $⊢ □ (x \to y) ⟹ ⊢ □ x \to □ y .$

Proof:

$⊢ x \to (□ x \to x),$ by tautology;
$⊢ □ x \to □ (□ x \to x),$ by 1 via necessitation and distributivity;
$⊢ □ (□ x \to x) \to x$ , by assumption;
$⊢ □ x \to x,$ from 2 and 3 by modus ponens;
$⊢ □ (□ x \to x),$ from 4 by necessitation;
$⊢ x,$ from 5 and 3 by modus ponens.

The Payor's Lemma is provable in all normal modal logics (as it can be proved in $K,$ the weakest, because it only uses necessitation and distributivity). Its proof sidesteps the assertion of an arbitrary modal fixedpoint, does not require internal necessitation ( $⊢ □ x ⟹ ⊢ □ □ x$ ), and provides the groundwork for Lobian handshake-based cooperation without Lob's theorem.

It is known that Lob's theorem fails to hold in reflective theories of logical uncertainty [LW · GW]. However, a proof of a probabilistic Payor's lemma [AF · GW] has been proposed, which modifies the rules of inference necessary to be:

Necessitation: $⊢ x ⟹ ⊢ □_{p} x,$
Weak Distributivity: $⊢ x \to y ⟹ ⊢ □_{p} x \to □_{p} y .$ where here we take $□_{p}$ to be an operator which returns True if the internal credence of $x$ is greater than $p$ and False if not. (Formalisms incoming).

The question is then: does there exist a consistent formalism under which these rules of inference hold? The answer is yes, and it is provided by Christiano 2012.

Setup

(Regurgitation and rewording of the relevant parts of the Definability of Truth)

Let $L$ be some language and $T$ be a theory over that language. Assume that $L$ is powerful enough to admit a Godel encoding and that it contains terms which correspond to the rational numbers $Q .$ Let $ϕ_{1}, ϕ_{2} \dots$ be some fixed enumeration of all sentences in $L .$ Let $┌ ϕ ┐$ represent the Godel encoding of $ϕ .$

We are interested in the existence and behavior of a function $P : L \to [0, 1]^{L},$ which assigns a real-valued probability in $[0, 1]$ to each well-formed sentence of $L .$ We are guaranteed the coherency of $P$ with the following assumptions:

For all $ϕ, ψ \in L$ we have that $P (ϕ) = P (ϕ \land ψ) + P (ϕ \lor \neg ψ) .$
For each tautology $ϕ,$ we have $P (ϕ) = 1.$
For each contradiction $ϕ,$ we have $P (ϕ) = 0.$

Note: I think that 2 & 3 are redundant (as says John Baez), and that these axioms do not necessarily constrain $P$ to $[0, 1]$ in and of themselves (hence the extra restriction). However, neither concern is relevant to the result.

A coherent $P$ corresponds to a distribution $μ$ over models of $L .$ A coherent $P$ which gives probability 1 to $T$ corresponds to a distribution $μ$ over models of $T$ . We denote a function which generates a distribution over models of a given theory $T$ as $P_{T} .$

Syntactic-Probabilistic Correspondence: Observe that $P_{T} (ϕ) = 1 ⟺ T ⊢ ϕ .$ This allows us to interchange the notions of syntactic consequence and probabilistic certainty.

Now, we want $P$ to give sane probabilities to sentences which talk about the probability $P$ gives them. This means that we need some way of giving $L$ the ability to talk about itself.

Consider the formula $B e l .$ $B e l$ takes as input the Godel encodings of sentences. $B e l (┌ ϕ ┐)$ contains arbitrarily precise information about $P (ϕ) .$ In other words, if $P (ϕ) = p,$ then the statement $B e l (┌ ϕ ┐) > a$ is True for all $a < p,$ and the statement $B e l (┌ ϕ ┐) > b$ is False for all $b > p .$ $B e l$ is fundamentally a part of the system, as opposed to being some metalanguage concept.

(These are identical properties to that represented in Christiano 2012 by $P (┌ ϕ ┐) .$ I simply choose to represent $P (┌ ϕ ┐)$ with $B e l (┌ ϕ ┐)$ as it (1) reduces notational uncertainty and (2) seems to be more in the spirit of Godel's $B e w$ for provability logic).

Let $L^{'}$ denote the language created by affixing $B e l$ to $L .$ Then, there exists a coherent $P_{T}$ for a given consistent theory $T$ over $L$ such that the following reflection principle is satisfied: $\forall ϕ \in L^{'} \forall a, b, \in Q : (a < P_{T} (ϕ) < b) ⟹ P_{T} (a < B e l (┌ ϕ ┐) < b) = 1.$ In other words, $a < P_{T} (ϕ) < b$ implies $T ⊢ a < B e l (┌ ϕ ┐) < b .$

Proof

(From now, for simplicity, we use $P$ to refer to $P_{T}$ and $⊢$ to refer to $T ⊢ .$ You can think of this as fixing some theory $T$ and operating within it).

Let $□_{p} (ϕ)$ represent the sentence $B e l (┌ ϕ ┐) > p,$ for some $p \in Q .$ We abbreviate $□_{p} (ϕ)$ as $□_{p} ϕ .$ Then, we have the following:

Probabilistic Payor's Lemma: If $⊢ □_{p} (□_{p} x \to x) \to x,$ then $⊢ x .$

Proof as per Demski [LW · GW]:

$⊢ x \to (□_{p} x \to x),$ by tautology;
$⊢ □_{p} x \to □_{p} (□_{p} x \to x),$ by 1 via weak distributivity,
$⊢ □_{p} (□_{p} x \to x) \to x$ , by assumption;
$⊢ □_{p} x \to x,$ from 2 and 3 by modus ponens;
$⊢ □_{p} (□_{p} x \to x),$ from 4 by necessitation;
$⊢ x,$ from 5 and 3 by modus ponens.

Rules of Inference

Necessitation: $⊢ x ⟹ ⊢ □_{p} x .$ If $⊢ x,$ then $P (x) = 1$ by syntactic-probabilistic correspondence, so by the reflection principle we have $P (□_{p} x) = 1,$ and as such $⊢ □_{p} x$ by syntactic-probabilistic correspondence.

Weak Distributivity: $⊢ x \to y ⟹ ⊢ □_{p} x \to □_{p} y .$ The proof of this is slightly more involved.

From $⊢ x \to y$ we have (via correspondence) that $P (x \to y) = 1,$ so $P (\neg x \lor y) = 1.$ We want to prove that $P (□_{p} x \to □_{p} y) = 1$ from this, or $P ((B e l (┌ x ┐) \leq p) \lor (B e l (┌ y ┐) > p)) = 1.$ We can do casework on $x$ . If $P (x) \leq p,$ then weak distributivity follows from vacuousness. If $P (x) > p,$ then as $\begin{matrix} P (\neg x \lor y) & = P (x \land (\neg x \lor y)) + P (\neg x \land (\neg x \lor y)), 1 & = P (x \land y) + P (\neg x \lor (\neg x \land y)), 1 & = P (x \land y) + P (\neg x), \end{matrix}$ $P (\neg x) < 1 - p,$ so $P (x \land y) < p,$ and therefore $P (y) > p .$ Then, $B e l (┌ y ┐) > p$ is True by reflection, so by correspondence it follows that $⊢ □_{p} x \to □_{p} y .$

(I'm pretty sure this modal logic, following necessitation and weak distributivity, is not normal (it's weaker than $K$ ). This may have some implications? But in the 'agent' context I don't think that restricting ourselves to modal logics makes sense).

Bots

Consider agents $A, B, C$ which return True to signify cooperation in a multi-agent Prisoner's Dilemma and False to signify defection. (Similar setup to Critch's [AF · GW] ). Each agent has 'beliefs' $P_{A}, P_{B}, P_{C} : L \to [0, 1]^{L}$ representing their credences over all formal statements in their respective languages (we are assuming they share the same language: this is unnecessary).

Each agent has the ability to reason about their own 'beliefs' about the world arbitrarily precisely, and this allows them full knowledge of their utility function (if they are VNM agents, and up to the complexity of the world-states they can internally represent). Then, these agents can be modeled with Christiano's probabilistic logic! And I would argue it is natural to do so (you could easily imagine an agent having access to its own beliefs with arbitrary precision by, say, repeatedly querying its own preferences).

Then, if $A, B, C$ each behave in the following manner:

$⊢ □_{a} (□_{e} E \to E) \to A,$
$⊢ □_{b} (□_{e} E \to E) \to B,$
$⊢ □_{c} (□_{e} E \to E) \to C,$

where $E = A \land B \land C$ and $e = max ({a, b, c}),$ they will cooperate by the probabilistic Payor's lemma.

Proof:

$⊢ □_{a} (□_{e} E \to E) \land □_{b} (□_{e} E \to E) \land □_{c} (□_{e} E \to E) \to A \land B \land C,$ via conjunction;
$⊢ □_{e} (□_{e} E \to E) \to E,$ as if the $e$ -threshold is satisfied all others are as well;
$⊢ E,$ by probabilistic Payor.

This can be extended to arbitrarily many agents. Moreso, the valuable insight here is that cooperation is achieved when the evidence that the group cooperates exceeds each and every member's individual threshold for cooperation. A formalism of the intuitive strategy 'I will only cooperate if there are no defectors' (or perhaps 'we will only cooperate if there are no defectors').

It is important to note that any $P$ is going to be uncomputable. However, I think modeling agents as having arbitrary access to their beliefs is in line with existing 'ideal' models (think VNM -- I suspect that this formalism closely maps to VNM agents that have access to arbitrary information about their utility function, at least in the form of preferences), and these agents play well with modal fixedpoint cooperation.

Acknowledgements

This work was done while I was a 2023 Summer Research Fellow at the Center on Long-Term Risk. Many thanks to Abram Demski, my mentor who got me started on this project, as well as Sam Eisenstat for some helpful conversations. CLR was a great place to work! Would highly recommend if you're interested in s-risk reduction.

14 comments

Comments sorted by top scores.

comment by SMK (Sylvester Kollin) · 2023-11-29T10:56:01.611Z · LW(p) · GW(p)

We assume two rules of inference:

Necessitation:
Distributivity: $⊢ □ (x \to y) ⟹ ⊢ □ x \to □ y$

Is there a reason why this differs from the standard presentation of K? Normally you would say that K is generated by the following (coupled with substitution):

Axioms:
- All tautologies of propositional logic.
- Distribution: $□ (x \to y) \to (□ x \to □ y)$ .

Rules of inference:
- Necessitation: $⟨ x, □ x ⟩$ .
- Modus ponens: $⟨ x \to y, x, y ⟩$ .

Replies from: randomwalks

↑ comment by Yudhister Kumar (randomwalks) · 2023-11-30T13:41:33.655Z · LW(p) · GW(p)

No particular reason (this is the setup used by Demski in his original probabilistic Payor post [LW · GW]).

I agree this is nonstandard though! To consider necessitation as a rule of inference & not mentioning modus ponens. Part of the justification is that probabilistic weak distributivity () seems to be much closer to a 'rule of inference' than an axiom for me (or, at least, given the probabilistic logic setup we're using it's already a tautology?).

On reflection, this presentation makes more sense to me (or at least gives me a better sense of what's going on / what's different between $□_{p}$ logic and $□$ logic). I am pretty sure they're interchangeable however.

Replies from: Sylvester Kollin

↑ comment by SMK (Sylvester Kollin) · 2023-12-03T12:59:46.069Z · LW(p) · GW(p)

Thanks.

I am pretty sure they're interchangeable however.

Do you have a reference for this? Or perhaps there is a quick proof that could convince me?

Replies from: randomwalks

↑ comment by Yudhister Kumar (randomwalks) · 2023-12-03T15:34:08.859Z · LW(p) · GW(p)

Payor's Lemma holds in provability logic, distributivity is invoked when moving from step 1) to step 2) and this can be accomplished by considering all instances of distributivity to be true by axiom & using modus ponens. This section should probably be rewritten with the standard presentation of K to avoid confusion.

W.r.t. to this presentation of probabilistic logic, let's see what the analogous generator would be:

Axioms:

all tautologies of Christiano's logic
all instances of (weak distributivity) --- which hold for the reasons in the post

Rules of inference:

Necessitation $⟨ x, □_{p} x ⟩$
Modus Ponens $⟨ x \to y, x, y ⟩$

Then, again, step 1 to 2 of the proof of the probabilistic payor's lemma is shown by considering the axiom of weak distributivity and using modus ponens.

(actually, these are pretty rough thoughts. Unsure what the mapping is to the probabilistic version, and if the axiom schema holds in the same way)

comment by Algon · 2023-11-28T11:14:41.217Z · LW(p) · GW(p)

This can be extended to arbitrarily many agents. Moreso, the valuable insight here is that cooperation is achieved when the evidence that the group cooperates exceeds each and every member's individual threshold for cooperation. A formalism of the intuitive strategy 'I will only cooperate if there are no defectors' (or perhaps 'we will only cooperate if there are no defectors').

You should include the highlighted insight in your summary. Also, why does your setup not lead to inconsistencies when Abram Demski isn't sure his setup [LW · GW] does? Is it just that you don't have ", then $⊢ p (┌ a ┐) \leq p (┌ b ┐)$ "?

Replies from: randomwalks, abramdemski

↑ comment by Yudhister Kumar (randomwalks) · 2023-11-28T11:34:54.014Z · LW(p) · GW(p)

We know that the self-referential probabilistic logic proposed in Christiano 2012 is consistent. So, if we can get probabilistic Payor in this logic, then as we are already operating within a consistent system this should be a legitimate result.

Will respond more in depth later!

↑ comment by abramdemski · 2024-07-25T16:48:21.533Z · LW(p) · GW(p)

Yudhister's treatment here does not satisfy me: the assumption he calls syntactic-probabilistic correspondence is false. For example, in Paul's probability distributions, the self-referential sentence L: must be assigned probability 1, but is not true and not provable.

comment by SMK (Sylvester Kollin) · 2024-03-05T12:47:46.862Z · LW(p) · GW(p)

Could you perhaps say something about what a Kripkean semantics would look like for your logic?

comment by dranorter · 2023-12-10T22:14:49.339Z · LW(p) · GW(p)

I'm interested in what happens if individual agents A, B, C merely have a probability of cooperating given that their threshold is satisfied. So, consider the following assumptions.

$⊢ □_{b} (□_{e} (□_{w} E) \to (□_{w} E)) \to □_{y} B$
$⊢ □_{c} (□_{e} (□_{w} E) \to (□_{w} E)) \to □_{z} C$
$⊢ □_{x} A \land □_{y} B \land □_{z} C \to □_{w} E$

The last assumption being simply that $w$ is low enough. Given these assumptions, we have $⊢ □_{w} E$ via the same proof as in the post.

So for example if $x, y, z$ are all greater than two thirds, there can be some nonzero $w$ such that the agents will cooperate with probability $w$ . In a sense this is not a great outcome, since viable $w$ might be quite small; but it's surprising to get any cooperation in this circumstance.

Replies from: abramdemski

↑ comment by abramdemski · 2023-12-10T22:22:29.865Z · LW(p) · GW(p)

The interesting thing about this -- beyond showing that going probabilistic allows the handshake to work with somewhat unreliable bots -- is that proving rather than $⊢ E$ is a lot different. With $⊢ E$ , we're like "And so Peano arithmetic (or whatever) proves they cooperate! We think Peano arithmetic is accurate about such matters, so, they actually cooperate."

With the conclusion $⊢ □_{w} E$ we're more like "So if the agent's probability estimates are any good, we should also expect them to cooperate" or something like that. The connection to them actually cooperating is looser.