Organisation for Program Equilibrium reading group

post by Smaug123 · 2024-07-25T19:11:02.332Z · LW · GW · 14 comments

Contents

14 comments

2024-08-18: Second meeting on Friday 2024-08-16 at 7pm in Notes, Trafalgar Square. At the first meeting, we went over the basic model theory and proof theory that's required to understand the statement of Löb's theorem, and stepped through the proof; but we haven't got close to the meat of what I want to understand yet. (More detailed notes from the first meeting. [LW(p) · GW(p)])

2024-08-14: First meeting on Friday 2024-08-16 at 7pm in Notes, Trafalgar Square. [Details [LW(p) · GW(p)]].

2024-07-25: I'm organising a little reading group, undecided between in-person in London or remote. My own questions I want to get the answers to are in https://www.patrickstevens.co.uk/posts/2024-07-25-lob-theorem/ . Writing this here so we have a place to start coordinating how we do this. I'm away on holiday until next Sunday so won't be doing anything about the organisation immediately (will just be trying to understand the material myself in the meantime).

Kickstarter status: about a dozen people have expressed interest, in response to my open call for interest in London.

Prerequisites I'll be expecting of people coming in:
* Be fluent and familiar with Peano arithmetic
* Grok the difference between provability and truth
* Have been at one point capable of precisely stating Gödel's incompleteness theorems, even if you no longer could state them correctly now.
* Have at least read [the paper](https://intelligence.org/files/ProgramEquilibrium.pdf) even if you haven't understood it at all; know the crucial property of an ideal agent that it's trying to construct
* Have spent a little time musing on the nature of Löb's theorem (for which https://www.lesswrong.com/posts/ALCnqX6Xx8bpFMZq3/the-cartoon-guide-to-loeb-s-theorem [LW · GW] may be helpful, although personally I found Wikipedia's proof much easier to follow); perhaps still have your brain twisted into pretzels trying to understand what it actually means.

I might end up doing a single session beforehand covering some of the prerequisites, if people are interested in that; comment, or find and upvote an existing comment, if so.

14 comments

Comments sorted by top scores.

comment by Smaug123 · 2024-08-18T19:18:34.356Z · LW(p) · GW(p)

Date and time of second meeting

Friday 23rd August, 7pm. Location: Notes, St Martin's Lane (Trafalgar Square), open til 9pm. I'll have a crew cut and will be wearing a Doge t-shirt.

Agenda: I'd like to understand Smullyan's paper about Löbian reasoning on the island where people either always tell the truth or always lie. See the previous session's notes [LW(p) · GW(p)] for what we've done so far (I've summarised with a lot of jargon; I don't expect attendees to be fluent with jargon, so don't be too daunted by the summary.)

Bring writing implements and paper if you possibly can, and something to read the Internet with.

@Adam Newgas [LW · GW] @LukasM [LW · GW] @kapslock [LW · GW] @philh [LW · GW]

comment by the gears to ascension (lahwran) · 2024-07-25T19:22:52.693Z · LW(p) · GW(p)

Interested. Not in london.

comment by Adam Newgas (BorisTheBrave) · 2024-08-07T11:30:25.044Z · LW(p) · GW(p)

Interested, but I can't make the proposed dates this week.

comment by LukasM · 2024-07-27T09:49:36.906Z · LW(p) · GW(p)

Hey, I would love to go, but only if it's in-person.

My personal experience is that trying to understand something in a group works a lot better in-person, plus I'd like to get to know more people in London.

comment by Smaug123 · 2024-08-30T18:51:50.208Z · LW(p) · GW(p)

Notes from second meeting

Answer to exercise "What's the sentence you get from the diagonal lemma, L iff 'L has length less-than-or-equal-to 1000'?": len("replace('len(x) <= 1000", 'x' -> "replace('len(x) <= 1000', 'x' -> x)")") <= 1000. I think.

We went through the Smullyan paper and got up to the moment before "The Stability Predicament". The point of the paper turns out to be "if we interpret the box modality as 'belief', then a bunch of English sentences sound weird". The real moral of the paper is that it's awfully hard to be a knight or a knave on a knights-and-knaves island! (Making any statement about my beliefs involves extrapolating my entire future mind-state, for example, so only an extremely powerful knight or knave can make such a statement.) A bunch of the paper only requires that the reasoner believe the island to be a knight-knave island, but the reasoner is thereby making an extremely strong assumption.

Brief aside where I got confused about whether a symmetry argument meant we could never distinguish between knights and knaves: of course we can, because a knight can prove knighthood by saying "0 = 0".

After some thought, we eventually decided that Smullyan's "you will never believe I'm a knight" is equally well served by "you will eventually believe I'm a knave", although this wasn't obvious. The former is a statement about all of time; the latter is a statement only about an instant in time. The reasoner is assumed to never believe a false statement, though, so in fact once they believe something, it must be true.

General consensus that the notation of "type 1/2/3/4" is deeply annoying. I consistently refer to Type 1 as "deductively closed", and then types 2/3/4 are precisely adding one each of the Hilbert-Bernays properties.

Question about what the semantics are, if "belief" (= proof) is the syntax. Answer which I was mostly satisfied with at the time: we've got a sequence of statements from numbered knight-or-knaves, and then we freeze the world and get to reason about which speaker has what property, eventually assigning them a boolean "is-knight" label.

Identification of typo in paper: p348 "if there is a proposition q such that he believes q~(Bq implies q), then he will believe p"; that should read "… q~(Bq implies p)…".

Considerable confusion about why the self-fulfilling belief required the presence of a knight-or-knave at all, to say "If you ever believe I'm a knight, then the cure will work". Why couldn't the reasoner simply imagine this sentence to start off with? Answer: actually a later section makes explicit the condition under which the reasoner can imagine this sentence and hence perform all the reasoning of this sentence without a knight-or-knave: this is the "reflexive" property, that for every p there is q such that the reasoner believes q = (Bq implies p). This is of course true of PA, but that's a heavy-duty fact that requires the diagonal lemma or an extremely inspired quine.

We stopped before section 5 due to lack of time, but I at least had a lot of fun, and our waiter said it sounded very interesting!

Homework:

  • We weren't instantly convinced about the Remarks after "Reflexive Reasoners". It was not at all obvious to me that statements 2, 3, or 4 in fact would have kicked off the Löb reasoning (that is, that they are suitable as Löb sentences). Prove that they do in fact work!
  • What is the status of all of this stuff if you reject excluded middle? A whole lot of stuff concludes "not A, or not-not-A", and deduces "not A, or A". Does Löb's theorem even hold without LEM, for example? (Not at all obvious to me that it should! How could I magic up a member of A given only an interpretation function SyntaxTree(A) -> A?)

Interesting tidbits:

  • PA (if consistent) can never prove any statement to be unprovable! Indeed, if PA proves "provable(A) -> False", then by explosion, PA proves "provable(A) -> A", and hence by Löb PA proves A. Hence by one of the Hilbert-Bernays rules, PA proves provable(A), and hence PA proves False by modus ponens. So, for example, PA can't even prove "not(provable(0=1))"! Though it can prove "provable(not(0=1))". Which is very weird, and I still feel like we must have made some mistake!
comment by Smaug123 · 2024-08-18T19:13:56.609Z · LW(p) · GW(p)

Notes from first meeting

Summary: we haven't actually got to the meat of Löb's theorem yet.

We went over:

  • first-order logic
  • enough model theory to observe that "truth" is a property of a sentence applied to a model, whereas "provability" is a property of a sentence applied to a theory
  • mentioned Gödel's completeness theorem (that "provable" = "true in all models")
  • mentioned the Löwenheim-Skolem theorems which guarantee that there are nonstandard models of Peano arithmetic (in fact, nonstandard models of arbitrarily large cardinality), and handwaved what the smallest [countable] nonstandard model looks like (but my memory is very poor on this, and I might have got that model wrong)
  • observed that Gödel's [first] incompleteness theorem can be sloppily stated as "there is a sentence S which is true but not provable" (or more precisely, there are models of Peano arithmetic in which S evaluates to false, but in the standard model \mathbb{N} S evaluates to true)
  • observed the diagonal lemma, and disagreed on whether it's obvious; I believe it's extremely nonobvious and indeed mind-bending and it took me several hours to reliably be able to write a quine, but perhaps I simply haven't read Gödel, Escher, Bach recently enough.

Assuming the existence of an "is-provable" operation which can be coded up with primitive-recursive functions on the naturals and which satisfies three particular axiom [schema]s (the Hilbert-Bernays provability conditions, though I didn't give them that name), we observed that the proof Yudkowsky gives in the Cartoon Guide is in fact correct, but we haven't yet talked about what it means. (I noted that I dislike the cartoon guide, because it's so… big.)

I also did a bit of fanboying about the Curry-Howard correspondence.

Interesting tidbits:

  • I claimed that "there's only one axiom [schema] of first-order logic which makes the False symbol special, and that's the Law of Excluded Middle". Without LEM, no axiom schema mentions the symbol False, so the principle of explosion ("False implies everything") follows a rather curious path: induction on proofs. If you've got a proof whose final line is False, then you could perform exactly the same proof but instead replace all instances of False with some fixed but arbitrary A, and then at the end rather than proving False you've produced a proof of A. Once we add LEM, this reasoning no longer works, because it's no longer valid to substitute the occurrences of the symbol False within usages of LEM; so we need some other reason for explosion to be true. It follows from LEM thus: from False, deduce (A -> False) -> False, by "true things are implied by everything" (strictly, this is from the P -> (Q -> P) axiom schema). Then by LEM and modus ponens, conclude A.
  • Homework: Yudkowsky explicitly stated the Löb sentence, a specific sentence L such that PA proves "L iff (Provable(L) implies P)". This sentence is what you get when you run the usual proof of a much more general theorem, the Diagonal Lemma, on the syntactic manipulation "turn L into the statement 'Provable(L) implies P'". Exercise: perform the diagonal lemma ("quining") construction on the syntactic manipulation "turn L into the statement 'len(L) <= 1000'". Note that the number 1000 is very long - more than 1000 characters long, it's SSSSS…SSS0 - so we expect this construction to result in a false L (by contrast with the use in Löb's theorem, where it results in an L that is true under the Löb hypothesis). I haven't yet done this, and it might be rather hard, I don't know!
comment by Smaug123 · 2024-08-14T19:47:22.764Z · LW(p) · GW(p)

OK, I'm finally both not on holiday and not on the support rota at work.

Date and time of first meeting

Friday 16th August, 7pm. Location: Notes, St Martin's Lane (Trafalgar Square), should be open til 9pm. I'll have a crew cut and will be wearing a Doge t-shirt. If it all goes wrong, in an emergency we can go and have nibbles at Prezzo instead.

Agenda: don't know how far we'll get, but I ambitiously hope we can fully grok Löb's theorem itself by the end of this. You should first refresh your knowledge of first-order logic and Peano arithmetic if you need to.

An ideal venue would have whiteboards/blackboards, but I can't rustle one of those up completely trivially. (Sorry to the remote people and people who couldn't do Fridays.)

Bring writing implements and paper if you possibly can, and something to read the Internet with.

Once we're there we can make a WhatsApp group. If you're in the London Rationalish WhatsApp group, you already have my number in the meantime.

Replies from: philh, kapslock
comment by philh · 2024-08-15T07:54:17.479Z · LW(p) · GW(p)

Ah, I can sometimes make fridays but not tomorrow. Hope it goes well.

comment by kapslock · 2024-08-17T13:56:30.697Z · LW(p) · GW(p)

I didn't check my notifications frequently enough to see this. How far did you get into the theorem, and will there be another session?

Replies from: Smaug123
comment by Smaug123 · 2024-08-18T18:53:56.792Z · LW(p) · GW(p)

I am just about to put up my notes from the first session and to announce the next one! I'm intending Friday 23rd, again at Notes Trafalgar Square, 7pm.

I organised the first one at too short notice, so there were just two of us there, but there are several more people with "yes but give me more notice" status. Since there were only two of us, we went over groundwork in advance of the theorem itself. Still plenty of time for people to come in at the beginning!

comment by Smaug123 · 2024-08-04T19:16:58.011Z · LW(p) · GW(p)

See https://doodle.com/meeting/participate/id/epXrAqme to vote for times and places! Hack: I'm using "midday" to mean "7-9pm, but remote rather than in person". I intend meetings to be weekly until we (I) run out of joy or things to talk about. Up front I'm leaning fairly strongly towards in-person (e.g. because I expect a bunch of shared whiteboarding stuff that I've always found a bit painful over e.g. Jamboard).

@the gears to ascension [LW · GW] @LukasM [LW · GW

Replies from: lahwran
comment by the gears to ascension (lahwran) · 2024-08-06T03:26:01.855Z · LW(p) · GW(p)

Well, I'm not going to be in london any time soon.

comment by Smaug123 · 2024-07-25T19:30:23.516Z · LW(p) · GW(p)

Post for people to register time zone preference. No guarantees that I won't end up just saying "urgh organisational tsouris is raising the activation energy too high, so this group is happening in-person in London after all", but if we do end up remote, I'll consider time zones in this thread!

I vote for a time compatible with the evening of London UK, or a time compatible with the afternoon on Sundays. I declare the cadence to be weekly by default.

Replies from: lahwran
comment by the gears to ascension (lahwran) · 2024-07-26T08:47:46.259Z · LW(p) · GW(p)

I suggest any time during the intersection of london and eastern time, ie london evening. (I'm not in either, but that intersection is a reliable win for organizing small international online social events, in my experience)