And My Axiom! Insights from 'Computability and Logic'

post by TurnTrout · 2019-01-16T19:48:47.388Z · LW · GW · 17 comments

Contents

  Foreword
  Computability and Logic
    1 Enumerability
    2 Diagonalization
    3 Turing Computability
    4 Uncomputability
    5 Abacus Computability
    6 Recursive Functions
    7 Recursive Sets and Relations
    8 Equivalent Definitions of Computability
    9 A Précis of First-Order Logic: Syntax
    10 A Précis of First-Order Logic: Semantics
    11 The Undecidability of First-Order Logic
    12 Models
    13 The Existence of Models
    14 Proofs and Completeness
    15 Arithmetization
    16 Representability of Recursive Functions
    17 Indefinability, Undecidability, Incompleteness
    18 The Unprovability Of Consistency
    19 Normal Forms
    20-27 [Skipped]
    Final Thoughts
  Forwards

Foreword

Max Tegmark's Our Mathematical Universe briefly touches on a captivating, beautiful mystery:

The arrows indicate the close relations between mathematical structures, formal systems and computations. The question mark suggests that these are all aspects of the same transcendent structure, whose nature we still haven't fully understood.

The profound results compiled by the Computability and Logic textbook may be the first step towards the answer.

Computability and Logic

If this sentence is true, then Santa Claus is real.

As usual, I'll explain confusions I had and generally share observations. This book is on the MIRI reading list.

1 Enumerability

Coming back to this book, I'm amazed by some of my margin scribbles – expressions of wonderment and awe at what now strike me as little more than obvious facts ("relations are sets of ordered pairs!").

2 Diagonalization

Exercise 2.13 (Richard's paradox) What (if anything) is wrong with the following argument?

The set of all finite strings of symbols from the alphabet, including the space, capital letters, and punctuation marks, is enumerable; and for definiteness let us use the specific enumeration of finite strings based on prime decomposition. Some strings amount to definitions in English of sets of positive integers and others do not; strike out the ones that do not, and we are left with an enumeration of all definitions in English of sets of positive integers, or, replacing each definition by the set it defines, an enumeration of all sets of positive integers that have definitions in English. Since some sets have more than one definition, there will be redundancies. Strike them out to obtain an irredundant enumeration of all sets of positive integers that have definitions in English. Now consider the set of positive integers defined by the condition that a positive integer n belongs to the set if and only if n does not belong to the nth set in the irredundant enumeration just described.
This set does not appear in that enumeration, so it cannot have a definition in English. Yet it does, and in fact we have just given such a definition.

My first time reading this exercise, I had plenty of objections. "Isn't it abusive to use English in place of a formal system? How do we even quantify the expressiveness of English; is that a thing?", and so on. Yet, returning with more context and experience, a few minutes' thought revealed the crux to me: information and enumerability aren't just defined by what is present, but by what's not.

Let's take a little detour. Consider the regular expression 1*; its language is infinite, and certainly computable. We don't even need a Turing machine to recognize it; a strictly less-expressive finite-state automaton would do just fine. And yet there are infinite subsets of this language which are not at all computable.

Consider some reasonable encoding ⟨M, x⟩ of a Turing machine M and input x. As we see later, we can enumerate all possible Turing machines and inputs (given that we first fix the alphabet, etc.). This means that we can number the encodings. Consider the halting set: the set of ⟨M, x⟩ such that M halts on input x. Expressed in unary, the numbers of the encodings belonging to this set form a strict subset of the regular language 1*, and yet this subset is not computable: the halting set is not negatively recursively semi-decidable (we can't confirm that a computation won't halt), so no decision procedure for it exists.
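The contrast is stark: membership in the full language 1* is trivially checkable, while membership in the halting subset is not. A one-line Python sketch suffices for the former:

```python
def in_regular_language(s):
    # Recognizing 1* takes a single-state automaton: accept iff every
    # symbol is "1". No Turing-machine power needed. By contrast, no
    # program at all decides whether a given string of 1s is the unary
    # number of a halting ⟨M, x⟩ encoding.
    return all(ch == "1" for ch in s)
```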

Do you see the problem now?

Here, we should be careful with how we interpret "information". After all, coNP-complete problems are trivially Cook reducible to their NP-complete counterparts (e.g., query the oracle and then negate the output), but many believe that there isn't a corresponding Karp reduction (where we do a polynomial amount of computation before querying the oracle and returning its answer). Since we aren't considering complexity but instead whether it's computable at all, complementation is fine.
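The Cook-reduction trick mentioned above is almost nothing at all, which is the point. A sketch, with the oracle modeled as an ordinary Python predicate:

```python
def complement_via_cook(oracle, x):
    # One oracle query, then negate the answer: a Cook (Turing)
    # reduction from a set's complement to the set itself. A Karp
    # (many-one) reduction must return the oracle's answer unchanged,
    # which is why complementation is free for Cook reductions but
    # (conjecturally) not for Karp reductions between NP and coNP.
    return not oracle(x)

# Toy oracle for "x is positive"; the reduction decides "x is not positive".
positive = lambda x: x > 0
```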

3 Turing Computability

Turing's thesis is that any effectively computable function is Turing computable, so that computation in the precise technical sense we have been developing coincides with effective computability in the intuitive sense.

On several occasions, the authors emphasize how the intuitive nature of "effective computability" renders futile any attempt to formalize the thesis. However, I'm rather interested in formalizing intuitive concepts and therefore wondered why this hasn't been attempted. Indeed, it seems that a recent thesis by Vinogradova develops a category-theoretic formalization of the notion of abstract computability (although since I don't yet know category theory, I can't tell how closely related it is).

4 Uncomputability

5 Abacus Computability

6 Recursive Functions

The part where you say “FLooPs” and give up on Turing-complete primitive recursion when the theorems don’t support it.
Preface, The Sequents

Nate wrote:

-Zero: The function that always returns 0.
-Successor: The function that, given n, returns n + 1.
-Identity: Functions that return something they are given.
-Composition: The function that performs function composition.
These four functions are called the "primitive recursive" functions, and some time is spent exploring what they can do. If we allow use of a fifth function:
-Minimization: Given f, finds the arguments for which f returns 0.
we get the "recursive" functions.
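The first four pieces of Nate's list can be sketched loosely in Python (loosely, because the formal definition of the primitive recursive functions also includes the primitive recursion scheme itself, which the list elides):

```python
def zero(*args):
    # Zero: ignores its arguments and returns 0.
    return 0

def succ(n):
    # Successor: n -> n + 1.
    return n + 1

def proj(i):
    # Identity/projection: returns the i-th of the given arguments.
    return lambda *args: args[i]

def compose(f, *gs):
    # Composition: feed the outputs of g1, ..., gk into f.
    return lambda *args: f(*(g(*args) for g in gs))

# x + 2, built only from the pieces above.
add_two = compose(succ, compose(succ, proj(0)))
```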

However, the book defines minimization like so: Mn[f](x₁, …, xₙ) is the smallest y such that f(x₁, …, xₙ, y) = 0 and f(x₁, …, xₙ, t) is defined and nonzero for all t < y; if no such y exists, Mn[f](x₁, …, xₙ) is undefined.

This confused me for days, and I didn't truly understand it until I came back several months later (i.e., now). How in the world is it effectively computable if it isn't even defined on all inputs?

Suppose I challenge you to give me a function that, given a number n, returns a larger number. The catch is that you aren't allowed to directly modify n – you can only use n to check whether your candidate solution is bigger. If you just use the bounded search provided by primitive recursion, for some valid inputs your function will be wrong. If you have to start from scratch, there's no finite number of exponentiations or tetrations or super-duper-tetrations that you can include which will work for all inputs. You have to be able to do unbounded search – for example, taking the successor until you get a number larger than n.

Depending on the domain, this isn't always total, either. If the domain contains an element larger than anything you can reach by finitely many increments, you'll increment forever, and your function won't be defined on that input. The important part is that we've given an effective procedure for finding the solution whenever one exists.
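The unbounded search at the heart of minimization is a one-liner in any Turing-complete language. A minimal sketch of the mu operator for one-argument functions:

```python
def minimize(f):
    # The mu operator: search y = 0, 1, 2, ... for the least y with
    # f(y) == 0. If no such y exists, the loop never terminates,
    # which is exactly how minimization produces partial functions.
    def mu():
        y = 0
        while f(y) != 0:
            y += 1
        return y
    return mu

# f hits zero first at y = 3, so the unbounded search halts there.
first_root = minimize(lambda y: max(0, 3 - y))()
```

Calling minimize(lambda y: y + 1)() would loop forever, mirroring the undefinedness in the book's definition.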

7 Recursive Sets and Relations

8 Equivalent Definitions of Computability

Coming to appreciate this bridge between math and computer science was one of my most profound experiences last year. My mind's eye began viewing the world differently. Goings-on came to be characterized not just as interactions of atomic "objects", but as the outgrowth of the evolution of some mathematical structure. As a vast and complex play of sorts, distorted by my mind and approximated in specific ways – some legible, others not.

Most of all, a tugging in the back of my mind intensified, continually reminding me just how ignorant I am about the nature of our world.

9 A Précis of First-Order Logic: Syntax

10 A Précis of First-Order Logic: Semantics

In fact, if you take Euclid's first four postulates, there are many possible interpretations in which "straight line" takes on a multitude of meanings. This ability to disconnect the intended interpretation from the available interpretations is the bedrock of model theory. Model theory is the study of all interpretations of a theory, not just the ones that the original author intended.
Of course, model theory isn't really about finding surprising new interpretations — it's much more general than that. It's about exploring the breadth of interpretations that a given theory makes available. It's about discerning properties that hold in all possible interpretations of a theory. It's about discovering how well (or poorly) a given theory constrains its interpretations. It's a toolset used to discuss interpretations in general.
At its core, model theory is the study of what a mathematical theory actually says, when you strip the intent from the symbols.

I can't emphasize enough how helpful Nate's Mental Context for Model Theory [LW · GW] was; the mental motions behind model theory are a major factor in my excitement for studying more logic.

11 The Undecidability of First-Order Logic

12 Models

Coming out of linear algebra with an "isomorphism ?= bijection" confusion, the treatment in this chapter laid the conceptual foundation for my later understanding of homomorphisms. That is, a key part of the "meaning" of mathematical objects lies not just in their number, but in how they relate to one another.

This chapter is also great for disassociating baggage we might naïvely assign to words on the page, underlining the role of syntax as pointers to mathematical objects.

13 The Existence of Models

14 Proofs and Completeness

15 Arithmetization

16 Representability of Recursive Functions

I confess that upon wading back into the thicket of logical notation and terminology, I found myself lost. I was frustrated at how quickly I'd apparently forgotten everything I'd worked so hard for. After simmering down, I went back through a couple chapters, and found myself rather bored by how easy it was. I hadn't forgotten much at all – not beyond the details, anyways. I don't know whether that counts as "truly a part of me", but I don't think it's reasonable to expect myself to memorize everything, especially on the first go.

17 Indefinability, Undecidability, Incompleteness

Perhaps the most important implication of the [first] incompleteness theorem is what it says about the notions of truth (in the standard interpretation) and provability (in a formal system): that they are in no sense the same.

Indeed, the notion of provability can be subtly different from our mental processes for judging the truth of a proposition; within the confines of a formal system, provability doesn't just tell us about the proposition in question, but also about the characteristics of that system. This must be kept in mind.

When Gödel’s theorem first appeared, with its more general conclusion that a mathematical system may contain certain propositions that are undecidable within that system, it seems to have been a great psychological blow to logicians, who saw it at first as a devastating obstacle to what they were trying to achieve. Yet a moment’s thought shows us that many quite simple questions are undecidable by deductive logic. There are situations in which one can prove that a certain property must exist in a finite set, even though it is impossible to exhibit any member of the set that has that property. For example, two persons are the sole witnesses to an event; they give opposite testimony about it and then both die. Then we know that one of them was lying, but it is impossible to determine which one.

In this example, the undecidability is not an inherent property of the proposition or the event; it signifies only the incompleteness of our own information. But this is equally true of abstract mathematical systems; when a proposition is undecidable in such a system, that means only that its axioms do not provide enough information to decide it. But new axioms... might supply the missing information and make the proposition decidable after all.

In the future, as science becomes more and more oriented to thinking in terms of information content, Gödel’s results will be seen as more of a platitude than a paradox. Indeed, from our viewpoint "undecidability" merely signifies that a problem is one that calls for inference rather than deduction. Probability theory as extended logic is designed specifically for such problems.
~ E.T. Jaynes, Probability Theory

18 The Unprovability Of Consistency

19 Normal Forms

20-27 [Skipped]

I found these advanced topics rather boring; the most important was likely provability logic, but I intend to study that separately in the future anyways.

Final Thoughts

I really liked this book. In the chapters I completed, I did all of the exercises – they seemed to be of appropriate difficulty, and I generally didn't require help.

I've already completed Understanding Machine Learning, the first five chapters of Probability Theory, and much of two books on computational complexity. I'm working through a rather hefty abstract algebra tome and intend to go through two calculus books before an ordinary differential equations text. The latter material is more recreational, as I intend to start learning physics. This project will probably be much slower, but I'm really looking forward to it.

Forwards

Good mathematicians see analogies between theorems; great mathematicians see analogies between analogies.
~ Banach

I don't think I'm a great mathematician yet by any means, but as my studies continue, I can't shake a growing sense of structure. I'm trying to broaden my toolbox as much as possible, studying areas of math which had seemed distant and unrelated. Yet the more I learn, the more my mental buckets collapse and merge.

And to think that I had once suspected the Void of melodrama.

If for many years you practice the techniques and submit yourself to strict constraints, it may be that you will glimpse the center. Then you will see how all techniques are one technique, and you will move correctly without feeling constrained. Musashi wrote: “When you appreciate the power of nature, knowing the rhythm of any situation, you will be able to hit the enemy naturally and strike naturally. All this is the Way of the Void.”

If you are interested in working with me or others to learn MIRI-relevant math, if you have a burning desire to knock the alignment problem down a peg - I would be more than happy to work with you. Message me [LW · GW] for an invitation to the MIRIx Discord server.

17 comments

Comments sorted by top scores.

comment by AlexMennen · 2019-01-17T02:11:03.435Z · LW(p) · GW(p)
On several occasions, the authors emphasize how the intuitive nature of "effective computability" renders futile any attempt to formalize the thesis. However, I'm rather interested in formalizing intuitive concepts and therefore wondered why this hasn't been attempted.

Formalizing the intuitive notion of effective computability was exactly what Turing was trying to do when he introduced Turing machines, and Turing's thesis claims that his attempt was successful. If you come up with a new formalization of effective computability and prove it equivalent to Turing computability, then in order to use this as a proof of Turing's thesis, you would need to argue that your new formalization is correct. But such an argument would inevitably be informal, since it links a formal concept to an informal concept, and there already have been informal arguments for Turing's thesis, so I don't think there is anything really fundamental to be gained from this.

Replies from: rohinmshah
comment by Rohin Shah (rohinmshah) · 2019-01-17T08:53:54.050Z · LW(p) · GW(p)

Another way of putting it: you can't possibly know that there isn't some device out in the universe that lets you do more powerful things than your model (eg. a device that can tell you whether an arbitrary Turing machine halts), so it can never be proven that your model captures real-world computability.

Replies from: Gurkenglas
comment by Gurkenglas · 2019-01-18T13:07:21.291Z · LW(p) · GW(p)

Be careful stating what physics can't prove [LW · GW].

Replies from: rohinmshah
comment by Rohin Shah (rohinmshah) · 2019-01-18T20:43:01.854Z · LW(p) · GW(p)

Fwiw, I disagree with the frame of that post as well.

I'm happy to agree that you can prove that your model captures real-world computability under a particular formalization of physics.

comment by AlexMennen · 2019-01-17T01:57:56.471Z · LW(p) · GW(p)

Consider the halting set; ... is not enumerable / computable.
...
Here, we should be careful with how we interpret "information". After all, coNP-complete problems are trivially Cook reducible to their NP-complete counterparts (e.g., query the oracle and then negate the output), but many believe that there isn't a corresponding Karp reduction (where we do a polynomial amount of computation before querying the oracle and returning its answer). Since we aren't considering complexity but instead whether it's enumerable at all, complementation is fine.

You're using the word "enumerable" in a nonstandard way here, which might indicate that you've missed something (and if not, then perhaps at least this will be useful for someone else reading this). "Enumerable" is not usually used as a synonym for computable. A set is computable if there is a program that determines whether or not its input is in the set. But a set is enumerable if there is a program that halts if its input is in the set, and does not halt otherwise. Every computable set is enumerable (since you can just use the output of the computation to decide whether or not to halt). But the halting set is an example of a set that is enumerable but not computable (it is enumerable because you can just run the program coded by your input, and halt if/when it halts). Enumerable sets are not closed under complementation; in fact, an enumerable set whose complement is enumerable is computable (because you can run the programs for the set and its complement in parallel on the same input; eventually one of them will halt, which will tell you whether or not the input is in the set).
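The parallel-run argument in the last sentence can be made concrete. A sketch in which each semi-decider is modeled as a Python generator that yields forever unless it "halts" by finishing:

```python
def decide(semi_in, semi_out, x):
    # Interleave one step of each semi-decider. semi_in(x) runs
    # forever unless x is in the set; semi_out(x) runs forever unless
    # x is in the complement. Exactly one of them eventually stops,
    # so this interleaving is a total decision procedure.
    a, b = semi_in(x), semi_out(x)
    while True:
        try:
            next(a)          # one step of the "x is in S" search
        except StopIteration:
            return True      # semi_in halted: x is in the set
        try:
            next(b)          # one step of the "x is not in S" search
        except StopIteration:
            return False     # semi_out halted: x is not in the set

# Toy semi-deciders for the even numbers and their complement.
def semi_even(x):
    while x % 2 != 0:
        yield  # loops forever on odd inputs

def semi_odd(x):
    while x % 2 == 0:
        yield  # loops forever on even inputs
```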

The distinction between Cook and Karp reductions remains meaningful when "polynomial-time" is replaced by "Turing computable" in the definitions. Any set that an enumerable set is Turing-Karp reducible to is also enumerable, but an enumerable set is Turing-Cook reducible to its complement.

The reason "enumerable" is used for this concept is that a set is enumerable iff there is a program computing a sequence that enumerates every element of the set. Given a program that halts on exactly the elements of a given set, you can construct an enumeration of the set by running your program on every input in parallel, and adding an element to the end of your sequence whenever the program halts on that input. Conversely, given an enumeration of a set, you can construct a program that halts on elements of the set by going through the sequence and halting whenever you find your input.
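The run-everything-in-parallel construction here is the classic dovetailing trick. A sketch, with the semi-decider abstracted as an assumed step-bounded simulator halts_within(n, steps) (a hypothetical interface, standing in for "run program n for that many steps"):

```python
def enumerate_set(halts_within, rounds):
    # Dovetailing: in round b, run the semi-decider on inputs 0..b for
    # a budget of b steps each; emit an input the first time its run
    # halts within the budget. With rounds unbounded, every element of
    # the set eventually appears exactly once.
    seen, sequence = set(), []
    for b in range(rounds):
        for n in range(b + 1):
            if n not in seen and halts_within(n, b):
                seen.add(n)
                sequence.append(n)
    return sequence

# Toy simulator: the "program" halts on even n, after n steps.
halts = lambda n, steps: n % 2 == 0 and steps >= n
```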

Replies from: TurnTrout, TruePath
comment by TurnTrout · 2019-01-17T02:27:34.290Z · LW(p) · GW(p)

Thanks, my terminology was a little loose. What I was trying to hint at is that some of the paradox's culling operations require uncomputable tests of English sentences, and that the regularity of the original language doesn't determine the status of its subsets.

Replies from: TruePath
comment by TruePath · 2019-01-17T17:48:42.842Z · LW(p) · GW(p)

But that's not what the puzzle is about. There is nothing about computability in it. It is supposed to be a paradox along the lines of Russell's set of all sets that don't contain themselves.

The response about formalizing exactly what counts as a set defined by an English sentence is exactly correct.

Replies from: TurnTrout
comment by TurnTrout · 2019-01-17T20:03:50.458Z · LW(p) · GW(p)

But there are more objections; even if "computability" isn't explicitly mentioned in the problem, it's still present. Are the sets "the singleton set containing 1 if and only if machine M halts on input x" and "the singleton set containing 1" the same? Even if we grant a procedure for figuring out what counts as a set, we can't even compute which sentences are duplicates.

Replies from: Gurkenglas
comment by Gurkenglas · 2019-01-18T13:01:40.126Z · LW(p) · GW(p)

That still doesn't make computability relevant until one introduces it deliberately. Compare to weaker notions than computability, like computability in polynomial time. Computability theory also complains the same once we have explicitly made definability subjective, and should have no more logical problems.

Replies from: TurnTrout
comment by TurnTrout · 2019-01-18T16:43:49.294Z · LW(p) · GW(p)

I don’t think I understand this line of objection; would you be willing to expand?

Replies from: Gurkenglas
comment by Gurkenglas · 2019-01-18T21:38:27.878Z · LW(p) · GW(p)

Saying that the problem is about computability because there is no computable solution proves too much: I could reply that it is about complexity theory because there is no polynomial-time solution. (In fact, there is no solution.)

We can build something like a solution by specifying that descriptions must be written in some formal language that cannot describe its own set of describables, then use a more powerful formal language to talk about that previous language's set. For powerful enough languages, that's still not computable, though, so computability theory wouldn't notice such a solution, which speaks against looking at this through the lens of computability theory.

Replies from: TurnTrout
comment by TurnTrout · 2019-01-19T00:33:49.163Z · LW(p) · GW(p)

Sure, but how do we get the final set, then? The paradox addresses the reader in the imperative, implying one can follow along with some effective procedure to trim down the set. Yet if Turing’s thesis is to be believed, there is no such procedure, no final set, and therefore no paradox.

Replies from: TruePath
comment by TruePath · 2019-01-22T06:51:22.207Z · LW(p) · GW(p)

Computability is just Δ⁰₁ definability. There are plenty of other notions of definability you could try to cash out this paradox in terms of. Why pick Δ⁰₁ definability?

If the argument worked in any particular definability notion (e.g., arithmetic definability) it would be a problem. Thus, the solution needs to explain why, for any concrete notion of definable set, the argument doesn't go through.

Replies from: TurnTrout
comment by TurnTrout · 2019-01-22T15:33:54.781Z · LW(p) · GW(p)

Turing’s thesis applies only to this notion of definability, right?

comment by TruePath · 2019-01-17T17:27:31.160Z · LW(p) · GW(p)

Yah, enumerable means something different than computably enumerable.

comment by philip_b (crabman) · 2019-05-10T17:29:22.982Z · LW(p) · GW(p)

It seems "Computability and Logic" doesn't include Kleene's recursion theorem and Rice's theorem. What sources would you recommend for learning those theorems, their proofs, and their corollaries? Also, which chapters of "Computability and Logic" are required to understand them?

Replies from: None
comment by [deleted] · 2019-11-18T18:52:43.109Z · LW(p) · GW(p)

Michael Sipser's Introduction to the Theory of Computation goes over the recursion theorem and Rice's theorem, IIRC. The proofs are given as well as associated exercises. The textbook walks you through, from DFAs to Turing Machines, so it's pretty self-contained, if you're looking at a source other than Computability and Logic to understand them.