Translations Should Invert
post by abramdemski · 2023-10-05T17:44:23.262Z · LW · GW · 19 commentsContents
19 comments
Suppose that Alice is arguing for ZFC set theory as the proper foundation of mathematics. Bob is arguing for some specific intuitionistic type theory. Alice uses classical logic, and Bob uses intuitionistic logic; this is a big part of their difference in thinking, but not the whole disagreement, since set theory vs type theory is another part, and then (perhaps most importantly) the specific axioms each endorse.
The best way Bob can try to understand Alice is by translating her classical-logic statements into intuitionistic statements using Godel's "double negation translation", which works by sprinkling double negations throughout a sentence. This is classically equivalent to the original, but intuitionistically distinct. So Bob basically understands Alice as always speaking in the double negative (with some complications).
The best way Alice can try to understand Bob is by a similar translation in the other direction, which I can't find right now, but it involves sprinkling "provability" predicates everywhere. So Alice basically understands Bob to always be talking about provability rather than simple truth (with some complications).
These translations work pretty well for allowing Alice and Bob to follow each other's arguments in isolation. Alice and Bob can find no flaw with the translations as translations; Alice sees that Bob will never miss or add important implications (except by virtue of the other axioms Bob and Alice want to debate), and vice versa. So it might seem like Alice and Bob have found a good way to avoid debating classical logic vs intuitionistic logic, so that they can focus on debating set theory vs type theory.
However, it seems to me like there is a severe problem.
One danger is that the use of translations allows Alice and Bob to put off an important part of their debate, which is the debate between intuitionism and classical logic. But maybe this is OK.
The larger danger is that Alice and Bob cannot agree on what they are debating at any given time. Since their translations aren't inverses of each other, when Bob wants to debate X, Alice translates it into Y, and Bob sees Alice as changing the topic to Z. So from each of their perspectives, the other person is always changing the topic, which feels like "avoiding the problem I'm trying to point to -- refusing to deal with it, always insisting on dealing with some other, somewhat analogous, but more complex thing".
Imagine Alice wants to argue for the axiom of choice, C. Bob translates this to C' by adding double negations, and has some counterargument against C'. Alice translates this to C'' by adding provability predicates, and says, OK, that's not exactly what I meant, but I can modify my argument as follows to show that you are wrong in this case also. Bob translates to C''' and adapts his own argument as needed. They keep going back and forth, never quite agreeing on the topic of debate, each always satisfying themselves that they've responded to the other, but then "somehow" the other person comes up with a "new" concern...
It isn't clear who should change behavior and how. They can both see that each other's translations individually should work fine (so neither one can argue too strongly against the other's translations.) But it does seem plausible to claim that they should settle on translations which invert each other.
The post is based on a real discussion I had, about decision theory. I was making an argument for a claim X in terms of pure actions, and the other person was translating to mixed strategies to argue against. My view is that the other person's argument simply doesn't work; but in order to re-apply my own argument to refute their take, I have to treat the choice of mixed strategy as itself being a "pure action" (ie you select a number from 0 to 1, which you put into a randomizing device, which then selects the 1st action for you with that probability, and otherwise does the 2nd).
But for the other person to analyze my new application of my argument, from their own perspective, they would have to again translate to mixed actions; so now the agent isn't just settling on one number to say, but rather, selects a probability distribution over which numbers to say. (Effectively, we now have mixtures of mixtures.) Having done the translation, they re-state their argument against my claim X.
This could just iterate infinitely without going anywhere, even if both of us think we have an argument that is strong enough that we can always reply to the other person, no matter what they try to do to wriggle out of it.
Neither of us think the choice between modeling agents as deciding pure actions vs mixed is particularly critical, since it is so easy to translate back and forth between the two frames. But nonetheless it seems to be harming our ability to see how our arguments connect with each other.
19 comments
Comments sorted by top scores.
comment by Oscar_Cunningham · 2023-10-06T06:13:55.927Z · LW(p) · GW(p)
Yup, you want a bi-interpretation:
Two models or theories are mutually interpretable, when merely each is interpreted in the other, whereas bi-interpretation requires that the interpretations are invertible in a sense after iteration, so that if one should interpret one model or theory in the other and then re-interpret the first theory inside that, then the resulting model should be definably isomorphic to the original universe
comment by kave · 2023-10-06T04:58:28.351Z · LW(p) · GW(p)
If I imagine having a compiler that translates back-and-forth between intuitionistic and classical logic as in the post, and I want to stop the accumulation of round-trip 'cruft', I think the easiest thing to do would be to add provenance information that let me figure out whether a provability predicate, say, was "original" or "translational". But frustratingly that's not really possible in the case where I'm trying to translate between people with pretty different ontologies (who might not be able to parse their interlocutors statements natively).
I dunno whether you're thinking more about the case of differing ontologies or more about the case of preferred framings (but fluency with both), so not sure how relevant to your inquiries.
Replies from: abramdemski↑ comment by abramdemski · 2023-10-06T17:40:21.384Z · LW(p) · GW(p)
I'm interested in concrete advice about how to resolve this problem in a real argument (which I think you don't quite provide), but I'm also quite interested in the abstract question of how two people with different ontologies can communicate. Normally I think of the problem as one of constructing a third reference frame (a common language) by which they can communicate, but your proposal is also interesting, and escapes the common-language idea.
comment by tailcalled · 2023-10-05T19:39:59.754Z · LW(p) · GW(p)
Hmm, I'm trying to follow your argument, but I can't immediately think of what double-negated axiom of choice would look like.
Like type-theoretically, to me, axiom of choice looks something like:
... where is propositional truncation (which computationally you can think of as allowing functions to be nondeterministic; so axiom of choice states that if you have a nondeterministic function, you can nondeterministically pick a deterministic function). Where would we be putting the double-negation to translate it from classical to constructive? Usually implications/functions are unaffected by DNE translation, so the only option I can think of would be to DNE the propositional truncations, but dunno that this is the appropriate choice.
More generally, part of the beauty in the double-negation translation is that all of classical logic is valid under it. So if Alice makes some classical proof, then Bob can just translate that proof with double negations, and then the translated argument will be valid to Bob IFF it was valid to Alice.
(I think this is less feasible in the other direction? Like provability may be one model of constructive logic, sort of (I think you are more thinking about realizability? which is sort of complicated), but it is not the only one and not a universal one. But from the perspective of classical logic, plopping in double negations makes no difference, so any argument Bob makes is an argument that Alice would still accept.)
Replies from: tailcalled, abramdemski↑ comment by tailcalled · 2023-10-06T22:01:29.782Z · LW(p) · GW(p)
More on axiom of choice: https://math.stackexchange.com/a/4782176/315043
↑ comment by abramdemski · 2023-10-06T18:18:27.290Z · LW(p) · GW(p)
As I mentioned here [LW(p) · GW(p)], if Alice understands your point about the power of the double-negation formulation, she would be applying a different translation of Bob's statements from the one I assumed in the post, so she would be escaping the problem. IE:
part of the beauty in the double-negation translation is that all of classical logic is valid under it.
is basically a reminder to Alice that the translation back from double-negation form is trivial in her own view (since it is classically equivalent), and all of Bob's intuitionistic moves are also classically valid, so she only has to squint at Bob's weird axioms, everything else should be understantable.
But I think things will still get weird if Bob makes meta-logical arguments (ie Bob's arguments are not just proofs in Bob's logic), which seems quite probable.
Replies from: tailcalled↑ comment by tailcalled · 2023-10-06T19:39:16.972Z · LW(p) · GW(p)
But I think things will still get weird if Bob makes meta-logical arguments (ie Bob's arguments are not just proofs in Bob's logic), which seems quite probable.
But in that case Bob's meta-logic might be compatible with Alice's meta-logic. For instance usually I see practical constructive mathematicians work in a neutral or classical meta-logic. If Alice and Bob share a meta-logic, then they can easily talk about the models of the object logic in the shared meta-logic.
Replies from: abramdemski↑ comment by abramdemski · 2023-10-07T18:35:51.788Z · LW(p) · GW(p)
Yeah. For my case, I think it should be assumed that the meta-logics are as different as the object-logics, so that things continue to be confusing.
comment by Ericf · 2023-10-06T15:10:36.297Z · LW(p) · GW(p)
Alice: grass is green -> grass isn't not green Bob: the grass is teal -> the grass is provably teal Alice: your spectrometer is miscalibrated -> your spectrometer isn't not miscalibrated.
...
I'm having trouble with the statement {...and has some argument against C'}. The point of the double negative translation is that any argument against not not A is necessarily an argument against A (even though some arguments against A would not apply to not not A). And the same applies to the other translation - Alice is steelmanning Bob's argument, so there shouldn't be any drift of topic.
Replies from: abramdemski↑ comment by abramdemski · 2023-10-06T17:33:07.368Z · LW(p) · GW(p)
That's an interesting point, but I have a couple of replies.
- First and foremost, any argument against 'not not A' becomes an argument against A if Alice translates back into classical logic in a different way than I've assumed she is. Bob's argument might conclude 'not A' (because even in intuitionistic logic), but Alice thinks of this as a tricky intuitionistic assertion, and so she interprets it indirectly as saying something about proofs. For Alice to notice and understand your point would, I think, be Alice fixing the failure case I'm pointing out.
- Second, an argument against an assumption need not be an argument for its negation, especially for intuitionists/constructivists. (Excluded middle is something they want to argue against, but definitely not something they want to negate, for example.) The nature of Bob's argument against Alice's claim can be quite complex and can occur at meta-levels, rather than occurring directly in the logic. So I guess I'm not clear that things are as simple as you claim, when this happens.
↑ comment by Ericf · 2023-10-06T20:29:36.983Z · LW(p) · GW(p)
Arguing against A doesn't support Not A, but arguing against Not Not A is arguing against A (while still not arguing in favor of Not A) - albeit less strongly than arguing against A directly. No back translation is needed, because arguments are made up of actual facts and logic chains. We abstract it to "not A" but even in pure Mathematics, there is some "thing" that is actually being argued (eg, my grass example).
Arguing at a meta level can be thought of as putting the object level debate on hold and starting a new debate about the rules that do/should govern the object level domain.
↑ comment by tailcalled · 2023-10-06T19:36:46.426Z · LW(p) · GW(p)
Second, an argument against an assumption need not be an argument for its negation, especially for intuitionists/constructivists.
Law of noncontradiction is still constructively valid, and constructive logic only rejects the principle of inferring from , it doesn't reject inferring from .
(Excluded middle is something they want to argue against, but definitely not something they want to negate, for example.)
You don't want to negate it in the sense of accepting , but depending on your variant of constructive math, you might be able to prove something like . This is no more mysterious than how you would not want to be able to prove , as it is equivalent to , even though it is true that . Unbound variables are a mess!
When it comes to constructive mathematics, there are basically two kinds. One is "neutral" constructive math which doesn't add any nonclassical principles; it is a strict generalization of classical math, and so it doesn't allow one to prove things like , but conversely it also means that all neutral constructive statements are valid classical statements.
The other kind of constructive math comes from the fact that neutral constructive math has models that are inherently incompatible with classical logic, e.g. models where all functions are computable, or where all functions are continuous, or where all functions are differentiable. For such models, one might want to add additional axioms to make the logic better capture the features of the model, but this rules out the classical models. In such logics, one can prove because e.g. otherwise the Heaviside step would be a well-defined function, and the Heaviside step is not computable/continuous/differentiable.
comment by cubefox · 2023-10-11T07:39:39.985Z · LW(p) · GW(p)
The way out of such translation problems is usually natural language. In the logic case, both interlocutors presumably speak the same language, so they can both use the "logic of English" (of what follows from what in English natural language statements). Unlike some formal logic, the logic of English may be more vague, perhaps even probabilistic instead of deterministic, more evidence conferring than truth preserving, but these things are there for both interlocutors equally, so no translation is necessary, and things can be clarified by using statements about English statements made in English itself. If there is really a disagreement, it can be reduced to how some simple sentence in natural language should be interpreted in some hypothetical situation (thought experiment). These language intuitions are all over philosophy.
I'm not sure where the underlying disagreement in the decision theory case was (something about actions vs mixed strategies) but I assume there again the underlying problem can be expressed in natural language statements which both parties can understand without the need of translating them.
Replies from: abramdemski↑ comment by abramdemski · 2023-10-16T23:59:59.637Z · LW(p) · GW(p)
I disagree. For tricky technical topics, two different people will be speaking sufficiently different versions of English that this isn't true. Vagueness and other such topics will not apply equally to both speakers; one person might have a precise understanding of decision-theoretic terms like "action" and "observation" while the other person may regard them as more vague, or may have different decision-theoretic understanding of those terms. Simple example, one person may regard Jeffrey-Bolker as the default framework for understanding agents, while the other may prefer Savage; these two frameworks ontologize actions in very different ways, which may be incidental to the debate or may be central. Speaking in English just obscures this underlying difference in how we think about things, rather than solving the problem.
I'm not sure where the underlying disagreement in the decision theory case was (something about actions vs mixed strategies) but I assume there again the underlying problem can be expressed in natural language statements which both parties can understand without the need of translating them.
In the case of mixed vs pure strategies, I think it is quite clear that translating to technical terminology rather than English helped clarify rather than obscure, even if it created the non-one-to-one translation problem this post is discussing.
Replies from: cubefox↑ comment by cubefox · 2023-10-19T12:32:16.803Z · LW(p) · GW(p)
The point of axiomatizing aspects of natural language reasoning, like decision theory, is to make them explicit, systematic, and easier to reason about. But the gold standard remain things that are valid in our antecedent natural language understanding. The primitive terms of any axiomatic theory are only meaningful insofar they reflect the meaning of some natural language terms, and the plausibility of the axioms derives from those natural language interpretations.
So for example, when we compare the axiomatizations of Savage and Jeffrey, we can do so by comparing how well or to what extent they capture the reasoning that is plausible in natural language. I would argue that Jeffrey's theory is much more general, it captures parts of natural language reasoning that couldn't be expressed in Savage's earlier theory, while the opposite is arguably not the case. We can argue about that in English, e.g. by using terms like "action" with its natural language meaning and by discussing which theory captures it better. Savage assumes that outcomes are independent of "actions", which is not presumed when doing practical reasoning expressed in natural language, and Jeffrey captures this correctly. One could object that Jeffrey allows us to assign probabilities to our own actions, which might be implausible, etc.
Replies from: abramdemski↑ comment by abramdemski · 2023-10-24T18:52:02.001Z · LW(p) · GW(p)
Even if I conceded this point, which is not obvious to me, I would still insist on the point that different speakers will be using natural language differently and so resorting to natural language rather than formal language is not universally a good move when it comes to clarifying disagreements.
Well, more importantly, I want to argue that "translation" is happening even if both people are apparently using English.
For example, philosophers have settled on distinct but related meanings for the terms "probability", "credence", "chance", "frequency", "belief". (Some of these meanings are more vague/general while others are more precise; but more importantly, these different terms have many different detailed implications.) If two people are unfamiliar with all of those subtleties and they start using one of the words (say, "probability"), then it is very possible that they have two different ideas about which more-precise notion is being invoked.
When doing original research, people are often in this situation, because the several more-precise notions have not even been invented yet (so it's not possible to go look up how philosophers have clarified the possible concepts).
In my experience, this means that two people using natural language to try and discuss a topic are very often in a situation where it feels like we're "translating" back and forth between our two different ontologies, even though we're both expressing those ideas in English.
So, even if both people express their ideas in English, I think the "non-invertible translation problem" discussed in the original post can still arise.
comment by Quinn (quinn-dougherty) · 2023-10-07T19:34:40.618Z · LW(p) · GW(p)
My summary translation:
You have a classical string P
. We like to cooperate across worldviews or whatever, so we provide sprinkleNegations
to sprinkle double negations on classical strings and decorateWithBox
to decorate provability on intuitionistic strings.
The definition of inverse would be cool, but you see that decorateWithBox(sprinkleNegations(P)) = P
isn't really there, because now you have all these annoying symbols (which cancel out if we implemented translations correctly, but how would we know that?). Luckily, we know we can automate cancelation with an evaluator / normalization--- normalize(decorateWithBox(sprinkleNegations(P))) = P
is a trivial syntactic string equality, which is close enough to the definition of inverse.
I think Alice and Bob, socially, care a lot simply about the costs of normalizing to the input after the translations are applied. If it's an expensive application, they're going to struggle to connect. If not, they'll enjoy the benefits of invertable translations.
Replies from: Kenoubi↑ comment by Kenoubi · 2023-10-18T21:53:34.019Z · LW(p) · GW(p)
I've seen a lot of attempts to provide "translations" from one domain-specific computer language to another, and they almost always have at least one of these properties:
- They aren't invertible, nor "almost invertible" via normalization
- They rely on an extension mechanism intentionally allowing the embedding of arbitrary data into the target language
- They use hacks (structured comments, or even uglier encodings if there aren't any comments) to embed arbitrary data
- They require the source of the translation to be normalized before (and sometimes also after, but always before) translation
(2) and (3) I don't think are super great here. If there are blobs of data in the translated version that I can't understand, but that are necessary for the original sender to interpret the statement, it isn't clear how I can manipulate the translated version while keeping all the blobs correct. Plus, as the recipient, I don't really want to be responsible for safely maintaining and manipulating these blobs.
(1) is clearly unworkable (if there's no way to translate back into the original language, there can't be a conversation). That leaves 4. 4 requires stripping anything that can't be represented in an invertible way before translating. E.g., if I have lists but you can only understand sets, and assuming no nesting, I may need to sort my list and remove duplicates from it as part of normalization. This deletes real information! It's information that the other language isn't prepared to handle, so it needs to be removed before sending. This is better than sending the information in a way that the other party won't preserve even when performing only operations they consider valid.
I think this applies to the example from the post, too - how would I know whether certain instances of double negation or provability were artifacts that normalization is supposed to strip, or just places where someone wanted to make a statement about double negation or provability?
comment by Amarko · 2023-10-05T21:46:27.986Z · LW(p) · GW(p)
This reminds me of a lot of discussions I've had with people where we seem to be talking past each other, but can't quite pin down what the disagreement is.
Usually we just end up talking about something else instead that we both seem to derive value from.