Bridging syntax and semantics with Quine's Gavagai

post by Stuart_Armstrong · 2018-09-24T14:39:55.981Z · score: 20 (7 votes) · LW · GW · 2 comments

Quine has an argument showing that you can never be sure what words mean in a foreign language:

Quine uses the example of the word "gavagai" uttered by a native speaker of the unknown language Arunta upon seeing a rabbit. A speaker of English could do what seems natural and translate this as "Lo, a rabbit." But other translations would be compatible with all the evidence he has: "Lo, food"; "Let's go hunting"; "There will be a storm tonight" (these natives may be superstitious); "Lo, a momentary rabbit-stage"; "Lo, an undetached rabbit-part." Some of these might become less likely – that is, become more unwieldy hypotheses – in the light of subsequent observation.

What does this mean from the perspective of empirically bridging syntax and semantics?

Well, there is no real problem with "gavagai" from the empirical perspective; that there seems to be one is, in my view, because the syntax-semantics discussion has focused too heavily on the linguistic aspects of the problem.

Consider the symbol in the Arunta speaker's brain that activates when they say "gavagai"; call it the gavagai symbol. As the quote above says towards its end, "some of these might become less likely – that is, become more unwieldy hypotheses – in the light of subsequent observation." Relatedly, Nick Bostrom argues that indeterminacy is a "matter of degree".

In practice, this means that if, for instance, r="A rabbit is there" and s="A storm is coming tonight", then, if we accumulate enough observations, the speaker's gavagai symbol is going to be predictive of r better than it is of s, or vice versa. Thus there is an empirical test for whether the symbol corresponds to some of these hypotheses.
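One way to make this test concrete is to score how much information the symbol's activation carries about each hypothesis. The sketch below is only an illustration: the observation log is invented, the names `mutual_information` and `gavagai_active` are hypothetical, and mutual information is just one reasonable choice of "predictiveness" measure, not one the post commits to.

```python
from collections import Counter
from math import log2

def mutual_information(xs, ys):
    """Empirical mutual information (in bits) between two equal-length sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px, py = Counter(xs), Counter(ys)
    mi = 0.0
    for (x, y), count in joint.items():
        p_xy = count / n
        mi += p_xy * log2(p_xy / ((px[x] / n) * (py[y] / n)))
    return mi

# Invented log: (gavagai symbol active, rabbit present, storm that night)
observations = [
    (True, True, False),
    (True, True, True),
    (False, False, True),
    (False, False, False),
    (True, True, False),
    (False, False, True),
    (True, True, True),
    (False, False, False),
]
gavagai_active = [o[0] for o in observations]
r = [o[1] for o in observations]  # "A rabbit is there"
s = [o[2] for o in observations]  # "A storm is coming tonight"

mi_r = mutual_information(gavagai_active, r)
mi_s = mutual_information(gavagai_active, s)
print(f"I(symbol; r) = {mi_r:.2f} bits, I(symbol; s) = {mi_s:.2f} bits")
```

In this toy log the symbol tracks rabbits perfectly and is independent of storms, so the rabbit hypothesis wins; with real observations the gap would be a matter of degree.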

But what about u="An undetached rabbit-part is there"? It may be very difficult to find a situation that distinguishes between u and r in practice - so does the gavagai symbol stand for "rabbit", or "undetached rabbit-part"?

Similarly to the example of the neural net in the previous post, the gavagai symbol is a symbol for both r and u. If no experiment can distinguish between the two - or, at least, if no experiment can distinguish between the two in any typical environment that any Arunta speaker would ever experience - then they are synonymous (or, equivalently, they are strongly within each other's web of connotations).

If we presented the speaker with a situation far outside their typical environment, then we might be able to distinguish r from u - but we'd be uncertain if the gavagai symbol 'meant that all along', or if the speaker was extending the definition to a new situation.
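The distinction between typical and atypical environments can be sketched as a filter over situations. Everything here is an assumption made for illustration: the `Situation` record, the `distinguishing` helper, and especially the contrived atypical case, which is only a stand-in for "something far outside the speaker's experience".

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Situation:
    name: str
    typical: bool      # could an Arunta speaker plausibly encounter this?
    rabbit: bool       # truth value of r: "a rabbit is there"
    rabbit_part: bool  # truth value of u: "an undetached rabbit-part is there"

def distinguishing(situations, typical_only=True):
    """Names of the situations in which r and u come apart."""
    return [
        s.name
        for s in situations
        if (s.typical or not typical_only) and s.rabbit != s.rabbit_part
    ]

situations = [
    Situation("rabbit in the grass", True, True, True),
    Situation("empty field", True, False, False),
    # Contrived, hypothetical case far outside the typical environment.
    Situation("rabbit ear grafted onto another animal", False, False, True),
]

typical_splits = distinguishing(situations)
all_splits = distinguishing(situations, typical_only=False)
print(typical_splits)  # no typical situation separates r from u
print(all_splits)
```

An empty list for the typical environments is what "synonymous" amounts to here. A non-empty list for the wider set says only that the hypotheses can be pulled apart, not which one the symbol "meant all along".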

Back to linguistics

There are some ways we might be able to distinguish whether "gavagai" means "rabbit" or "undetached rabbit-part", even if u and r cannot be distinguished in practice. It's plausible that there might be an internal variable in the speaker that corresponds to "part of an animal", and another internal variable that corresponds to "undetached" (versus detached).

Then, when the gavagai symbol is activated, we might ask whether the "part of an animal" and "undetached" variables are also activated, or not. You can see this as the internal web of connotations amongst the variables in the human brain. It seems people with different languages have different internal webs, different ways of connecting words and concepts together.
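As a rough sketch, the co-activation question can be posed as a conditional frequency over recorded internal states. The records below are invented, and the variable names (`gavagai`, `animal_part`, `undetached`) and the `coactivation_rate` helper are hypothetical labels, not anything measured from a real speaker.

```python
def coactivation_rate(records, trigger, other):
    """Fraction of records with `other` active, among those where `trigger` is active."""
    triggered = [rec for rec in records if rec[trigger]]
    if not triggered:
        return None  # the trigger never fired, so the rate is undefined
    return sum(1 for rec in triggered if rec[other]) / len(triggered)

# Invented snapshots of the speaker's internal variables.
records = [
    {"gavagai": True, "animal_part": False, "undetached": False},
    {"gavagai": True, "animal_part": True, "undetached": False},
    {"gavagai": False, "animal_part": True, "undetached": True},
    {"gavagai": True, "animal_part": False, "undetached": False},
    {"gavagai": True, "animal_part": False, "undetached": False},
]

part_rate = coactivation_rate(records, "gavagai", "animal_part")
undetached_rate = coactivation_rate(records, "gavagai", "undetached")
print(part_rate, undetached_rate)
```

Low rates like these would suggest that, for this imagined speaker, "gavagai" sits closer to "rabbit" than to "undetached rabbit-part" in their internal web, even though the two are indistinguishable externally.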

This does not, however, affect the connection between the gavagai symbol and external variables.

2 comments


comment by romeostevensit · 2018-09-24T19:58:13.083Z · score: 12 (7 votes) · LW · GW

Consider the mapping between a physical system and its representation. There are degrees of freedom in how the mapping is done. We should like the invariant parts of the representation to correspond to invariant parts of the physical system, and likewise with variant parts. We'd like the variant parts to vary continuously if they vary continuously in the physical system, and likewise for discretely. Some representations are tighter in that they have such type matching along more dimensions. A sparse representation that only captures some of the causal structure of the physical system (lossy) can be desirable if the other dimensions don't generate externalities relevant to our intent (the representation is modular in the same way that reality appears to be, e.g. chemistry). When we find a lossless representation that has all of its variable parts varying in exactly the same way as in the physical system, we bundle the whole thing up as an equation. That is to say, a conservation relation.

This may all sound straightforward, tautological even. But I think it's worth examining in closer detail what the act of formalization is. Because of course we aren't actually comparing representations to physical systems, we're comparing representations with representations. Degrees of invariance are all we have. When we seek a way to test a hypothesis, e.g. whether gavagai refers to a rabbit, a part of a rabbit, or a situation that includes a rabbit, we're seeking a way to collapse a degree of freedom in the representation. Sentences cut down the degrees of freedom in the relation between things until intent is clear. A hypothesis is of the form 'dimension X appears to vary, but is actually a function of dimension Y', which decreases the size of the search space by a whole dimension. Words are hypotheses about how reality is bundled. Sentences are hypotheses about how bundles relate.

comment by Stuart_Armstrong · 2018-09-25T09:56:12.121Z · score: 2 (1 votes) · LW · GW

Thanks, that was a useful way to think of things.