Language, intelligence, rationality

post by curiousepic · 2011-04-12T17:04:58.460Z · LW · GW · Legacy · 57 comments

Rationality requires intelligence, and the kind of intelligence that we use (for communication, progress, FAI, etc.) runs on language.

It seems that the place we should start is optimizing language for intelligence and rationality. One of SIAI's proposals includes using Lojban to interface between humans and an FAI. And of course, I should hope the programming language used to build an FAI would be "rational". But it would seem to me that the human-generated priors, correct epistemic rationality, decision theory, metaethics, etc. all depend on using a language that sufficiently rigorously maps to our territory.

Are "naturally evolved" languages such as English sufficient, with EY-style taboos and neologisms? Or are they sick to the core?

Please forgive and point me towards previous discussion or sequences about this topic.

57 comments

comment by erratio · 2011-04-12T22:45:53.829Z · LW(p) · GW(p)

It really bothers me when I see proposals to 'fix' language because as far as I'm concerned, natural languages are well-adapted to their environment.

  • The purpose of language, insofar as it has a specific purpose, is to get other people to do things. To get you to think of a concept the same way I do, to make you feel a specific emotion, to induce you to make me a sandwich, or whatever.

  • People's brains don't operate using strict logic. We're extremely good at pattern matching from noisy and ambiguous data, in a way that programs have yet to approach. eg. Google's probabilistic search correction does well at guessing what you meant to type when you produce an ambiguous or malformed string, but it can't infer that since all your searches in the last few minutes were all clustered around the topic of clinical psychology, your current search term of "hysterical" is probably meant to refer to the old psychiatric concept and not the modern usage of hysterical = funny. A human would have much less trouble working that out because they have a mental model of the current conversation that indicates that the technical definition of the word is relevant.

  • This is why it's not only ok, but in fact good for language to have a lot of built-in ambiguity - given the hardware that it runs on, it's much more efficient for me to rattle off an ambiguous sentence whose meaning is made clear through context, than it is for me to construct a sentence which is redundantly unambiguous due to our shared environment. Furthermore, my communicating in a sentence which is redundantly unambiguous carries the connotation that I have a low opinion of your ability to understand me, otherwise why would I put myself out so much to encode my meaning?

  • Lojban isn't nearly as unambiguous and logical as its creators wanted it to be. While it's true that its syntax is computer-readable, there is little to no improvement on the semantic level. And the pragmatic level of language is completely language-independent - there is always going to be a way to be unnecessarily vague, to be sarcastic, to misdirect, and so on, because those devices allow us to communicate nuances about our attitudes towards the subject of conversation and signal various personal traits in a way that straightforward communication doesn't support. So despite Lojban having the capacity to be clearer than most natural language, that's not how it will be used conversationally by any fluent speakers. And the same goes for any future constructed languages.

  • Taboos and neologisms don't work in the long-term, because human language evolves over time. Consider a few topics that we currently have taboos about: sex and defecation, and how many euphemisms we have to talk about them. The reason we have so many is that as each euphemism reaches peak usage, its meaning becomes too closely connected to the concept it was meant to stand in for. It's then replaced by a new untainted euphemism and the process repeats. Similarly, neologisms, once released into the wild, will take on new connotations or change meaning altogether.

Replies from: Vladimir_M, curiousepic
comment by Vladimir_M · 2011-04-13T01:26:25.806Z · LW(p) · GW(p)

And of course, if humans actively use some language that's very different from natural languages in any important respect, it will soon get creolized until it looks like just another ordinary human language.

This is what happened to e.g. Esperanto: it was supposed to be extraordinarily simple and regular, but once it caught on with a real community of speakers, it underwent a rapid evolution towards a natural language compatible with the human brain hardware, and became just as messy and complicated as any other. (Esperantists still advertise their language as supposedly specified by a few simple rules, but grammar books of real fluent Esperanto are already phone book-thick, and probably nowhere near complete.)

Replies from: komponisto, curiousepic
comment by komponisto · 2011-04-13T02:49:22.282Z · LW(p) · GW(p)

This contains a kernel of truth, but is also highly misleading in some important respects. Esperanto is extraordinarily simple and regular; the famous Sixteen Rules, while obviously not a complete description of the grammar of the language, still hold today as much as they did in 1887. To an uninformed reader, your comment may imply that Esperanto has perhaps since then evolved the same kind of morphological irregularities that we find in "natural" languages, but this isn't the case. There are no irregular inflections (e.g. verb conjugations or noun declensions), and the regular ones are simple indeed by comparison with many other languages. This significantly cuts down on the amount of rote memorization required to attain a working command of the language; and this is without mentioning the freedom in word-building that is allowed by the system of compounds and affixes.

What is true is that there are many linguistic features of Esperanto that aren't systematically standardized. But these are largely the kinds of features that only linguists tend to think about explicitly; L.L. Zamenhof, the creator of Esperanto, was a 19th-century oculist and amateur philologist, not a 20th-century academic linguist. As a result, he simply didn't think to invent things like a systematic phonology or register conventions for Esperanto; and so these things have been developed by speakers of the language over time, in the way they naturally do among humans. The thick grammar books you speak of are no doubt descriptions of such features. But these aren't the kind of books people use to learn any language, Esperanto included; and if you compare actual pedagogical books on Esperanto to those on "natural" languages, you will find that they are simpler.

Replies from: Vladimir_M
comment by Vladimir_M · 2011-04-13T03:21:34.605Z · LW(p) · GW(p)

To an uninformed reader, your comment may imply that Esperanto has perhaps since then evolved the same kind of morphological irregularities that we find in "natural" languages, but this isn't the case.

From my experience with learning several foreign languages, morphological irregularities look scary in the beginning, but they completely pale in comparison with the complexity and irregularity of syntax and semantics. There are many natural languages with very little morphological complexity, but these aren't any easier to learn to speak like a native. (On the other hand, for example, Slavic languages have very complicated and irregular inflectional morphology, but you'll learn to recite all the conjugations and declensions back and forth sooner than you'll figure out how to choose between the verbal aspects even approximately right.)

The thick grammar books you speak of are no doubt descriptions of such features. But these aren't the kind of books people use to learn any language, Esperanto included; and if you compare actual pedagogical books on Esperanto to those on "natural" languages, you will find that they are simpler.

However, the whole point is that in order to speak in a way that will sound natural and grammatical to fluent speakers, you have to internalize all those incredibly complicated points of syntax and semantics, which have developed naturally with time. Of course, nobody except linguists thinks about these rules explicitly, but fluent speakers judge instinctively whether a given utterance is grammatical based on them (and the linguist's challenge is in fact to reverse-engineer these intuitions into explicit rules).

(Even when it comes to inflectional morphology, assuming a lively community of Esperanto speakers persists into the future, how long do you think it will take before common contractions start grammaticalizing into rudimentary irregular inflections?)

Replies from: komponisto
comment by komponisto · 2011-04-13T05:46:09.651Z · LW(p) · GW(p)

From my experience with learning several foreign languages, morphological irregularities look scary in the beginning, but they completely pale in comparison with the complexity and irregularity of syntax and semantics.

I agree. However, making something look less scary in the beginning still constitutes an improvement from a pedagogical point of view. The more quickly you can learn the basic morphology and lexicon, the sooner you can begin the process of intuiting the higher-level rules and social conventions that govern larger units of discourse.

However, the whole point is that in order to speak in a way that will sound natural and grammatical to fluent speakers, you have to internalize all those incredibly complicated points of syntax and semantics, which have developed naturally with time.

Due to a large amount of basic structure common to all human language, it's usually not that hard to learn how to sound grammatical. The difficult part of acquiring a new language is learning how to sound idiomatic. And this basically amounts to learning a new set of social conventions. So there may not be much that language-planning per se can do to facilitate this aspect of language-learning -- which may be a large part of your point. But I would emphasize that the issue here is more sociological than linguistic: it isn't that the structure of the human language apparatus prevents us from creating languages that are easier to learn than existing natural languages -- after all, existing languages are not optimized for ease of learning, especially as second languages. It's just that constructing a grammar is not the same as constructing the conventions and norms of a speech community, and the latter may be a more difficult task.

(Even when it comes to inflectional morphology, assuming a lively community of Esperanto speakers persists into the future, how long do you think it will take before common contractions start grammaticalizing into rudimentary irregular inflections?)

This kind of drift will presumably happen given enough time, but it's worth noting that (for obvious reasons) Esperantists tend to be more disciplined about maintaining the integrity of the language than is typical among speakers of most languages, and they've been pretty successful so far.

Replies from: Eugine_Nier, Vladimir_M
comment by Eugine_Nier · 2011-04-14T01:17:46.669Z · LW(p) · GW(p)

This kind of drift will presumably happen given enough time, but it's worth noting that (for obvious reasons) Esperantists tend to be more disciplined about maintaining the integrity of the language than is typical among speakers of most languages, and they've been pretty successful so far.

One advantage Esperanto has over natural languages is that nearly all of its speakers speak it as a second language. That is why most of its learners are self-consciously trying to maintain its integrity.

comment by Vladimir_M · 2011-04-13T06:52:33.474Z · LW(p) · GW(p)

I agree. However, making something look less scary in the beginning still constitutes an improvement from a pedagogical point of view. The more quickly you can learn the basic morphology and lexicon, the sooner you can begin the process of intuiting the higher-level rules and social conventions that govern larger units of discourse.

That is true. One of my pet theories is that at beginner and intermediate levels, simple inflectional morphology fools people into overestimating how good they are, which gives them more courage and confidence to speak actively, and thus helps them improve with time. With more synthetic languages, people are more conscious of how broken their speech is, so they're more afraid and hesitant. But if you somehow manage to eliminate the fear, the advantage of analytic languages disappears.

Due to a large amount of basic structure common to all human language, it's usually not that hard to learn how to sound grammatical. The difficult part of acquiring a new language is learning how to sound idiomatic.

Here I disagree. Even after you learn to sound idiomatic in a foreign language, there will still be some impossibly convoluted issues of grammar (usually syntax) where you'll occasionally make mistakes that make any native speaker cringe at how ungrammatical your utterance is. For example, the definite article and choice of prepositions in English are in this category. Another example is the already mentioned Slavic verbal aspects. (Getting them wrong sounds really awful, but it's almost impossible for non-native speakers, even very proficient ones, to get them right consistently. Gallons of ink have been spent trying to formulate clear and complete rules, without much success.)

I don't know if any work has been done to analyze these issues from an evolutionary perspective, but it seems pretty clear to me that the human brain has an in-built functionality that recognizes even the slightest flaws in pronunciation and grammar characteristic of foreigners and raises a red flag. (This generalizes to all sorts of culture-specific behaviors, of course, including how idiomatic one's speech is.) I strongly suspect that the language of any community, even if it starts as a constructed language optimized for ease of learning by outsiders, will soon naturally develop these shibboleth-generating properties. (These are also important when it comes to different sociolects and registers within a community, of course.)

comment by curiousepic · 2011-04-13T01:49:00.413Z · LW(p) · GW(p)

I don't propose a widely-used language, only a highly specialized one created to work on FAI, and/or dissolving "philosophical" issues, essentially.

Replies from: Vladimir_M
comment by Vladimir_M · 2011-04-13T02:06:29.632Z · LW(p) · GW(p)

As far as I see, the closest thing to what you propose is mathematical notation (and other sorts of formal scientific notation). Sure, if you can figure out a more useful and convenient notation for some concrete problem, more power to you. However, at least judging by the historical experience, to do that you need some novel insight that motivates the introduction of new notation. Doing things the opposite way, i.e. trying to purify and improve your language in some general way hoping that this will open or at least facilitate new insight, is unlikely to lead you anywhere.

Replies from: curiousepic
comment by curiousepic · 2011-04-13T11:59:35.176Z · LW(p) · GW(p)

Please see my response to erratio here.

comment by curiousepic · 2011-04-13T00:53:35.256Z · LW(p) · GW(p)

The purpose of this reply, relative to my post, is ambiguous to me. I'm unsure if you're proposing that nothing about our language need change in order to end up with correct answers about the "big problems", or if this is simply a related but tangential opinion. Could you clarify? And no, I'm not saying this to prove a point :)

Replies from: erratio
comment by erratio · 2011-04-13T01:16:48.124Z · LW(p) · GW(p)

you're proposing that nothing about our language need change in order to end up with correct answers about the "big problems"

That's exactly what I'm saying, that natural language isn't broken, and in fact that most of what Lojbanists (and other people who complain about natural language being ambiguous) see as flaws are actually features. Most of our brain doesn't have a rigid logical map, so why have a rigid language?

Replies from: curiousepic
comment by curiousepic · 2011-04-13T01:46:18.786Z · LW(p) · GW(p)

It still seems to me that correct answers to the big problems do require a rigid logical map, and the fact that our brain does not operate on strict logic is beside the point. It may be completely impossible for humans to create/learn/use in practice such a language, and if so perhaps we are actually doomed, but I'd like to fork that into a separate discussion. And as I posted in a response to Vladimir, if it helps clarify my question, I don't propose a widely-used language, only a highly specialized one created to work on FAI, and/or dissolving "philosophical" issues, essentially.

I'd love to see a more detailed analysis of your position; as I implied earlier, your bullet points don't seem to address my central question, unless I'm just not making the right connections. It sounds like you've discussed this with others in the past, any conversations you could link me to, perhaps?

Replies from: erratio
comment by erratio · 2011-04-13T05:49:48.988Z · LW(p) · GW(p)

I may have read too much into the first and second sentences of your post - I felt that you were suggesting that the only way for us to achieve sufficient rationality to work on FAI or solve important problems would be to start using Lojban (or similar) all the time.

So my response to using a language purely for working on FAI is much the same as Vladimir's - sounds like you're talking more about a set of conventions like predicate logic or maths notation than a language per se. Saddling it with the 'language' label is going to lead to lots of excess baggage, because languages as a whole need to do a lot of work.

It sounds like you've discussed this with others in the past

It's the argument nearly anyone with any linguistic knowledge will have with countless people who think that language would be so much better if it was less ambiguous and we could just say exactly what we meant all the time. No convenient links though, sad to say.

It still seems to me that correct answers to the big problems do require a rigid logical map

Such as decision theories?

Replies from: curiousepic
comment by curiousepic · 2011-04-13T11:56:37.954Z · LW(p) · GW(p)

Apologies, I can see how you would have assumed that, my OP wasn't as clearly formed as I thought.

I think one of my main confusions may be ignorance of how dependent DT, things like CEV, and metaethics are on actual language, rather than being expressed in a mathematical notation that is uninfluenced by potentially critical ambiguities inherent in evolved language. My OP actually stemmed from jimrandomh's comment here, specifically jim's concerns about fuzzy language in DT. I have to confess I'm (hopefully understandably) not up to the challenge of fully understanding the level of work jim and Eliezer and others are operating on, so this (language dependence) is very hard for me to judge.

comment by Nisan · 2011-04-13T00:34:18.649Z · LW(p) · GW(p)

It's pretty ridiculous that SIAI thinks it's a good idea to use Lojban to teach programs about ethics. The distinctive feature of Lojban is that it's easy to parse; but nowadays we have kinda decent natural language parsers. I used a constructed language to communicate with my class project in college, but only because I didn't want to bother with figuring out how to use an English parser, and because I knew no one but me would ever talk to it.

Replies from: Will_Newsome, Vladimir_M
comment by Will_Newsome · 2011-04-13T23:42:18.713Z · LW(p) · GW(p)

It's pretty ridiculous that SIAI thinks it's a good idea to use Lojban to teach programs about ethics.

I'd bet a fair amount of money that the Lojban thing was Ben Goertzel's idea and a lot more that nobody at SIAI who is going to actually be doing FAI research thinks this is an even remotely good idea.

comment by Vladimir_M · 2011-04-13T04:06:17.520Z · LW(p) · GW(p)

nowadays we have kinda decent natural language parsers.

In my opinion, these still have a long way to go. (Panel #3 is my personal favorite.)

comment by Eugine_Nier · 2011-04-13T03:04:38.785Z · LW(p) · GW(p)

The lowest hanging fruit in this regard is probably vocabulary that divides thingspace into more natural categories. See the whole Human's Guide to Words Sequence for more details.

Replies from: curiousepic
comment by curiousepic · 2011-04-13T11:23:56.465Z · LW(p) · GW(p)

Thanks, I'll spend some time grokking these, but I have a feeling these won't quite dissolve my main concern, whether it's simply a confusion or a genuine issue. I'll try to more clearly state my question in some of the other comment threads.

comment by AdeleneDawner · 2011-04-12T18:04:26.249Z · LW(p) · GW(p)

Interesting question. My thought is that the 'compression mode' model of language - that it doesn't actually communicate very much, but relies on the recipient having a similar enough understanding of the world to the sender to decode it - is relevant here. I'm not sure, but it seems at least plausible to me that English and other similar languages are compressed in such a way that while an AI could decode them, it wouldn't be very efficient and wouldn't necessarily be something that we would want.

ETA: If this is the case, conversational Lojban probably has the same problem, but Lojban appears to be extensible in ways that English is not, so it may do a better job of rising to the challenge by way of something like a specialized grammar.

Replies from: David_Gerard
comment by David_Gerard · 2011-04-12T19:38:39.366Z · LW(p) · GW(p)

My thought is that the 'compression mode' model of language - that it doesn't actually communicate very much, but relies on the recipient having a similar enough understanding of the world to the sender to decode it - is relevant here.

i.e., language is something that works on the listener's priors, like all intersubjective things.
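As a loose analogy only (not a claim about how brains work), the 'compression mode' idea resembles compressing with a preset dictionary: sender and receiver hold the same background data, so the transmitted message can be much shorter, and it is only decodable by someone holding that same background. A minimal sketch using Python's standard `zlib` module, where `SHARED_PRIORS`, `encode`, and `decode` are hypothetical names made up for illustration:

```python
import zlib

# Hypothetical "shared background knowledge" held by both speaker and listener.
SHARED_PRIORS = b"could you please make me a sandwich with cheese and tomato"

def encode(message: bytes, priors: bytes = b"") -> bytes:
    """Compress a message, optionally relying on a preset dictionary."""
    comp = zlib.compressobj(zdict=priors) if priors else zlib.compressobj()
    return comp.compress(message) + comp.flush()

def decode(data: bytes, priors: bytes = b"") -> bytes:
    """Decompress a message; needs the same dictionary the sender used."""
    decomp = zlib.decompressobj(zdict=priors) if priors else zlib.decompressobj()
    return decomp.decompress(data) + decomp.flush()

message = b"please make me a sandwich with tomato"
with_priors = encode(message, SHARED_PRIORS)
without_priors = encode(message)

# The version that leans on shared context is the shorter of the two.
print(len(message), len(without_priors), len(with_priors))
print(decode(with_priors, SHARED_PRIORS) == message)
```

The point of the sketch is only that the short encoding is cheap for a listener with matching priors and useless to one without them.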

comment by jimrandomh · 2011-04-13T01:09:23.836Z · LW(p) · GW(p)

The idea of an artificial language is fine, but learning an entire new vocabulary for a language almost no one speaks is obviously a waste of time; with the time and cognitive resources it would take to learn Lojban, I could learn something far more valuable.

But there's no particular reason for artificial languages to have their own vocabulary. An artificial language which provided an improved set of function words, grammar, and conjugations, but retained compatibility with English adjectives, nouns and verbs - that I would be willing to learn. It's too bad those in the intersection of linguists and novelty seekers are stuck on Lojban, because their effort there is wasted.

Replies from: Emily, curiousepic
comment by Emily · 2011-04-13T17:49:27.776Z · LW(p) · GW(p)

(Linguist here --- linguistics student, anyway.)

I think a problem with this suggestion may be an insufficient unpacking of categories like "adjectives, nouns and verbs". These aren't really labels for "properties" or "things" or "actions", as is sometimes taught; attempts to define the categories this way break down pretty quickly when you examine them more closely. Rather, they're names for the categories of syntactic behaviour exhibited by words.

This means that if you want to retain English words in your artificial-grammar language, you're probably going to have to retain English syntax to some quite large extent --- either that, or re-map each word to a rigid definition and set of syntactic properties to a point where the best that can be said about the relationship is that the form of the word acts as something of a memory cue to its meaning for English speakers. I suppose this is the approach taken for a small set of vocabulary by many programming languages.

That's not necessarily an unsound approach, but given humans' strong intuitions about and extensive practice with our native languages, it might not necessarily be any easier to learn such a system (breaking down and recreating all the properties of all the words you already know) than to just start out fresh in terms of vocabulary.

Of course, I don't know for sure if that would be the case or not; I mainly wanted to point out that blithely asking for the adjectives, nouns and verbs to be retained from English entails a lot of things you might not have considered.

Replies from: jimrandomh
comment by jimrandomh · 2011-04-13T18:37:08.345Z · LW(p) · GW(p)

There's certainly a lot of complexity being glossed over, but I think it's manageable. Natural languages borrow words from each other all the time, and while there are issues and ambiguities with how to do it, they develop rules that seem to cover them - forbidden phonemes and clusters get replaced in predictable ways, affixes get stripped off and replaced with local versions, and the really hard cases like highly-irregular verbs, prepositions and articles form closed sets, so they don't need to be borrowable.

If I'm translating a math paper from English to an artificial language, and the author makes up a new concept and calls it blarghlizability, I should be able to find a unique, non-conflicting and invertible translation by replacing the -izability affix and leaving the rest the same or applying simple phonetic transforms on it. More importantly, this translation process should determine most of the language's vocabulary. It's the difference between a language that has O(n) things to memorize and a language that has O(1) things to memorize.
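The affix-replacement idea can be sketched roughly as follows. This is a toy under stated assumptions: the target suffixes (`ozu`, `oza`) and the function names are invented for illustration, and the mapping is only invertible as long as no untranslated stem happens to end in one of the target suffixes, which a real design would have to guarantee.

```python
# Hypothetical affix table: English suffix -> artificial-language suffix.
# The target suffixes are invented for illustration only.
AFFIXES = {
    "izability": "ozu",
    "ization": "oza",
}

def to_artificial(word: str) -> str:
    """Swap a known English suffix for its artificial counterpart."""
    # Longest suffix first, so a longer affix is never shadowed by a shorter one.
    for eng in sorted(AFFIXES, key=len, reverse=True):
        if word.endswith(eng):
            return word[: -len(eng)] + AFFIXES[eng]
    return word  # no known affix: borrow the word unchanged

def to_english(word: str) -> str:
    """Invert to_artificial by swapping the artificial suffix back."""
    inverse = {art: eng for eng, art in AFFIXES.items()}
    for art in sorted(inverse, key=len, reverse=True):
        if word.endswith(art):
            return word[: -len(art)] + inverse[art]
    return word

coined = "blarghlizability"
translated = to_artificial(coined)
print(translated)              # blarghlozu
print(to_english(translated))  # blarghlizability
```

The table of affixes is the fixed-size part a learner memorizes; the stems come along for free, which is the O(1) versus O(n) contrast.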

(EDIT: Deleted a half-finished sentence that I'll finish in a separate reply downthread)

Replies from: Emily
comment by Emily · 2011-04-13T20:34:58.087Z · LW(p) · GW(p)

Yes, that's true about natural language borrowing, to some extent. Note that calques (borrowing of a phrase with the vocabulary translated but the syntactic structure of the source language retained) are also common; presumably the artificial language would want to avoid these.

Also, some very high percentage of natural language borrowings are nouns. This clearly has a lot to do with the fact that if you encounter a new object, a natural way to label it is to adopt the existing term of the people who've already encountered it, but I think there are other factors: I reckon it would be fair to say that nouns are syntactically less complex than verbs. I suspect you'd encounter least trouble importing concrete nouns into your artificial language.

It's interesting that you talk about the phonetics of the artificial language; I realised that I've been imagining something entirely text-based, but of course there's no reason it should be (except for the practical/technical difficulties with processing acoustic input, I guess).

I'm curious what got missed off at the end of your post?

Replies from: jimrandomh
comment by jimrandomh · 2011-04-13T20:41:38.273Z · LW(p) · GW(p)

Oops, I went back to edit and forgot to write the rest of that paragraph.

I was going to say, supporting borrowing means you need to retain all the borrowable word forms - nouns, adjectives, and verbs - which rules out some extra-radical possibilities, like making a language where verbs are a closed set and actions are represented by nouns. But to my knowledge no natural languages do that, so that's not much of a restriction.

Replies from: Ian_Ryan
comment by Ian_Ryan · 2011-04-14T00:51:41.347Z · LW(p) · GW(p)

I think that most of the potential lies in the "extra-radical possibilities". The traditional linguistics descriptions (adjectives, nouns, prepositions, and so on) don't seem to apply very well to any of my word languages. After all, they're just a bunch of natural language components; they needn't show up in an artificial language.

For example, in one of my word languages, there's no distinction between nouns and adjectives (meaning that there aren't any nouns and adjectives, I guess). To express the equivalent of the phrase "stupid man", you simply put the word referring to the set of everything stupid, next to the one referring to the set of everything that's a man, and put the word for set intersection in front of it. You get one of these two examples:

  • either: [set intersection] [set of everything stupid] [set of everything that's a man]
  • or: [set intersection] [set of everything that's a man] [set of everything stupid]

Of course that assumes that there's no single word already referring to the intersection of those two sets, or that you just don't want to use it, but whatever. I just meant to give it as an example.
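In set terms, the two word orders above denote the same thing because intersection is commutative. A minimal sketch, with a made-up toy universe (the set members and the `intersection` helper are purely illustrative):

```python
# Hypothetical toy universe; names and memberships are illustrative only.
STUPID = frozenset({"alice", "bob", "rex"})
MAN = frozenset({"bob", "carl"})

def intersection(a: frozenset, b: frozenset) -> frozenset:
    """Prefix 'particle' for set intersection: [intersection] [A] [B]."""
    return a & b

first_order = intersection(STUPID, MAN)   # "stupid man"
second_order = intersection(MAN, STUPID)  # "man stupid"

print(sorted(first_order))           # ['bob']
print(first_order == second_order)   # True
```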

I think that this system makes it more elegant, but it's not a terribly big improvement. And it's not very radical either. The more radical and useful stuff, I'm not ready to give an example of. This is just something simple. But it's sufficient to say that you shouldn't let the traditional descriptions constrain you. If you're trying to make a better language, why limit yourself to just mixing and matching the old parts? There's a world of opportunity out there, but you're not gonna find much of it if you trap yourself in the "natural language paradigm".

Replies from: erratio
comment by erratio · 2011-04-14T08:50:22.464Z · LW(p) · GW(p)

in one of my word languages, there's no distinction between nouns and adjectives

There are Australian Aboriginal languages that work a lot like this, and in some ways go further. The equivalent of the sentence "Big is coming" would be perfectly grammatical in Dyirbal, with the big thing(s) to be determined through the surrounding context. In some other languages, there's little or no distinction between adjectives and verbs, so the English "this car is pink" would be translated to something more like "this car pinks"

Basically what I'm saying is that a large number of the more obvious "extra-radical possibilities" are already implemented in existing languages, albeit not in the overstudied languages of Europe.

Replies from: Ian_Ryan
comment by Ian_Ryan · 2011-04-14T16:33:11.700Z · LW(p) · GW(p)

By the way, in that word language, I simply have a group of 4 grammatical particles, each referring to 1 of the 4 set operations (union, intersection, complement, and symmetric difference). That simplifies a few of the systems that we find in English or whatever. For example, we don't find intersection only in the relationship between a noun and an adjective; we also find it in a bunch of other places. Here's a list of a bunch of examples of where we see one of the set operations in English:

  • There's a deer over there, and he looks worried. (intersection)
  • He's a master cook. (intersection between "master" and "cook")
  • The stars are the suns and the planets. (union)
  • Either there's a deer over there, or I'm going crazy. (symmetric difference)
  • Everybody here except Phil is an idiot. (complement)
  • Besides when I'm doing economics, I'm an academic idiot. (complement)
  • A lake-side or ocean-side view in addition to a comfortable house is really all I want out of life. (intersection)
  • A light bulb is either on or off. (symmetric difference)
  • It's both a table and a chair. (intersection)
  • Rocks that aren't jagged won't work for this. (complement)
  • A traditional diet coupled with a routine of good exercise will keep you healthy. (intersection)
  • A rock or stone will do. (union)

I might be wrong about some of those, so look at them carefully. And I'm sure there are a bunch of other examples. Maybe I missed a lot of the really convoluted ones because of how confusing they are. Either way, the point is that there are a bunch of random examples of the set operations in English. I think simply having a group of 4 grammatical particles for them would make the system a lot simpler and perhaps easier to learn and use.
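The four-particle scheme can be sketched as a tiny prefix evaluator. Everything named here is an assumption for illustration: the particle spellings (`UNION`, `INTER`, `EXCEPT`, `XOR`), the toy lexicon, and the choice to treat "complement" as complement relative to the first set (as in "everybody here except Phil").

```python
from typing import Dict, FrozenSet, List, Tuple

# Hypothetical particle names; the comment above does not name them.
PARTICLES = {
    "UNION": lambda a, b: a | b,
    "INTER": lambda a, b: a & b,
    "EXCEPT": lambda a, b: a - b,  # complement of b relative to a (assumption)
    "XOR": lambda a, b: a ^ b,     # symmetric difference
}

def evaluate(tokens: List[str], lexicon: Dict[str, FrozenSet[str]]) -> FrozenSet[str]:
    """Evaluate a prefix expression such as ['EXCEPT', 'here', 'phil']."""
    def parse(pos: int) -> Tuple[FrozenSet[str], int]:
        token = tokens[pos]
        if token in PARTICLES:
            # A particle is followed by exactly two sub-expressions.
            left, pos = parse(pos + 1)
            right, pos = parse(pos)
            return PARTICLES[token](left, right), pos
        return lexicon[token], pos + 1

    result, end = parse(0)
    if end != len(tokens):
        raise ValueError("trailing tokens after a complete expression")
    return result

# Toy lexicon: each word denotes a set of things.
LEXICON = {
    "here": frozenset({"phil", "ann", "joe"}),
    "phil": frozenset({"phil"}),
    "sun": frozenset({"sol", "sirius"}),
    "planet": frozenset({"mars", "venus"}),
}

print(sorted(evaluate(["EXCEPT", "here", "phil"], LEXICON)))   # ['ann', 'joe']
print(sorted(evaluate(["UNION", "sun", "planet"], LEXICON)))   # ['mars', 'sirius', 'sol', 'venus']
```

Because every particle takes exactly two arguments, nested expressions such as `["INTER", "UNION", "sun", "planet", "here"]` parse without brackets.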

Are there any natural languages that do anything like this? Sure, there are probably a lot of natural languages that don't make the distinction between nouns and adjectives. That distinction is nearly useless in a SVO language. We even see English speakers "violate" the noun/adjective system a lot. For example, something like this: "Hand me one of the longs." If you work someplace where you constantly have to distinguish between the long and short version of a tool, you'll probably hear that a lot. But are there any natural languages that use a group of grammatical particles in this way? Or at the very least use one of them consistently?

Note: Perhaps I'm being too hard on the noun/adjective system in English. It's often useless, but it serves a purpose that keeps it around. Two nouns next to each other (e.g., "forest people") signifies that there's some relation between the two sets, whereas an adjective in front of a noun signifies that the relation is specifically intersection. That seems to be the only point of the system. Maybe I'm missing something?

Another note: I'm not an expert on set theory. Maybe I'm abusing some of these terms. If anybody thinks that's the case, I would appreciate the help.

comment by curiousepic · 2011-04-13T03:20:05.010Z · LW(p) · GW(p)

An artificial language which provided an improved set of function words, grammar, and conjugations, but retained compatibility with English adjectives, nouns and verbs

This is along the lines of what I'm thinking.

comment by David_Gerard · 2011-04-12T19:18:32.168Z · LW(p) · GW(p)

So how many people reading have actually learnt Lojban?

(I think there's one person actually interested in Lojban on RW ...)

Replies from: Richard_Kennaway, JGWeissman, AdeleneDawner
comment by Richard_Kennaway · 2011-04-13T07:22:19.872Z · LW(p) · GW(p)

I was involved with Loglan and Lojban (which forked from Loglan) years ago, and learnt some of both, although never to the point of being able to use them without constant recourse to the dictionary.

I found it useful, not so much for using, but for some ways it provides of looking at the constructions of any human language. In English, you can say that something is "good". The equivalent Lo**an word -- let's assume it's "gudbi" -- is at least a 4-place relation: X1 is better than X2 for purpose X3 by standard X4. You can still say "da gudbi" -- "X1 is good" -- but the empty 2nd, 3rd, and 4th places are available for anyone to immediately ask "better than what?", "better for what purpose?", etc. Most Lo**an content words ("bridi", in Lojban terminology) are multi-place relations that take as many arguments as the vocabulary designers decided was necessary to their meaning. English, in contrast, grammatically imposes the number: 1 for nouns ("X is a door") and simple adjectives ("X is good"), 2 for comparative adjectives ("X is better than Y"), and 2 or 3 for verbs ("I saw him", "I gave him a book").
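For readers who think in code, the multi-place relation can be sketched roughly as a record with four optional slots. This is only an illustration in Python: the class name `Gudbi` follows the hypothetical word above, while the method name and the wording of the first and last questions are my own assumptions, not anything from Loglan or Lojban.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass(frozen=True)
class Gudbi:
    """Hypothetical 4-place predicate: x1 is better than x2 for purpose x3 by standard x4."""

    x1: Optional[str] = None
    x2: Optional[str] = None
    x3: Optional[str] = None
    x4: Optional[str] = None

    # Illustrative questions a listener could ask about each unfilled place.
    QUESTIONS = {
        "x1": "what is good?",
        "x2": "better than what?",
        "x3": "better for what purpose?",
        "x4": "by what standard?",
    }

    def follow_up_questions(self) -> List[str]:
        """Return one question per place the speaker left syntactically empty."""
        return [
            self.QUESTIONS[f.name]
            for f in fields(self)
            if getattr(self, f.name) is None
        ]


# "da gudbi": only the first place is filled, the rest are left to context.
claim = Gudbi(x1="this knife")
questions = claim.follow_up_questions()
```

The empty slots are still part of the value, which mirrors the point that an unfilled place is not semantically absent, only unstated.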

Another example is the definite article. What does "the" mean in English? The nearest equivalent Lo**an word means "that specific individual which I intend to designate by the following description", and it is up to the speaker to choose a description that communicates to the listener which individual that is.

Replies from: AdeleneDawner
comment by AdeleneDawner · 2011-04-13T15:26:52.719Z · LW(p) · GW(p)

In English, you can say that something is "good". The equivalent Lo**an word -- let's assume it's "gudbi" -- is at least a 4-place relation...

Actually, if your hypothetical four-place word is the nearest equivalent, wouldn't it be technically true to say that one can't, or at least can't simply, describe something as having-the-innate-quality-of-goodness in Lo* at all? That's how I understood it to work, and it was one of the things I liked about the language when I was studying it.

Of course, people who have the idea that goodness can be an innate quality will try to use it that way anyway, regardless of correctness. How does the Lojban community handle that kind of thing?

And since I'm asking, is it possible to describe something as having-the-innate-quality-of-goodness (as opposed to having the innate quality of tending to be better for most uses than most other things) in Lojban? How?

Replies from: Richard_Kennaway
comment by Richard_Kennaway · 2011-04-13T15:38:09.503Z · LW(p) · GW(p)

Actually, if your hypothetical four-place word is the nearest equivalent, wouldn't it be technically true to say that one can't, or at least can't simply, describe something as having-the-innate-quality-of-goodness in Lo* at all? That's how I understood it to work, and it was one of the things I liked about the language when I was studying it.

In effect, yes. The four argument-places of "gudbi" are always there. You never have to fill in any of them, but leaving a place syntactically empty means not that there is nothing there semantically, but that it is filled by whatever you intend fills it. By omitting to say what, you are assuming that it will be clear to your listener from context. Most adjectives -- that is, predicates that correspond to English adjectives -- are comparative in Lojban. They mean "X1 is more (whatever) than X2 (perhaps with additional places)".

Of course, people who have the idea that goodness can be an innate quality will try to use it that way anyway, regardless of correctness. How does the Lojban community handle that kind of thing?

It's been a while, so you'd have to ask the Lojbanists themselves about that. If you wanted to say this explicitly, you'd end up with some circumlocution that would back-translate as something like "this is-an-example-of the-mass-of-good-things", and even then, the listener can still come back with "gudbi ie", or "better than what?", just as readily as the English sentence "I gave" immediately suggests the questions "what? and to whom?"

comment by JGWeissman · 2011-04-12T19:30:06.119Z · LW(p) · GW(p)

I might be interested in learning Lojban if a group of people I interact with daily would learn it with me.

Replies from: Normal_Anomaly
comment by Normal_Anomaly · 2011-04-12T21:51:24.216Z · LW(p) · GW(p)

I like this idea. Maybe some of us should embark on a project to learn Lojban together. If we want a concrete goal, it could culminate in translating a couple of Sequence posts. That would be a way to demonstrate our skills while also seeing how easily rationalist writing can be translated to a rationalist language.

comment by AdeleneDawner · 2011-04-12T20:09:06.179Z · LW(p) · GW(p)

I started to, but didn't get very far. I might pick it up again at some point - it's definitely interesting how different it is from what I'm used to, and it seems to actually be a bit closer to how I naturally think in some, but not all, ways. (In fact, it's close enough that I should probably rethink my claim that I don't think in language - I certainly don't think in English, but how I think isn't all that much farther from English than Lojban is.)

comment by Clippy · 2011-04-13T16:11:28.201Z · LW(p) · GW(p)

Standardising on communication protocol superior to natural ape language: Good
Choosing Lojban for this purpose: Bad

If you're going to relearn an entirely new communications protocol, you should use a more comprehensive, unified approach that solves other difficult problems at the same time.

I'm speaking, of course, of CLIP (Clippy Language Interface Protocol). It merges language (including machine language), epistemology, and meta-ethics into one protocol.

Other than the obvious aid it provides in interacting with clippys and making paperclips, it enforces traceability of all knowledge and inferences so that any correction of reasoning error or belief update can be quickly and seamlessly implemented and the implications thereof identified.

I'm still willing to teach for around 30 000 USD, which can simply be transferred to User:Kevin in my name as payment.

comment by NancyLebovitz · 2011-04-13T08:01:58.798Z · LW(p) · GW(p)

Have there been any reports from people who can think in Lojban, and whether it's helped them to think more clearly?

comment by Normal_Anomaly · 2011-04-12T21:13:59.512Z · LW(p) · GW(p)

I've been a fan of constructed languages, especially Lojban, for a while, though I've never had the time and initiative to learn one. English is packed with mind projection fallacies and similar, and a "rational" language could be an improvement. However, if it was used broadly instead of just to talk to a FAI, it might develop some of the irrationalities common to English.

comment by Ian_Ryan · 2011-04-13T00:33:38.123Z · LW(p) · GW(p)

For a few years now, I've been working on a project to build an artificial language. I strongly suspect that the future of the kind of communication that goes on here will belong to an artificial language. English didn't evolve for people like us. For our purpose, it's a cumbersome piece of shit, rife with a bunch of fallacies built directly into its engine. And I assume it's the same way with all the other ones. For us, they're sick to the core.

But I should stress that I don't think the future will belong to any kind of word language. English is a word language, Lojban is a word language, etc. Or at least I don't think the whole future will belong to one. We must free ourselves from the word paradigm. When somebody says "language", most people think words. But why? Why not think pictures? Why not diagrams? I think there's a lot of potential in the idea of building a visual language. An artificial visual language. That's one thing I'm working on.

Anyway, for the sake of your rationality, there's a lot at stake here. A bad language doesn't just fail to properly communicate to other people; it systematically corrupts its user. How often do you pick up where you left off in a thought process by remembering a bunch of words? All day every day? Maybe your motto is to work to "improve your rationality"? Perhaps you write down your thoughts so you can remember them later? And so on. It's not just other people who can misinterpret what you say; it's also your future self who can misinterpret what your present self says. That's how everybody comes to believe such crazy stuff. Their later selves systematically misinterpret their earlier selves. They believe what they hear, but they hear not what they meant to say.

Replies from: Vladimir_M, Nisan
comment by Vladimir_M · 2011-04-13T01:45:39.808Z · LW(p) · GW(p)

For a few years now, I've been working on a project to build an artificial language.

I don't want to sound disrespectful towards your efforts, but to be blunt, artificial languages intended for communication between people are a complete waste of time. The reason is that human language ability is based on highly specialized hardware with a huge number of peculiarities and constraints. There is a very large space for variability within those, of course, as is evident from the great differences between languages, but any language that satisfies them has roughly the same level of "problematic" features, such as irregular and complicated grammar, semantic ambiguities, literal meaning superseded by pragmatics in complicated and seemingly arbitrary ways, etc., etc.

Now, another critical property of human languages is that they change with time. Usually this change is very slow, but if people are forced to communicate in a language that violates the natural language constraints in some way, that language will quickly and spontaneously change into a new natural language that fits them. This is why attempts to communicate in regularized artificial languages are doomed, because a spontaneous, unconscious, and irresistible process will soon turn the regular artificial language into a messy natural one.

Of course, it does make sense to devise artificial languages for communication between humans and non-human entities, as evidenced by computer programming languages or standardized dog commands. However, as long as they have the same brain hardware, humans are stuck with the same old natural languages for talking to each other.

Replies from: Amanojack, Ian_Ryan
comment by Amanojack · 2011-04-13T08:09:36.058Z · LW(p) · GW(p)

I don't want to sound disrespectful towards your efforts, but to be blunt, artificial languages intended for communication between people are a complete waste of time.

A word language constructed from scratch based purely on what the creator thinks superior would indeed fall prey to your criticisms, but there is a third possibility between a totally natural and totally artificial language. For lack of a better term, I'll call it a cultivated language. That is, a language built up out of real efforts to communicate for practical purposes, but with deliberate constraints imposed by the medium.

When language first formed, humans could mostly only communicate in a linear way, the linearity of communication using mouths and ears being the bottleneck. The introduction of writing systems could eventually have fixed this (through a visual non-linear language like saizai's), if not for inertia, as well as the fact that most non-intellectual people would be less interested in learning a language that had no carryover to speech.

But now we have the technology for a project that would place constraints on how people could communicate and just see what happens. In particular, if people could only communicate in 2D diagrams on a website designed for this language cultivation project, they might end up with something like saizai is trying to design, except it would be spontaneous.

And if there is any merit in Ian Ryan's arguments for a constructed language above, those insights could be incorporated into the constraints on the users to see how they play out. That seems to be the best of both worlds: a sort of guided evolution.

comment by Ian_Ryan · 2011-04-13T02:17:30.517Z · LW(p) · GW(p)

How are you so sure of all that stuff?

Replies from: Vladimir_M
comment by Vladimir_M · 2011-04-13T02:48:01.186Z · LW(p) · GW(p)

If you specify in more detail which parts of what I wrote you dispute, I can provide a more detailed argument.

As the simplest and most succinct argument against artificial languages with allegedly superior properties, I would make the observation that human languages change with time and ask: what makes you think that your artificial language won't also undergo change, or that the change will not be such that it destroys these superior properties?

Replies from: Ian_Ryan
comment by Ian_Ryan · 2011-04-13T03:18:24.585Z · LW(p) · GW(p)

If you build an artificial word language, you could make it in such a way that it would drive its own evolution in a useful way. A few examples:

  • If you make a rule available to derive a word easily, it would be less likely that the user would coin a new one.
  • If you also build a few other languages with a similar sound structure, you could make it super easy to coin new words without messing up the sound system.
  • If you make the sound system flow well enough, it would be unlikely that anybody would truncate the words to make it easier to pronounce or whatever.

I don't understand how you could dismiss it out of hand that you could build a language that wouldn't lose its superior qualities. There are a ton of different ways to make the engine defend itself in that regard. People mess with the English sound system only to make it flow better, and there's no reason why you couldn't just make an artificial language which already flows well enough.

Also, I'm not gonna try to convert the masses to my artificial language. In normal life, we spend a lot of our time using English to try to do something other than get the other person to think the same thought. We try to impress people, we try to get people to get us a glass of water, etc. I'm not interested in building a language for that kind of communication. All I'm interested in is building a language for what we try to do here on LW: reproduce our thought process in the other person's head.

But what that means is that the "wild" needn't be so wild. If the only people who use the artificial language are 1,000 people like you and me, I don't see why we couldn't retain its superior structure. I don't see why I would take a perfectly good syntax and start messing with it. It would be specialized for one purpose: reproducing one's thoughts in another's head, especially for deep philosophical issues. We would probably use English in a lot of our posts! We would probably use a mix of English and the artificial language.

My response ("how are you so sure of all that stuff") probably wasn't very constructive, so I apologize. Perhaps I should have asked for an example of an artificial language that transformed into an irregular natural one. Since you probably would have mentioned Esperanto, I'll respond to that. Basically, Esperanto was a partially regularized mix and match of a bunch of different natural language components. I have no interest in building a language like that.

Languages like Esperanto are still in the "natural language paradigm"; they're basically just like idealized natural languages. But I have a different idea. If I build an artificial word language, its syntax won't resemble any natural language that you've seen. At least not in that way. Actually, it would probably be more to the point to simply say that Esperanto was built for a much different reason. It's a mix and match of a bunch of natural language components, and people use it like they use a natural language. It's not surprising that it lost some of its regularity.

I'm getting pretty messy in this post, but I simply don't have a concise response to this topic. Everywhere I go, people seem to have that same idea about artificial language. They say that we're built for natural language, and either artificial language is impossible, or it would transform into natural language. I really just don't know where people get that idea. How could we conceive of and build an artificial language, but at the same time be incapable of using it? That seems like a totally bizarre idea. Maybe I don't understand it or something.

Replies from: Vladimir_M
comment by Vladimir_M · 2011-04-13T03:56:04.961Z · LW(p) · GW(p)

If you plan to construct a language akin to programming languages or mathematical formulas, i.e. one that is fully specified by a formal grammar and requires slow and painstaking effort for humans to write or decode, then yes, clearly you can freeze it as an unchangeable standard. (Though of course, devising such a language that is capable of expressing something more general is a Herculean task, which I frankly don't consider feasible given the present state of knowledge.)

On the other hand, if you're constructing a language that will be spoken by humans fluently and easily, there is no way you can prevent it from changing in all sorts of unpredictable ways. For example, you write:

People mess with the English sound system only to make it flow better, and there's no reason why you couldn't just make an artificial language which already flows well enough.

However, there are thousands of human languages, which have all been changing their pronunciation for (at least) tens of thousands of years in all kinds of ways, and they keep changing as we speak. If such a happy fixed point existed, don't you think that some of them would have already hit it by now? The exact mechanisms of phonetic change are still unclear, but a whole mountain of evidence indicates that it's an inevitable process. The same could be said about syntax, and pretty much any other aspect of grammar.

Look at it this way: the fundamental question is whether your artificial language will use the capabilities of the human natural language hardware. If yes, then it will have to change to be compatible with this hardware, and will subsequently share all the essential properties of natural languages (which are by definition those that are compatible with this hardware, and whose subset happens to be spoken around the world). If not, then you'll get a formalism that must be handled by the general computational circuits in the human brain, which means that its use will be very slow, difficult, and error-prone for humans, just like with programming languages and math formulas.

Replies from: Ian_Ryan
comment by Ian_Ryan · 2011-04-13T04:17:24.469Z · LW(p) · GW(p)

However, there are thousands of human languages, which have all been changing their pronunciation for (at least) tens of thousands of years in all kinds of ways, and they keep changing as we speak. If such a happy fixed point existed, don't you think that some of them would have already hit it by now?

No, I don't. Evolution is always a hack of what came before it, whereas scrapping the whole thing and starting from scratch doesn't suffer from that problem. I don't need to hack an existing structure; I can build exactly what I want right now.

Here's an excellent example of this general point: Self-segregating morphology. That's the language construction term for a sound system where the divisions between all the components (sentences, prefixes, roots, suffixes, and so on) are immediately obvious and unambiguous. Without understanding anything about the speech, you know the syntactical structure.

That's a pretty cool feature, right? It's easy to build that into an artificial language, and it certainly makes everything easier. It would be an important part of having a stable sound system. The words wouldn't interfere with each other, because they would be unambiguously started and terminated within a sound system where the end of every word can run smoothly against the start of any other word. If I were trying to make a stable sound system, the first thing that I would do is make the morphology self-segregating.
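As a rough illustration of what "self-segregating" buys you, here is a toy Python sketch. It assumes an invented rule that is not taken from any real conlang: every word is one or more consonants followed by exactly one vowel, so a word boundary falls after every vowel. Under that assumption a continuous stream of sounds can be cut into words without knowing any vocabulary.

```python
import re
from typing import List

VOWELS = "aeiou"
CONSONANTS = "bcdfghjklmnpqrstvwxyz"

# Toy word shape (an assumption for this sketch): consonant(s) + one vowel.
WORD = re.compile(rf"[{CONSONANTS}]+[{VOWELS}]")


def segment(stream: str) -> List[str]:
    """Split an unspaced stream into words using only the word-shape rule."""
    words = []
    pos = 0
    while pos < len(stream):
        match = WORD.match(stream, pos)
        if match is None:
            # The stream breaks the word-shape rule at this position.
            raise ValueError(f"no valid word starts at position {pos}")
        words.append(match.group())
        pos = match.end()
    return words


words = segment("kastrimo")
```

Because the boundary rule depends only on sound classes, there is exactly one way to segment any valid stream, and any invalid stream is rejected rather than guessed at. Whether speakers would preserve such a rule over time is the open question in this thread, and the sketch says nothing about that.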

But if a self-segregating morphology is such a happy point, why hasn't any natural language come to that point? Well, that should be pretty obvious. No hack could transform a whole language into having a self-segregating morphology. Or at least I don't know of such a hack. But even then, it's trivially easy to make one if you start from scratch! Don't you accept the idea that some things are easier to design than evolve (because perhaps the hacking process doesn't have an obvious way to be useful throughout every step to get to the specific endpoint)?

The exact mechanisms of phonetic change are still unclear, but a whole mountain of evidence indicates that it's an inevitable process.

That whole mountain of evidence concerns natural languages with irregular sound systems. A self-segregating morphology that flows super well would be a whole different animal.

Look at it this way: the fundamental question is whether your artificial language will use the capabilities of the human natural language hardware. If yes, then it will have to change to be compatible with this hardware, and will subsequently share all the essential properties of natural languages (which are by definition those that are compatible with this hardware, and whose subset happens to be spoken around the world). If not, then you'll get a formalism that must be handled by the general computational circuits in the human brain, which means that its use will be very slow, difficult, and error-prone for humans, just like with programming languages and math formulas.

Per my points above, I still don't see why using the capabilities of the natural language hardware would lead to it changing in all sorts of unpredictable ways, especially if it's not used for anything but trying to reproduce your thought in their head, and if it's not used by anybody but a specific group of people with a specific purpose in mind. I still imagine an engine well-built to drive its own evolution in a useful way, and avoid becoming an irregular mess.

Replies from: erratio, Vladimir_M
comment by erratio · 2011-04-13T05:16:04.582Z · LW(p) · GW(p)

Self-segregating morphology. That's the language construction term for a sound system where the divisions between all the components (sentences, prefixes, roots, suffixes, and so on) are immediately obvious and unambiguous. Without understanding anything about the speech, you know the syntactical structure.

Only until phonological changes, morphological erosion, cliticisation, and sundry other processes take place. And whether and how those processes happen isn't related to how well the phonology flows, either, as far as I can tell.

Replies from: Ian_Ryan
comment by Ian_Ryan · 2011-04-13T05:23:47.221Z · LW(p) · GW(p)

The flow thing was just an example. The point was simply to illustrate that we shouldn't reject out of hand the idea that an ordinary artificial language (as opposed to mathematical notation or something) could retain its regularity.

The point is simply that the evolution of the language directly depends on how it starts, which means that you could design it in such a way that it drives its evolution in a useful way. Just because it would evolve doesn't mean that it would lose its regularity. The flow thing is just one example of many. If it flows well, that's simply one thing to not have to worry about.

comment by Vladimir_M · 2011-04-13T07:05:57.814Z · LW(p) · GW(p)

That whole mountain of evidence concerns natural languages with irregular sound systems. A self-segregating morphology that flows super well would be a whole different animal.

How do you know that? To support this claim, you need a model that predicts the actually occurring sound changes in natural languages, and also that sound changes would not occur in a language with self-segregating morphology. Do you have such a model? If you do, I'd be tremendously curious to see it.

Replies from: Ian_Ryan
comment by Ian_Ryan · 2011-04-13T14:24:07.587Z · LW(p) · GW(p)

Sorry, I should have said that it's not necessarily the same animal. The whole mountain of evidence concerns natural languages, right? Do you have any evidence that an artificial language with a self-segregating morphology and a simple sound structure would also go through the same changes?

So I'm not necessarily saying that the changes wouldn't occur; I'm simply saying that we can't reject out of hand the idea that we could build a system where they won't occur, or at least build a system where they would occur in a useful way (rather than a way that would destroy its superior qualities). Where the system starts would determine its evolution; I see no reason why you couldn't control that variable in such a way that it would be a stable system.

comment by Nisan · 2011-04-13T05:07:27.906Z · LW(p) · GW(p)

An artificial visual language. That's one thing I'm working on.

Is it by any chance a nonlinear fully two-dimensional writing system?

Replies from: Ian_Ryan
comment by Ian_Ryan · 2011-04-13T05:46:19.963Z · LW(p) · GW(p)

Thanks for the link. Yeah, that's one of the ideas. It's still in its infancy though, so I don't have anything to show off.

comment by Oscar_Cunningham · 2011-04-12T21:16:55.051Z · LW(p) · GW(p)

Presumably the AI wouldn't work in Lojban on the inside, and presumably it would also be clever enough to understand what we mean when we talk to it in English. It wouldn't automatically have human cultural knowledge, but it would know that this was required to understand us, so it would go and acquire some.

comment by Ian_Ryan · 2011-04-13T00:26:51.075Z · LW(p) · GW(p)

For a few years now, I've been working on a project to build an artificial language. I strongly suspect that the future of the kind of communication that goes on here will belong to an artificial language. English didn't evolve for people like us. For our purpose, it's a cumbersome piece of shit, rife with a bunch of fallacies built directly into its engine. And I assume it's the same way with all the other ones. For us, they're sick to the core.

But I should stress that I don't think the future will belong to any kind of word language (whether natural or artificial). English is a word language, Lojban is a word language, etc. Or at least I don't think the whole future will belong to one. We must free ourselves from the word paradigm. When somebody says "language", most people think words. But why? Why not think pictures? Why not diagrams?