The rational way to name rivers

philgoetz

post by PhilGoetz · 2014-08-06T15:41:06.598Z · LW · GW · Legacy · 42 comments

I just read this in the Wikipedia article on the Mattaponi River and it really tickled me. If only all language were so rational!

The Mat River and the Ta River join in Spotsylvania County to form the Matta River;

The Po River and the Ni River join in Caroline County to form the Poni River;

The Matta River and the Poni River join in Caroline County to form the Mattaponi River.
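The scheme quoted above amounts to plain concatenation at each confluence. A minimal sketch, where the function name `confluence` is made up for illustration and is not taken from the Wikipedia article:

```python
def confluence(left: str, right: str) -> str:
    """Name the downstream river by joining the two tributary names."""
    # Lowercase the second name so the result reads as a single word.
    return left + right.lower()


matta = confluence("Mat", "Ta")
poni = confluence("Po", "Ni")
mattaponi = confluence(matta, poni)

print(matta, poni, mattaponi)
```

Each name is as long as its two tributary names put together, so names grow as you move downstream.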

Comments sorted by top scores.

comment by roystgnr · 2014-08-06T17:19:57.747Z · LW(p) · GW(p)

The trouble is that larger rivers are likely to be referred to more often than smaller rivers, and you want the more-often-used concepts to get shorter names. Connecting river names is still good, but I think "the White Nile and the Blue Nile join to form the Nile" is a better way to do it.

comment by Adele_L · 2014-08-07T03:21:20.513Z · LW(p) · GW(p)

Another really cool language design is Korean hangul. The form of each letter represents how you shape your mouth to pronounce it, among many other nice features.

Replies from: None, IlyaShpitser, Creutzer

↑ comment by [deleted] · 2014-08-08T00:05:25.025Z · LW(p) · GW(p)

English has Shavian.

There's also Deseret, which I've made some tools for, but it's not featural (beyond some isolated cases, like ligatures for some-but-not-all diphthongs) and is somewhat confusing to learn.

Neither of these will be generally usable in the immediate future, since they're both in Unicode's astral planes, and some common piece of the web stack (old versions of MySQL, IIRC) silently fails on encountering astral-plane characters. Font support is another issue, but Deseret is slightly better-supported than Shavian -- my Win8 install came with a font for the former, but not the latter.

(If there's anything after the following colon, LW doesn't have this bug: 𐑄𐐮𐑅 𐐮𐑆 𐐩 𐐻𐐯𐑅𐐻)
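For anyone wondering what "astral plane" means in practice, here is a rough Python sketch. The block ranges are the Unicode blocks for Deseret (U+10400 to U+1044F) and Shavian (U+10450 to U+1047F); the helper names are made up for illustration. The MySQL connection is background from outside this thread and worth double-checking: as far as I know, MySQL's legacy `utf8` character set stores at most three bytes per character, which would explain the failure described above.

```python
# Unicode block ranges for the two scripts mentioned above.
DESERET = range(0x10400, 0x10450)
SHAVIAN = range(0x10450, 0x10480)


def is_astral(ch: str) -> bool:
    """True if the character lies outside the Basic Multilingual Plane."""
    return ord(ch) > 0xFFFF


def utf8_width(ch: str) -> int:
    """Number of bytes the character occupies when encoded as UTF-8."""
    return len(ch.encode("utf-8"))


for name, block in (("Deseret", DESERET), ("Shavian", SHAVIAN)):
    chars = [chr(cp) for cp in block]
    widths = {utf8_width(ch) for ch in chars}
    print(name, "all astral:", all(is_astral(ch) for ch in chars), "UTF-8 widths:", widths)
```

Every code point in both blocks needs four bytes in UTF-8, so any storage layer capped at three bytes per character cannot hold them.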

Replies from: ChristianKl

↑ comment by ChristianKl · 2014-08-08T13:41:33.794Z · LW(p) · GW(p)

If you wanted to use them, you could build a Chrome and Firefox plugin that automatically converts all English text into Deseret. At the same time, you could write a WordPress plugin that automatically offers users under des.domain.name a version of the website in Deseret.

Replies from: None

↑ comment by [deleted] · 2014-08-09T21:56:27.558Z · LW(p) · GW(p)

That would be difficult. Deseret script is phonetic, so you'd have to either look up the pronunciation for each word or eat the imperfection from the ~40% of words that can't be easily predicted.

Deseret script as it's supposed to be used is even harder to automate conversion into than that alone would suggest: you're supposed to write the stressed equivalent of unstressed vowels. So the words "photograph" and "photography", for example, should be 𐑁𐐬𐐻𐐬𐑀𐑉𐐰𐑁 and 𐑁𐐬𐐻𐐪𐑀𐑉𐐰𐑁𐐮 (IPA: foʊ̯toʊ̯græf and foʊ̯tɑgræfɪ, my keyboard transliteration: fo;to;graf and fo;tografi). I don't think this is very common in practice, however -- which is a problem for back-converting Deseret to Latin, since the unstressed schwa can be written either 𐐲 or 𐐮 by people who don't distinguish them.

Also, textspeak is built into it: the name of the letter 𐐒 is 'bee', so the word 'bee' can be written '𐐺'. This can even hold within a word: the Wikipedia page has an example of a coin with the text "𐐐𐐄𐐢𐐆𐐤𐐝 𐐓𐐅 𐐜 𐐢𐐃𐐡𐐔". The first word there is 'holiness', but it's written /hoʊ̯lɪns/ (ho;lins), since the name of the letter 𐐝 is pronounced 'ess'. Usually you see this in the definite article, which is just written 𐑄, but you could also write 'entry', 'zebra', and 'jeep' as '𐑌𐐻𐑉𐐮', '𐑆𐐺𐑉𐐲', and '𐐾𐐹'. (ntrɪ zbrə dʒp / ntri zbru jp -- and 'entry' could also be written with a final -𐐨 instead of -𐐮)

It would be possible to automatically convert Latin to Deseret (or Shavian) and back, but it wouldn't be easy, and it probably couldn't be done quickly enough to have a browser plugin do it.

edit: a Latin -> Deseret converter already exists, but it's crap: can't take more than a few words at a time, returns allcaps, adds semicolons for no reason after some letters, can't handle textspeak even for the definite article, and makes vowel choices that I wouldn't make. (Looks like it writes all unstressed vowels with 𐐆.)

Replies from: ChristianKl

↑ comment by ChristianKl · 2014-08-09T22:18:48.079Z · LW(p) · GW(p)

That would be difficult. Deseret script is phonetic, so you'd have to either look up the pronunciation for each word or eat the imperfection from the ~40% of words that can't be easily predicted.

Yes, you need a phonetic dictionary. eSpeak is a project where people have already dealt with the problem of predicting phonetics. You could start with the values that eSpeak produces and allow users to edit them in some sort of wiki to improve on the eSpeak IPA values.
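A minimal sketch of that idea, assuming a word-level lookup with a layer of user corrections on top of a machine-generated base. The Deseret spellings are the ones given earlier in this thread; the names `BASE`, `OVERRIDES` and `to_deseret` are hypothetical, and the base table is only a stand-in for whatever eSpeak would actually produce (eSpeak itself is not called here).

```python
import re

# Stand-in for machine-generated spellings (hypothetical; not real eSpeak output).
BASE = {
    "the": "𐑄",
    "photograph": "𐑁𐐬𐐻𐐬𐑀𐑉𐐰𐑁",
    "photography": "𐑁𐐬𐐻𐐪𐑀𐑉𐐰𐑁𐐮",
    "entry": "𐑌𐐻𐑉𐐮",
}

# Stand-in for user corrections from a wiki; these win over the base table.
OVERRIDES = {
    "entry": "𐑌𐐻𐑉𐐨",  # the alternative final vowel mentioned above
}

WORD = re.compile(r"[A-Za-z']+")


def to_deseret(text: str) -> str:
    """Replace each known Latin-script word; leave unknown words untouched."""
    def swap(match: re.Match) -> str:
        word = match.group(0).lower()
        return OVERRIDES.get(word) or BASE.get(word) or match.group(0)

    return WORD.sub(swap, text)


print(to_deseret("The photograph of the river"))
```

Unknown words ("of", "river") pass through unchanged, which is the "eat the imperfection" case; a real converter would need a fallback pronunciation guess for them.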

It would be possible to automatically convert Latin to Deseret (or Shavian) and back, but it wouldn't be easy, and it probably couldn't be done quickly enough to have a browser plugin do it.

Local database lookups are very fast; I don't see how speed in a client-side browser plugin would be an issue.

Replies from: None

↑ comment by [deleted] · 2014-08-09T22:54:40.734Z · LW(p) · GW(p)

Local database lookups are very fast; I don't see how speed in a client-side browser plugin would be an issue.

Fast enough that you can do a few hundred of them per page? (Not rhetorical; I don't know.)

Textspeak substitution wouldn't actually be a problem; I don't know why I thought otherwise. And back-conversion to Latin would just require brute-forcing words that don't show up in the dictionary.

Replies from: ChristianKl

↑ comment by ChristianKl · 2014-08-10T00:05:42.678Z · LW(p) · GW(p)

Fast enough that you can do a few hundred of them per page? (Not rhetorical; I don't know.)

Yes, select queries don't take much time when you have an index. Thank Moore's law.
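A quick way to check this for yourself, sketched in Python with an in-memory SQLite database standing in for whatever storage a browser plugin would really use. The table name `lexicon`, the synthetic rows, and the choice of 50,000 entries and 300 lookups are all made up for illustration.

```python
import sqlite3
import time

conn = sqlite3.connect(":memory:")
# PRIMARY KEY on `word` makes SQLite build an index for that column.
conn.execute("CREATE TABLE lexicon (word TEXT PRIMARY KEY, deseret TEXT)")
conn.executemany(
    "INSERT INTO lexicon VALUES (?, ?)",
    ((f"word{i}", f"spelling{i}") for i in range(50_000)),
)

# Pretend a page contains a few hundred distinct words to look up.
page_words = [f"word{i * 101}" for i in range(300)]

start = time.perf_counter()
found = [
    conn.execute("SELECT deseret FROM lexicon WHERE word = ?", (w,)).fetchone()[0]
    for w in page_words
]
elapsed_ms = (time.perf_counter() - start) * 1000

print(f"{len(found)} indexed lookups took {elapsed_ms:.2f} ms")
```

The timing depends on the machine, so no figure is quoted here; the point is only that each lookup goes through the index instead of scanning the whole table.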

↑ comment by IlyaShpitser · 2014-08-07T19:47:06.739Z · LW(p) · GW(p)

Yes, Hangul is our Marain.

Replies from: None

↑ comment by [deleted] · 2014-08-07T19:49:58.712Z · LW(p) · GW(p)

I'd back this if it included hanja on the side.

↑ comment by Creutzer · 2014-08-07T19:32:02.989Z · LW(p) · GW(p)

By the way, Tolkien's Tengwar share with hangul at least the encoding of phonological features.

comment by Dan_Moore · 2014-08-07T19:12:39.401Z · LW(p) · GW(p)

It seems clear that the name Mattaponi existed first, and that, since the four feeder rivers are close together, its syllables were assigned to the four streams from south to north. The Matta (and especially Poni) Rivers look pretty short on the map.

comment by garethrees · 2014-08-09T16:51:45.951Z · LW(p) · GW(p)

This proposal seems like it would run aground on the actual complexity and changeability of river systems. The River Great Ouse, to take an example that's local to me, runs in four channels between Earith and its outflow at Kings Lynn (the Old and New Bedford Rivers, the Great Ouse proper, and an unnamed flood relief channel). But this is a relatively recent configuration: the Great Ouse formerly turned west at Littleport (rather than north as at present), reaching a confluence with the River Nene before flowing into the Wash at Wisbech, while the Little Ouse flowed north to Kings Lynn.

comment by PhilGoetz · 2014-08-09T18:29:29.091Z · LW(p) · GW(p)

When I studied linguistics in grad school, I was taught that Japanese and Navajo have noun suffixes that indicate the physical shape of an object, e.g. "long narrow tube", "flat, paper-like", etc.

Replies from: garethrees

↑ comment by garethrees · 2014-08-12T16:13:47.175Z · LW(p) · GW(p)

In Japanese, these aren't noun suffixes but number suffixes, known as counters or classifiers. You don't say, "*ninjin ga san" [three carrots], but rather, "ninjin ga sanbon" [three-cylinder-shaped carrots].

Mass nouns behave in a similar way in English: you don't say "*three breads", but rather, "three loaves of bread". Also, "head of cattle", "slices of toast", "sheets of paper", "items of cutlery", etc.

In Navajo, the classifiers are verb stems.

comment by ChristianKl · 2014-08-06T16:09:06.155Z · LW(p) · GW(p)

If you want that kind of language, learn Esperanto. The problem is that it takes up much more space than English. You need more letters to express the same idea.

Replies from: garabik, Emile, Gunnar_Zarncke, None

↑ comment by garabik · 2014-08-07T08:02:05.694Z · LW(p) · GW(p)

learn Esperanto. The problem is that it takes up much more space than English

That's not how Esperanto works - it is not a philosophical language. While in theory it is 100% agglutinative, it is not used in that way, and the word-building affixes serve more as a mnemonic device when learning the language (and it is very cleverly designed and very helpful).

As for the size, Esperanto is not longer or shorter than any particular other language - it is English that is somewhat shorter than the others, due to many mono- and disyllabic words. Also consider the fact that translations are usually longer than originals.

↑ comment by Emile · 2014-08-06T16:20:59.837Z · LW(p) · GW(p)

Tio ne estas pligrandan problemon. Pligranda problemo estas ke malmulta homoj parolas Esperanto. Mi de longa tempo ne uzis tion.

Replies from: Alejandro1, bbleeker, ChristianKl, Creutzer, jaime2000

↑ comment by Alejandro1 · 2014-08-06T16:29:54.405Z · LW(p) · GW(p)

On the other hand, it is kind of awesome that people with no knowledge of Esperanto but knowledge of two or three European languages can immediately understand everything you say--as I just did.

Replies from: Emile, philh

↑ comment by Emile · 2014-08-07T07:39:38.873Z · LW(p) · GW(p)

Agreed, tho my sentence is probably easier than average because I haven't used Esperanto for years now, so I'm much more likely to remember vocabulary similar to languages I know.

Knowing some of a Latin language and a Germanic one, plus knowledge of basic syntax (nouns end in -o, adjectives in -a, verbs in -is/-as/-os (past/present/future), adverbs in -e, plural is -j, accusative has an extra -n) is enough for understanding a lot of simple content.

↑ comment by philh · 2014-08-06T17:15:05.913Z · LW(p) · GW(p)

And this is 'knowledge of' in a very loose sense - I don't know any European languages except English, and I could still work it out. (I did take 'parolas' from French 'parler'.)

Replies from: garabik

↑ comment by garabik · 2014-08-08T12:59:15.958Z · LW(p) · GW(p)

You'd like Interlingua then:

Iste non es le problema maxime. Le problema major es que paucos homines parla Esperanto. Io non lo usa desde longe tempore.

(caveat: I do not speak Interlingua at all. This is just what I managed to put together from a grammar handbook and a dictionary)

Had Zamenhof created his language to be more like Interlingua, we might be using it in international communication today. Compared with Esperanto, it's easier for Romance speakers, but correspondingly more difficult for the others.

↑ comment by Sabiola (bbleeker) · 2014-08-07T13:34:30.317Z · LW(p) · GW(p)

I don't think that is wholly correct. I'd have written:

Tio ne estas la plej granda problemo. Pli granda problemo estas ke malmultaj homoj parolas Esperanton. Mi de longa tempo ne uzas ĝin.

Sorry for nitpicking; I'd have said nothing (or maybe just in a PM), but since others have commented on the construction in the first sentence...

Replies from: Emile

↑ comment by Emile · 2014-08-07T20:23:38.158Z · LW(p) · GW(p)

Thanks for the correction, it's helpful! I wrote that in a hurry (pomodoro break at work), I wanted to add "there are probably plenty of grammatical mistakes in all this" but I didn't even remember how to say "mistake" in Esperanto :)

↑ comment by ChristianKl · 2014-08-06T22:22:28.644Z · LW(p) · GW(p)

There are people who write their own notes in shorthand regardless of whether someone else will be able to read them, because shorthand is more efficient.

I think it's possible for a whole language to be simply nicer than the established languages so that it would encourage you to write your own notes to yourself in that language.

Esperanto is relatively easy to learn but it doesn't perform better in some niche in a way that would encourage people to use the language for that niche.

↑ comment by Creutzer · 2014-08-06T19:32:34.376Z · LW(p) · GW(p)

Tio ne estas pligrandan problemon.

Construing a copula with accusative case is a very curious and interesting mistake!

Replies from: garabik

↑ comment by garabik · 2014-08-07T08:02:51.762Z · LW(p) · GW(p)

Construing a copula with accusative case is a very curious and interesting mistake!

It's a kind of Slavic-like construction.

Replies from: Creutzer

↑ comment by Creutzer · 2014-08-07T15:52:43.031Z · LW(p) · GW(p)

Wait, this is a thing in Esperanto? That wasn't a performance error on Emile's part, but they actually use accusative case predicatively like Slavic languages do instrumental case? That would be pretty bizarre.

Replies from: arundelo

↑ comment by arundelo · 2014-08-07T16:06:03.953Z · LW(p) · GW(p)

It's a performance error; the predicate should be nominative.

English pronoun cases don't divide up the same way Esperanto cases do (e.g., prepositions take the object case), but note that many English speakers say, "It is me" rather than "It is I". (I don't know Emile's first language.)

Also, leaving off the accusative ending is such a pitfall for most beginners at Esperanto that people sometimes overcorrect anything matching the pattern "nominative verb nominative" to "nominative verb accusative".

Edit: Corrected "pronouns take" to "prepositions take".

Replies from: None, Emile, Creutzer

↑ comment by [deleted] · 2014-08-08T00:13:15.206Z · LW(p) · GW(p)

English pronoun cases don't divide up the same way Esperanto cases do (e.g., prepositions take the object case), but note that many English speakers say, "It is me" rather than "It is I". (I don't know Emile's first language.)

Right. I forget the technical terms used in the case of English (they aren't usually called nominative vs. accusative anymore), but the default case is 'me': 'I' is the special case, used only in subject position.

(The default case in English is actually descended from the Old English dative, not the accusative, with the exception of 'it' (OE nom. 'hit', acc. 'hit', dat. and gen. 'him' and 'his' just like the masculine pronoun) though the two merged for most of the pronouns in the OE period: it's only obvious from 'him' (OE nom. 'he', acc. 'hine', dat. 'him') and the pronoun 'they', which was borrowed from Old Norse (nom. 'þeir', acc. 'þá', dat. 'þeim').)

↑ comment by Emile · 2014-08-07T20:27:03.702Z · LW(p) · GW(p)

(I don't know Emile's first language.)

French, and despite liking learning languages, I'm not that good at reasoning abstractly about grammatical rules; "accusative" and "nominative" are not very salient concepts in my mind, and I have to look them up to be sure of what they mean exactly.

Replies from: arundelo

↑ comment by arundelo · 2014-08-08T00:04:52.499Z · LW(p) · GW(p)

I am lucky in that reasoning abstractly about grammatical rules is a good fit for the way my mind works; even so, I only got good at it after I learned a second language.

↑ comment by Creutzer · 2014-08-07T19:04:44.716Z · LW(p) · GW(p)

Makes sense, thanks for providing the explanation I didn't think of!

↑ comment by jaime2000 · 2014-08-06T16:47:05.186Z · LW(p) · GW(p)

What motivated you to learn Esperanto in the first place?

Replies from: Emile

↑ comment by Emile · 2014-08-07T07:30:35.908Z · LW(p) · GW(p)

I like learning languages in general, and Esperanto looked interesting and easy.

Replies from: Gunnar_Zarncke

↑ comment by Gunnar_Zarncke · 2014-08-07T21:37:05.713Z · LW(p) · GW(p)

Then E-minimal may be interesting for you. I created an Anki deck for it.

↑ comment by Gunnar_Zarncke · 2014-08-07T21:14:09.264Z · LW(p) · GW(p)

Or even better: learn E-minimal. It has fewer than 300 words (word stems) in total. But, indeed, common concepts require quite long expressions.