Similarity Clusters

post by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2008-02-06T03:34:22.000Z · LW · GW · Legacy · 5 comments

Once upon a time, the philosophers of Plato's Academy claimed that the best definition of human was a "featherless biped".  Diogenes of Sinope, also called Diogenes the Cynic, is said to have promptly exhibited a plucked chicken and declared "Here is Plato's man."  The Platonists promptly changed their definition to "a featherless biped with broad nails".

No dictionary, no encyclopedia, has ever listed all the things that humans have in common.  We have red blood, five fingers on each of two hands, bony skulls, 23 pairs of chromosomes, but the same might be said of other animal species.  We make complex tools to make complex tools, we use syntactical combinatorial language, we harness critical fission reactions as a source of energy: these things may serve to single out only humans, but not all humans; many of us have never built a fission reactor.  With the right set of necessary-and-sufficient gene sequences you could single out all humans, and only humans (at least for now), but it would still be far from all that humans have in common.

But so long as you don't happen to be near a plucked chicken, saying "Look for featherless bipeds" may serve to pick out a few dozen of the particular things that are humans, as opposed to houses, vases, sandwiches, cats, colors, or mathematical theorems.

Once the definition "featherless biped" has been bound to some particular featherless bipeds, you can look over the group, and begin harvesting some of the other characteristics, beyond mere featherfree twolegginess, that the "featherless bipeds" seem to share.  The particular featherless bipeds that you see also seem to build complex tools, speak combinatorial language with syntax, bleed red blood if poked, and die when they drink hemlock.

Thus the category "human" grows richer, and adds more and more characteristics; and when Diogenes finally presents his plucked chicken, we are not fooled:  This plucked chicken is obviously not similar to the other "featherless bipeds".

(If Aristotelian logic were a good model of human psychology, the Platonists would have looked at the plucked chicken and said, "Yes, that's a human; what's your point?")

If the first featherless biped you see is a plucked chicken, then you may end up thinking that the verbal label "human" denotes a plucked chicken; so I can modify my treasure map to point to "featherless bipeds with broad nails", and if I am wise, go on to say, "See Diogenes over there?  That's a human, and I'm a human, and you're a human; and that chimpanzee is not a human, though fairly close."

The initial clue only has to lead the user to the similarity cluster—the group of things that have many characteristics in common.  After that, the initial clue has served its purpose, and I can go on to convey the new information "humans are currently mortal", or whatever else I want to say about us featherless bipeds.
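The procedure so far (bind the clue to a few examples, harvest what those examples share, then judge a newcomer by resemblance) can be put in a few lines of Python. This is only a toy sketch: every trait list below is invented for illustration, the names bind_clue, harvest_shared_traits, and similarity are hypothetical, and taking the strict intersection of traits is a simplifying assumption, since real members of a cluster need not share every characteristic.

```python
# Toy sketch only: all traits and names below are illustrative assumptions.

CLUE = frozenset({"featherless", "biped"})

# Characteristics the example humans happen to share (invented for this sketch).
HUMAN_CORE = {
    "featherless", "biped", "uses language",
    "builds tools", "red blood", "broad nails",
}

# Things encountered before Diogenes shows up with his chicken.
SEEN = {
    "Diogenes": HUMAN_CORE | {"lives in a jar"},
    "Plato": HUMAN_CORE | {"runs an academy"},
    "Socrates": HUMAN_CORE | {"asks questions"},
    "cat": {"red blood", "quadruped", "fur"},
    "vase": {"ceramic"},
}

PLUCKED_CHICKEN = {"featherless", "biped", "red blood", "beak"}


def bind_clue(things, clue):
    """Return the things that satisfy every trait in the initial clue."""
    return {name: traits for name, traits in things.items() if clue <= traits}


def harvest_shared_traits(members):
    """Collect the traits common to all members picked out by the clue."""
    return set.intersection(*(set(traits) for traits in members.values()))


def similarity(traits, cluster_traits):
    """Fraction of the cluster's shared traits that a candidate also has."""
    return len(set(traits) & cluster_traits) / len(cluster_traits)


members = bind_clue(SEEN, CLUE)            # the clue finds the three humans
cluster = harvest_shared_traits(members)   # the category grows richer
chicken_score = similarity(PLUCKED_CHICKEN, cluster)

print("clue matches chicken:", CLUE <= PLUCKED_CHICKEN)
print("chicken similarity to cluster:", chicken_score)
print("Diogenes similarity to cluster:", similarity(SEEN["Diogenes"], cluster))
```

The plucked chicken passes the two-trait clue but matches only half of the harvested characteristics, while Diogenes matches all of them; the clue got us to the cluster, and the cluster does the rest of the work.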

A dictionary is best thought of, not as a book of Aristotelian class definitions, but a book of hints for matching verbal labels to similarity clusters, or matching labels to properties that are useful in distinguishing similarity clusters.

5 comments

Comments sorted by oldest first, as this post is from before comment nesting was available (around 2009-02-27).

comment by mtraven · 2008-02-06T06:24:05.000Z · LW(p) · GW(p)

A nit: people with polydactyly or trisomies are still human. But that supports the larger point. See also George Lakoff's Women, Fire, and Dangerous Things which develops a prototype model of categorization.

comment by infty · 2009-12-25T23:58:36.285Z · LW(p) · GW(p)

Sorry, newbie here. I assume that once you've fleshed out your similarity cluster, your initial clue may also be happily proven false, right?

I'm thinking of, say, amputees, or, ahem, furries, neither of which would match your clue, but both of which can be recognised (perhaps with some difficulty in the latter case) as humans, once you've gathered enough information to fill out your cluster.

Replies from: taryneast
comment by taryneast · 2010-12-09T10:35:25.845Z · LW(p) · GW(p)

Also a newbie... but I'd gather that each of the "sign-post" characteristics strongly increases the probability that the subject is a human. So, if you look for "things that are bipedal and featherless", you have a strong likelihood of finding a human. That doesn't necessarily mean that a thing without that characteristic isn't human, i.e. A->Human doesn't mean that ~A->~Human, though if you find ~A then Human has lowered probability. I reckon you can probably sum across the cluster, and as long as the thing has a good percentage of the signalling characteristics, you'd have a high chance of Human.

As to furries and plucked chickens: I'd assume that a temporary-characteristic shouldn't be taken as a permanent characteristic. E.g. a plucked chicken is temporarily un-feathered (and not by its own choice either!). Normally (for that chicken) it is feathered... the opposite is the case for furries ;)

Replies from: DanielLC
comment by DanielLC · 2012-01-10T05:49:15.963Z · LW(p) · GW(p)

I'd assume that a temporary-characteristic shouldn't be taken as a permanent characteristic.

It was never specified that humans are something that stay human. Once you notice that, you'd start using "was a human" and "will be a human" as criteria. The plucked chicken clearly wasn't a human before, and will most likely go into another obviously non-human state soon, so it doesn't fit well as a human. The homo sapiens with the feathered suit used to be an obvious example, and soon will be again, so it fits well as a human.

comment by blessedArtillery · 2020-05-22T19:21:45.819Z · LW(p) · GW(p)

It is strange to see a common machine learning algorithm explained philosophically rather than mathematically.