So you’ve heard about how fish aren’t a monophyletic group? You’ve heard about carcinization, the process by which ocean arthropods convergently evolve into crabs? You say you get it now? Sit down. Sit down. Shut up. Listen. You don’t know nothing yet.
“Trees” are not a coherent phylogenetic category. On the evolutionary tree of plants, trees are regularly interspersed with things that are absolutely, 100% not trees. This means that, for instance, either:
The common ancestor of a maple and a mulberry tree was not a tree.
The common ancestor of a stinging nettle and a strawberry plant was a tree.
And this is true for most trees or non-trees that you can think of.
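The either/or above is really a statement about monophyly: a group is a clade only if it contains an ancestor and all of its descendants. As a rough sketch, here is a toy check in Python. The four-leaf topology is a deliberately simplified assumption (maple on one branch; mulberry, nettle, and strawberry together on the other), and the names `TOY_TREE`, `IS_TREE`, and `is_monophyletic` are made up for illustration.

```python
# Toy phylogeny as nested tuples; leaves are strings.
# Assumption: a simplified four-leaf topology for illustration, not a real dataset.
TOY_TREE = ("maple", (("mulberry", "stinging nettle"), "strawberry"))

# Which leaves we would call "a tree" in everyday speech.
IS_TREE = {
    "maple": True,
    "mulberry": True,
    "stinging nettle": False,
    "strawberry": False,
}

def leaves(node):
    """Return the set of leaf names under a node."""
    if isinstance(node, str):
        return {node}
    result = set()
    for child in node:
        result |= leaves(child)
    return result

def clades(node):
    """Yield the leaf set of every subtree, including single leaves."""
    yield frozenset(leaves(node))
    if not isinstance(node, str):
        for child in node:
            yield from clades(child)

def is_monophyletic(group, tree):
    """A group is monophyletic if it is exactly the leaf set of some subtree."""
    return frozenset(group) in set(clades(tree))

tree_shaped = {name for name, flag in IS_TREE.items() if flag}
print(is_monophyletic(tree_shaped, TOY_TREE))                     # False
print(is_monophyletic({"mulberry", "stinging nettle"}, TOY_TREE))  # True
```

In this toy topology the smallest clade containing both maple and mulberry also contains the nettle and the strawberry, which is exactly why one of the two statements above has to hold.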
I thought I had a pretty good guess at this, but the situation is far worse than I could have imagined.
Why do trees keep happening?
First, what is a tree? It’s a big long-lived self-supporting plant with leaves and wood.
Also of interest to us are the non-tree “woody plants”, like lianas (thick woody vines) and shrubs. They’re not trees, but at least to me, it’s relatively apparent how a tree could evolve into a shrub, or vice-versa. The confusing part is a tree evolving into a dandelion. (Or vice-versa.)
Wood, as you may have guessed by now, is also not a clear phyletic category. But it’s a reasonable category – a lignin-dense structure that usually grows from the exterior and that forms a pretty readily identifiable material when separated from the tree. (…Okay, not the most explainable, but you know wood? You know when you hold something in your hand, and it’s made of wood, and you can tell that? Yeah, that thing.)
All plants have lignin and cellulose as structural elements – wood is plant matter that is dense with both of these.
Botanists don’t seem to think it only could have gone one way – for instance, the common ancestor of flowering plants is theorized to have been woody. But we also have pretty clear evidence of recent evolution of woodiness – say, a new plant arrives on a relatively barren island, and some of the offspring of that plant become treelike. Of plants native to the Canary Islands, wood independently evolved at least 38 times!
One relevant factor is that all woody plants do, in a sense, begin life as herbaceous plants – by and large, a tree sprout shares a lot of properties with any herbaceous plant. Indeed, botanists call this kind of fleshy, soft growth from the center that elongates a plant “primary growth”, and the later growth toward the outside which causes a plant to thicken is “secondary growth.” In a woody plant, secondary growth also means growing wood and bark – but other plants sometimes do secondary growth as well, like potatoes (in roots).
This paper addresses the question. I don’t understand a lot of the finer genetic details, but my impression of its thesis is this: analysis of convergently-evolved woody plants shows that the genes for secondary woody growth are similar to those for primary growth in plants that don’t do any secondary growth – even in unrelated plants. And woody growth is an adaptation of secondary growth. To abstract a little more, there is a common and useful structure in herbaceous plants that, when slightly tweaked, “dendronizes” them into woody plants.
Dendronization – Evolving into a tree-like morphology. (In the style of “carcinization”.) From ‘dendro’, the ancient Greek root for tree.
Can this be tested? Yep – knock out a couple of genes that control flower development and change the light levels to mimic summer, and researchers found that Arabidopsis – rock cress, a distinctly herbaceous plant used as a model organism – grows a woody stem never otherwise seen in the species.
So not only can wood develop relatively easily in an herbaceous plant, it can come from messing with some of the genes that regulate annual behavior – an herby plant’s usual lifecycle of reproducing in warm weather, dying off in cool weather. So that gets us two properties of trees at once: woodiness, and being long-lived. It’s still a far cry from turning a plant into a tree, but also, it’s really not that far.
“Obviously, in the search for which genes make a tree versus a herbaceous plant, it would be folly to look for genes present in poplar and absent in Arabidopsis. More likely, tree forms reflect differences in expression of a similar suite of genes to those found in herbaceous relatives.”
So: There are no unique “tree” genes. It’s just a different expression of genes that plants already use. Analogously, you can make a cake with flour, sugar, eggs, butter, and vanilla. You can also make frosting with sugar, butter, and vanilla – a subset of the ingredients you already have, but in different ratios and put to a different use.
But again, the reverse also happens – a tree needs to do both primary and secondary growth, so it’s relatively easy for a tree lineage to drop the “secondary” growth stage and remain an herb for its whole lifespan, thus “poaizing.” As stated above, it’s hypothesized that the earliest angiosperms were woody, and some of their descendants would have lost that in becoming the most familiar herbaceous plants today. There are also some plants like cassytha and mistletoe, herbaceous plants from tree-heavy lineages, which are both parasitic plants that grow on a host tree. Knowing absolutely nothing about the evolution of these lineages, I think it’s reasonable to speculate that they each came from a tree-like ancestor but poaized to become parasites. (Evolution is very fond of parasites.)
Poaization: Evolving into an herbaceous morphology. From ‘poai’, ancient Greek term from Theophrastus defining herbaceous plants (“Theophrastus on Herbals and Herbal Remedies”).
(I apologize to anyone I’ve ever complained to about jargon proliferation in rationalist-diaspora blog posts.)
The trend of staying in an earlier stage of development is also called neotenizing. Axolotls are an example in animals – they resemble the juvenile stages of the closely-related tiger salamander. Did you know very rarely, or when exposed to hormone-affecting substances, axolotls “grow up” into something that looks a lot like a tiger salamander? Not unlike the gene-altered Arabidopsis.
Does this mean anything?
A friend asked why I was so interested in this finding about trees evolving convergently. To me, it’s that a tree is such a familiar, everyday thing. You know birds? Imagine if actually there were amphibian birds and mammal birds and insect birds flying all around, and they all looked pretty much the same – feathers, beaks, little claw feet, the lot. You had to be a real bird expert to be able to tell an insect bird from a mammal bird. Also, most people don’t know that there isn’t just one kind of “bird”. That’s what’s going on with trees.
I was also interested in culinary applications of this knowledge. You know people who get all excited about “don’t you know a tomato is a fruit?” or “a blueberry isn’t really a berry?” I was one once, it’s okay. Listen, forget all of that.
There is a kind of botanical definition of a fruit and a berry, talking about which parts of common plant anatomy and reproduction the structure in question is derived from, but they’re definitely not related to the culinary or common understandings. (An apple, arguably the most central fruit of all to many people, is not truly a botanical fruit either).
Let me be very clear here – mostly, this is not what biologists like to say. When we say a bird is a dinosaur, we mean that a bird and a T. rex share a common ancestor that had recognizably dinosaur-ish properties, and that we can generally point to some of those properties in the bird as well – feathers, bone structure, whatever. You can analogize this to similar statements you may have heard – “a whale is a mammal”, “a spider is not an insect”, “a hyena is a feliform”…
But this is not what’s happening with fruit. Most “fruits” or “berries” are not descended from a common “fruit” or “berry” ancestor. Citrus fruits are all derived from a common fruit, and so are apples and pears, and plums and apricots – but an apple and an orange, or a fig and a peach, do not share a fruit ancestor.
Instead of trying to get uppity about this, may I recommend the following:
Acknowledge that all of our categories are weird and a little arbitrary
Send a fruit basket to your local botanist/plant evolutionary biologist for putting up with this, or become one yourself
Some more interesting findings:
A mulberry is not related to a blackberry. They just… both did that.
Avocado and cinnamon are from fairly closely-related tree species.
It’s possible that the last common ancestor between an apple and a peach was not even a tree.
Of special interest to my Pacific Northwest readers, the Seattle neighborhood of Magnolia is misnamed after the local madrona tree, which Europeans confused with the (similar-looking) magnolia. In reality, these two species are only very distantly related. (You can find them both on the chart to see exactly how far apart they are.)
None of [cactuses, aloe vera, jade plants, snake plants, and the succulent I grew up knowing as “hens and chicks”] are related to each other.
Rubus is the genus that contains raspberries, blackberries, dewberries, salmonberries… that kind of thing. (Remember, a genus is the category just above a species – which is kind of a made-up distinction, but suffice to say, this is a closely-related group of plants.) Some of its members have 14 chromosomes. Some of its members have 98 chromosomes.
Seriously, I’m going to hand $20 in cash to the next plant taxonomy expert I meet in person. God knows bacteriologists and zoologists don’t have to deal with this.
And I have one more unanswered question. There doesn’t seem to be a strong trend of plants evolving into grasses, despite the fact that grasses are quite successful and seem kind of like the most anatomically simple plant there could be – root, big leaf, little flower, you’re good to go. But most grass-like plants are in the same group. Why don’t more plants evolve towards the “grass” strategy?
Let’s get personal for a moment. One of my philosophical takeaways from this project is, of course, “convergent evolution is a hell of a drug.” A second is something like “taxonomy is not automatically a great category for regular usage.” Phylogenetics are absolutely fascinating, and I do wish people understood them better, and probably “there’s no such thing as a fish” is a good meme to have around because most people do not realize that they’re genetically closer to a tuna than a tuna is to a shark – and “no such thing as a fish” invites that inquiry.
(You can, at least, say that a tree is a strategy. Wood is a strategy. Fruit is a strategy. A fish is also a strategy.)
At the same time, I have this vision in my mind of a clever person who takes this meandering essay of mine and goes around saying “did you know there’s no such thing as wood?” And they’d be kind of right.
But at the same time, insisting that “wood” is not a useful or comprehensible category would be the most fascinatingly obnoxious rhetorical move. Just the pinnacle of choosing the interestingly abstract over the practical whole. A perfect instance of missing the forest for – uh, the forest for …
Towards the end of writing this piece, I found that actual botanist Dan Ridley-Ellis made a tweet thread about this topic in 2019. See that for more like this from someone who knows what they’re talking about.
Acknowledge that all of our categories are weird and a little arbitrary
That is not the moral! The moral is that the cluster-structure of similarities induced by phylogenetic relatedness exists in a different subspace from the cluster-structure of similarities induced by convergent evolution! (Where the math jargon "subspace" serves as a precise formalization of the idea that things can be similar in some aspects ("dimensions") while simultaneously being different in other aspects.) This shouldn't actually be surprising if you think about what the phrase "convergent evolution" means!
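To make the "subspace" point concrete, here is a small sketch in Python. Every number in it is invented: the first two coordinates of each vector stand in for ancestry and the last two for morphology, and the names `FEATURES`, `ANCESTRY`, and `MORPHOLOGY` are hypothetical. It only illustrates that the nearest neighbour of a plant can change depending on which coordinates you measure distance in.

```python
import math

# Hypothetical feature vectors; all numbers are made up for illustration.
# Coordinates 0-1 stand in for "ancestry", coordinates 2-3 for "morphology".
FEATURES = {
    "mulberry":        (0.0, 0.1, 0.9, 1.0),
    "stinging nettle": (0.0, 0.2, 0.1, 0.0),
    "maple":           (1.0, 0.9, 1.0, 1.0),
}
ANCESTRY = (0, 1)
MORPHOLOGY = (2, 3)

def distance(a, b, dims):
    """Euclidean distance between two plants, restricted to the given coordinates."""
    return math.sqrt(sum((FEATURES[a][i] - FEATURES[b][i]) ** 2 for i in dims))

def nearest(name, dims):
    """Closest other plant when only the given coordinates are compared."""
    others = [other for other in FEATURES if other != name]
    return min(others, key=lambda other: distance(name, other, dims))

print(nearest("mulberry", ANCESTRY))    # stinging nettle
print(nearest("mulberry", MORPHOLOGY))  # maple
```

The same three points cluster one way in the ancestry coordinates and another way in the morphology coordinates, with no contradiction between the two.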
That's a good expanded takeaway of part of it! (Obviously "weird and a little arbitrary" is kind of nebulous, but IME it's a handy heuristic you've neatly formalized in this case.) To be clear, it doesn't sound like we disagree?
On the specific example of trees, John Wentworth recently pointed out that neural networks tend to learn a "tree" concept: a small, local change to the network can add or remove trees from generated images. That kind of correspondence between human and unsupervised (!) machine-learning model concepts is the kind of thing I'd expect to happen if trees "actually exist", rather than trees being weird and a little arbitrary. (Where things are closer to "actually existing" rather than being arbitrary when different humans and other AI architectures end up converging on the same concept in order to compress their predictions.)
(Now I'm wondering if there's some sort of fruitful analogy to be made between convergence of tree concepts in different maps, and convergent evolution in the territory; in some sense, the fact that evolution keeps rediscovering the tree strategy makes them less "arbitrary" than if trees had only been "invented once" and all descended from the same ur-tree ...)
Oh, I think you're over-extrapolating what I meant by arbitrary - like I say toward the end of the essay, trees are definitely a meaningful category. Categories being "a little arbitrary" doesn't mean they're not valuable - is there a clear difference between a tree and a shrub? Maybe, but I don't know what it is if so, and it seems like plausibly not. The fruit example is even clearer - is a grape a berry? Is a pumpkin a fruit? Who cares? Probably lots of people, depending on the context? Most common human categories work like this around the edges if you try and pin them down - hence, a little arbitrary. Seems fine.
I'm standing by "weird." That's definitely weird. I don't think of nature as going in for platonic forms! What's going on here?! Weird as hell.
If you're at all like me, part of that feeling is definitely having not internalized [genes as lego bricks] rather than [genes as fragile tightly coupled organism recipe]. The notion that the Blind Idiot God invented reusable loosely coupled code and is halfway to a functioning package manager is more than a bit of a shocker. And crazier yet, it has had those capabilities long enough that they're fixed in substantially all life on Earth (albeit with serious regressions in animals).
Apparently there are some ideas that are convergent enough that substantially any optimizer finds them eventually.
Or I'm speaking a slightly different dialect of English from you?? As a point of terminology, I think "fuzzy" is a better word than "arbitrary" for this kind of situation, where I agree that, as a human having a casual conversation, my response to "Is a pumpkin a fruit?" is usually going to be something like "Whatever; if it matters in context, I'll ask for more specifics", but as a philosopher of science, I claim that there are definite mathematical laws governing the relationship between what communication signals are sent, and what probabilistic inferences a receiver can make, and the laws permit things like soft k-means clustering, where given some set of data points representing data about plants, the algorithm could say that this-and-such plant has a membership coefficient of 0.34 in the "shrub" cluster and 0.66 in the "tree" cluster, and there would be nothing arbitrary about those numbers as the definite, precise result of what happens when you run this particular clustering algorithm against that particular data. (But the number 0.34 in this blog comment is arbitrary, because I made it up for concreteness while trying to explain what fuzzy clustering is; there's no reason I couldn't have chosen a different coefficient.)
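For readers who have not met soft k-means, a minimal one-dimensional sketch in Python follows. It assumes the common variant in which each point's membership in a cluster is proportional to exp(-beta * squared distance); the comment does not say which variant is meant, so that choice, the stiffness `beta`, the starting centers, and the plant heights are all assumptions made up for illustration.

```python
import math

def memberships(x, centers, beta):
    """Membership coefficients of point x in each cluster; they sum to 1."""
    weights = [math.exp(-beta * (x - c) ** 2) for c in centers]
    total = sum(weights)
    return [w / total for w in weights]

def soft_kmeans(points, centers, beta=0.2, iterations=100):
    """Alternate soft assignment and weighted-mean updates on 1-D data."""
    centers = list(centers)
    for _ in range(iterations):
        resp = [memberships(x, centers, beta) for x in points]
        centers = [
            sum(r[k] * x for r, x in zip(resp, points)) / sum(r[k] for r in resp)
            for k in range(len(centers))
        ]
    return centers

# Made-up plant heights in metres; not real measurements.
HEIGHTS = [1.0, 2.0, 3.0, 5.5, 8.0, 9.0, 10.0]
# Index 0 plays the role of "shrub", index 1 the role of "tree".
CENTERS = soft_kmeans(HEIGHTS, centers=[2.0, 9.0])

# Membership coefficients for a hypothetical 5-metre plant.
shrub, tree = memberships(5.0, CENTERS, beta=0.2)
print(f"shrub={shrub:.2f} tree={tree:.2f}")
```

Given this data, this `beta`, and these starting centers, the two coefficients are fully determined, which is the sense of "nothing arbitrary" used above. The inputs themselves are a separate matter.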
But then when I actually look up "arbitrary" and "fuzzy" on Wiktionary, it seems common usage is not unequivocally on my side: your usage of arbitrary fits with the first part of definition 1 ("Based on individual discretion or judgment"), whereas my usage is centered on the second part of definition 1 ("not based on any objective distinction, perhaps even made at random"), with influence from the mathematician's usage, definition 3 ("Any, out of all that are possible"). And the meaning of fuzzy I want barely even makes the list as a technical reference ("Employing or relating to fuzzy logic") ...
I don't love this thread - your first comment reads like you're correcting me on something or saying I got something important philosophically wrong, and then you just expand on part of what I wrote with fancier language. The actual "correction", if there is one, is down the thread and about a single word used in a minor part of the article, which, by your own findings, I am using in a common way and you are using in an idiosyncratic way. ...It seems like a shoehorn for your pet philosophical stance. (I suppose I do at least appreciate you confining the inevitable "What are Women Really" tie-ins to your own thread, because boy howdy, do I not want that here.)
To be clear, the expansion was in fact good, it's the unsupported framing as a correction. This wouldn't normally bother me enough to remark on, but it's by far the top-rated comment, and you know everyone loves a first-comment correction, so I thought I should put it out there.
You keep saying this (and other roughly-equivalent things) but I think it's just wrong.
If you pick a measure on your concept-space, you can use it to define a notion of entropy, and then you can ask what clusterings permit maximally efficient communication. It's not clear that communication efficiency is the thing we want to maximize, and if you permit approximate transmission of information then you may actually want to minimize something like cost of errors + cost of communication, and for that you need not merely a measure but a metric. Anyway, the point is that all these things require some sort of notion of distance, size, etc., in concept-space.
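A small sketch of the first step, in Python: once a measure over items is fixed, a clustering has a definite label entropy, and changing the measure changes that entropy. The four items, the two weightings, and the two labelings below are illustrative assumptions (the "culinary" labels follow the corn-kernel and rhubarb example used elsewhere in this thread), and `label_entropy` is a hypothetical helper name.

```python
import math

def label_entropy(measure, clustering):
    """Shannon entropy in bits of the cluster label when items are drawn from `measure`."""
    mass = {}
    for item, weight in measure.items():
        label = clustering[item]
        mass[label] = mass.get(label, 0.0) + weight
    total = sum(mass.values())
    return -sum((m / total) * math.log2(m / total) for m in mass.values() if m > 0)

# Two ways of drawing the "fruit" boundary over the same four items.
BOTANICAL = {"apple": "fruit", "corn kernel": "fruit", "rhubarb": "other", "carrot": "other"}
CULINARY = {"apple": "fruit", "rhubarb": "fruit", "corn kernel": "other", "carrot": "other"}

# Two made-up measures: how often each item comes up in conversation.
UNIFORM = {"apple": 0.25, "corn kernel": 0.25, "rhubarb": 0.25, "carrot": 0.25}
DESSERT_HEAVY = {"apple": 0.4, "rhubarb": 0.4, "corn kernel": 0.1, "carrot": 0.1}

for name, measure in [("uniform", UNIFORM), ("dessert-heavy", DESSERT_HEAVY)]:
    print(name, label_entropy(measure, BOTANICAL), label_entropy(measure, CULINARY))
```

Under the uniform measure both labelings carry one bit per item; under the dessert-heavy measure they differ. Neither number exists until someone has picked the measure.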
No such notion is given to us by whatever gods there may be. We get to choose them. Indeed, we must choose them. (We can pretend we aren't doing so, which I think is just the same sort of mistake as pretending that you don't have priors.)
We will, if we are wise, choose them with a view to maximizing whatever things we care about maximizing. This is why botanists have one notion of "fruit" and ordinary people eating things have another. As a philosopher of science (considered in a broad enough sense that it doesn't only concern formalized science done in the academy) you should endorse this, and accept that those definite mathematical laws can lead to different optimal concept-boundaries for different people or communities.
If I and the people I need to talk to about pumpkins spend our days dissecting plants and examining their ancestry and contemplating the way in which they reproduce and so forth, then I will probably do best to use the botanists' definition of "fruit". Even if there weren't other botanists, and existing literature, using that definition, then if I do my job well I will likely end up with something very similar.
If I and the people I need to talk to about pumpkins spend our days roasting them, making jam from them, and so forth, then I will probably do best to use the everyday notion of "fruit". Here there's more fuzziness; different people's everyday notions of "fruit" may be a little different, most people's may have poorly-defined edges, etc., but we can still agree (contra the botanists) that corn kernels are not fruit but rhubarb is.
And there is no reason at all why either version of me should be wrong or why a philosopher of science should feel obliged to tut-tut that one or other is failing to use the word "fruit" optimally. (A philosopher of language might perhaps inquire whether we'd considered using different words for these two concepts.) Of course either, or more likely both, might in fact be using the word suboptimally; but nothing prevents them both being optimal usages for particular purposes, and there are actually-existing purposes for which both are probably reasonably close to optimal.
As for soft k-means, that algorithm doesn't even provide a well-defined answer (there may be different optima, with which one you land on depending where you start), but in any case it presupposes a particular embedding of (part of) concept-space in Euclidean space, and that is another thing that the gods have not chosen to give us.
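The dependence on the starting point is easy to reproduce. The sketch below, in Python, runs a one-dimensional soft k-means (the exp(-beta * squared distance) variant, chosen here as an assumption) on made-up data with three clumps and only two clusters, from two different initializations. The data, `beta`, and starting centers are all invented for illustration.

```python
import math

def soft_kmeans(points, centers, beta=1.0, iterations=100):
    """1-D soft k-means: soft assignment, then weighted-mean update, repeated."""
    centers = list(centers)
    for _ in range(iterations):
        resp = []
        for x in points:
            weights = [math.exp(-beta * (x - c) ** 2) for c in centers]
            total = sum(weights)
            resp.append([w / total for w in weights])
        centers = [
            sum(r[k] * x for r, x in zip(resp, points)) / sum(r[k] for r in resp)
            for k in range(len(centers))
        ]
    return centers

# Made-up data with three clumps, but only two clusters to fit.
POINTS = [0.0, 1.0, 5.0, 6.0, 10.0, 11.0]

run_a = soft_kmeans(POINTS, centers=[0.5, 8.0])   # middle clump joins the upper cluster
run_b = soft_kmeans(POINTS, centers=[3.0, 10.5])  # middle clump joins the lower cluster

print([round(c, 2) for c in run_a])
print([round(c, 2) for c in run_b])
```

Both runs settle, but on different pairs of centers, so the middle clump ends up in a different cluster depending only on where the algorithm started.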
If you're going to condescend to me like this, I think I deserve an answer: did you read the post, yes or no? I know, it's kind of long (just under 10,000 words). But ... if you're going to put in the effort to write 500 words allegedly disproving what I "keep saying", isn't it worth ... actually reading what I say?
[The following is rather long; I'd offer the usual Pascal quotation but actually I'm not sure how much shorter it could actually be. I hope it isn't too tedious to read. It is quite a bit shorter than "Unnatural Categories are Optimized for Deception".]
I don't really understand what in what I wrote you're interpreting as condescension, but for what it's worth none was intended.
No, I don't think I ever read UCAOFD in any detail. The "did you read ...?" seems, on the face of it, to be assuming a principle along the lines of "you should not say that someone is wrong about something unless you have read every word they have written about it", which is not a principle I am willing to endorse; would you care either to argue for that principle or explain what weaker principle you are implicitly appealing to here?
Anyway, I've taken a look at UCAOFD now; yes, in it you say something similar to what I'm saying here, and we are in agreement about many things.
Let me summarize some things I think we agree on:
1. Category boundaries are not entirely arbitrary, in that some choices of boundary are just plain better than others for any reasonable purpose.
2. They are also not entirely forced; certain differences in purposes and priorities can rightly lead to different choices of boundaries.
3. When you use a common term to describe a variety of things, you are implicitly declaring that they resemble each other; how reasonable it is to use that common term for that particular set of things therefore depends on how closely they resemble each other.
4. One way to formalize this is to represent the things by points in some sort of concept-space, and words by regions in that space (or maybe by something fuzzier: e.g., maps from the space to [0,1] saying to what extent a given thing is appropriately described by the word).
5. We can then e.g. try to minimize some combination of distances between things within each region (i.e., try to make the things covered by a given term as like one another as possible), or pick a point for each word and try to minimize some combination of distances from things in the region to that point (i.e., try to make the things covered by a given term as like a particular Representative Thing as possible), or contemplate some pattern of communication about these things and try to minimize some combination of message length and interpretation errors committed by the receiver.
6. There is something distinctively honest about category boundaries chosen to maximize this sort of figure of merit.
(To the best of my knowledge, these are all things we agree on, and they summarize a substantial fraction of what you are saying in pieces like UCAOFD, though of course not necessarily all of it. If I am wrong about either of those things, then 1. please let me know and 2. some of what follows may be less useful than I hope for it to be.)
So far, I hope, so good. Now for some probably more contentious bits. It seems to me that when anyone on LW says anything at all like "category boundaries are a bit arbitrary", you are liable to pounce on them and protest: no no no, category boundaries are chosen to optimize prediction and/or communication, and you shouldn't call them arbitrary, can't you see that there are definite mathematical laws here? I think this is frequently unfair, for reasons I'll elaborate on in a moment. (My protests about choice of metric etc. were, I now agree, probably misdirected; I think there is something motte-and-bailey-ish going on, but I think I misidentified the bailey. I'll say a few words about my error later.)
So, I think the pouncing is frequently unfair because (1) someone saying that category boundaries are a bit arbitrary is not necessarily (or even, I think, usually) meaning anything incompatible with choosing category boundaries to optimize prediction and/or communication, and (2) optimizing prediction and/or communication is not the only goal one can (honestly) have when choosing category boundaries. And (3) if you are going to claim that someone's use of language fails to respect underlying mathematical laws, I think you owe them some sort of argument that some other reasonable alternative respects those laws better, and it seems to me that generally this argument is lacking either in the sense of not being there, or in the sense of not actually making the case well.
I'll argue briefly for those claims in order 2, 1, 3. (Apologies if the order discrepancy makes this harder to follow than it should be.)
2. [Optimizing for prediction/communication is not the only honest goal.] In UCAOFD you concede that there may be goals other than optimizing prediction and/or communication but the only such goals you consider are ones you categorize as "deception": picking boundaries that lead to suboptimal predictions in the hope that others will predict suboptimally in ways that benefit you, or (self-deceivingly) picking boundaries that make predictions you know deep down are wrong, but that make you feel happier. It seems to me you're assuming here that this sort of prediction is the only function of language, and it just isn't. Suppose someone's legal name, given by their parents, is George, but they hate the way that sounds. "Call me Fred", they say. Let's say they're in the UK where there is an official legal procedure for doing this, and that they haven't done it yet. Then there is a sense in which the name Fred is "wrong"; if you call them Fred then other people will draw wrong inferences if for some reason they e.g. have to guess what is on their birth certificate, or what names their parents liked. You may none the less choose to call them Fred because they enjoy being called Fred more than they enjoy being called George. If they ask everyone else to do likewise, after a while it will stop being an "error", but that's not really the point; you aren't calling them Fred to minimize any sort of prediction errors, because prediction isn't what calling someone (in conversation with them) by a given name is for.
You could, I guess, argue that what you're doing is helping them deceive themselves. (They would prefer a world where their name is Fred, so they pretend it is to optimize their internal utility-estimator rather than doing the more difficult thing of changing the world so that their name really is Fred. This is the "wireheading/self-deception" phenomenon you mention in UCAOFD.) But I don't see much merit to this analysis of the situation. They're presumably well aware that their legal name is still George; when you call them Fred they aren't pretending otherwise; they just enjoy being addressed as Fred and prediction is a small enough element of what you're doing when you call them by name that any loss in prediction-accuracy when you do it is outweighed by their utility gain from being called something they think sounds nicer.
Of course this is a case where prediction is especially unimportant and their (un)enjoyment of being addressed a particular way is especially important. Other kinda-parallel cases will readily occur to the reader and I do not mean to claim that what I've said above obliges anyone to use the same policy in those cases. The only point is that prediction is not everything even when you exclude "deception", unless you define deception so broadly as to include literally everything that is not prediction, which I think would itself be highly misleading.
Here's another related situation, which might also be applicable to some of those kinda-parallel cases. Suppose our interlocutor actually thinks his name is Fred even though it's really George. (Perhaps he intended to file the relevant legal documents but forgot to do so and then forgot having forgotten. Perhaps he's suffering from some sort of delusion. Etc.) Then if you are addressing him by name and you want him actually to understand you, you'd better call him Fred.
Here's another, further toward "deception" territory but, I claim, not in it. Suppose you're talking to someone about the big rock in Australia named Uluru, but your interlocutor is an Englishman stuck in the past who insists that it is, and must always be, called Ayers Rock. And suppose you agree with the position I've taken in the previous sentence that Uluru is in some sense the right name for that rock. Unfortunately the Englishman has a short temper and a gun. You may choose to use the term "Ayers Rock" when talking to him, not because you want to deceive him into thinking that that's the right name or that Australia is still an English colony or that the aboriginal Australians weren't there first or anything like that (indeed, he already believes those things); not because you want to help him with his wireheading (he will carry on believing those things whatever term you use); but because you are concerned that if you insist on calling it Uluru he will shoot you. This isn't what you'd call a respectable use of language, exactly, but it's in no way deceptive and it is driven by something other than optimizing prediction or accurate/efficient communication.
And then there are deliberately non-literal uses of language -- poetry, metaphor, etc. Around here we're mostly not much concerned with these (or, at least, not in the discussions that are relevant right now), but it's worth being aware that much actual everyday language use has elements of these -- we choose our words for euphony, memorability, etc., as well as for accuracy. "I did say as a philosopher of science, after all!" Yes, but we both know that some of the applications of these principles that you're concerned with, such as the use of pronouns by philosophers in casual conversation, are not in technical discussions where accuracy is the only goal. (In that particular case there probably isn't much poetry or metaphor going on either; but the point is that we aren't talking only about highly-literal technical discourse.)
So far I've argued that prediction and communication aren't the only goals it's reasonable (and honest) to optimize for. I should also say that in practice we are seldom optimizing-machines, nor should we be because optimization is expensive. So category boundaries are commonly drawn according to (something like) where everyone else seems to be drawing them or where our brains just happen to draw them according to whatever mostly opaque algorithms are going on inside, and while both of these can be approximated as optimizations of a sort (using words the same way as everyone else is kinda-sorta a communication optimization; those mostly opaque algorithms inside our brain are probably doing something a bit like optimizing something) I don't know of any reason to think that they amount to the sort of optimization you're calling for. And, I claim, this is perfectly fine: nothing requires us to optimize all our language, and we don't optimize all our language, so if someone proposes a particular bit of language use then saying "but that isn't optimizing for anything!" is something like an Isolated Demand For Rigour.
1. ["Boundaries are a bit arbitrary" needn't be a repudiation of optimizing for prediction/communication.] Let's suppose that either in general or in a particular case we agree that the only honest thing to aim for is optimal prediction or communication. Then, as I believe we are agreed, there is still quite a lot of possible variation in where those boundaries get drawn; we may have different purposes, different loss functions, etc. (We may be talking about botany or about jam.) Someone who describes this situation by saying "boundaries are a bit arbitrary" is not saying anything false, even if they haven't said the most precise thing they could have said, and I think it is generally unhelpful to jump on them.
3. [Underlying mathematical laws.] Let's take the discussion a couple of comments upthread as an example. Your criticism of eukaryote's use of the word "arbitrary" appealed to the fact that there are "definite mathematical laws", and your concrete example of a definite mathematical law was the fact that one can draw boundaries by picking lots of examples and using soft k-means, in which case "there would be nothing arbitrary about those numbers as the definite, precise result of what happens when you run this particular clustering algorithm against that particular data". But the choice of which particular data is somewhat arbitrary; the choice of how to embed your various plants into euclidean space in order to run the soft k-means algorithm is somewhat arbitrary; the choice of soft k-means rather than some other clustering algorithm is somewhat arbitrary. (It kinda corresponds to trying to minimize distances from the example data to particular representative points for "tree" and "shrub"; but I see no reason at all to think that we should want notions of "tree" and "shrub" defined by single representative points, especially given what eukaryote has shown us about the set of things that are called "trees".) There's tons of arbitrariness here, and it seems to me that mathematics here is functioning more to intimidate than to enlighten.
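A minimal sketch of that point, assuming invented plant measurements and a hand-rolled soft k-means (none of the numbers, names, or function signatures below come from the thread; `PLANTS`, `embed_raw`, `embed_log`, and the stiffness `beta` are all hypothetical choices made for illustration). It runs the same algorithm on the same five plants twice, changing only how height is embedded, and compares the "tree" membership each plant ends up with.

```python
import math

# Hypothetical measurements: (height in metres, woodiness on a 0-1 scale).
# These numbers are invented purely for illustration.
PLANTS = {
    "oak": (20.0, 1.0),
    "maple": (15.0, 1.0),
    "hazel": (4.0, 0.9),
    "blueberry": (1.0, 0.8),
    "lavender": (0.5, 0.6),
}


def embed_raw(height_m, woodiness):
    """One possible embedding: height in raw metres."""
    return (height_m, woodiness)


def embed_log(height_m, woodiness):
    """Another possible embedding: log10 of height."""
    return (math.log10(height_m), woodiness)


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def responsibilities(points, means, beta):
    """Soft assignment of each point to each mean (rows sum to 1)."""
    rows = []
    for p in points:
        logits = [-beta * sq_dist(p, m) for m in means]
        top = max(logits)  # subtract the max for numerical stability
        weights = [math.exp(v - top) for v in logits]
        norm = sum(weights)
        rows.append([w / norm for w in weights])
    return rows


def soft_kmeans(points, init_means, beta, iterations=50):
    """Alternate soft assignments and weighted-mean updates."""
    means = [tuple(m) for m in init_means]
    dims = len(points[0])
    for _ in range(iterations):
        resp = responsibilities(points, means, beta)
        new_means = []
        for k, old in enumerate(means):
            total = sum(r[k] for r in resp)
            if total == 0.0:  # empty cluster: keep the old mean
                new_means.append(old)
                continue
            new_means.append(tuple(
                sum(r[k] * p[d] for r, p in zip(resp, points)) / total
                for d in range(dims)
            ))
        means = new_means
    return means, responsibilities(points, means, beta)


def tree_membership(embed, beta=5.0):
    """Degree of 'tree' membership per plant under a given embedding."""
    names = list(PLANTS)
    points = [embed(*PLANTS[n]) for n in names]
    # Arbitrary initialisation: "tree" starts at oak, "shrub" at lavender.
    init = [embed(*PLANTS["oak"]), embed(*PLANTS["lavender"])]
    _, resp = soft_kmeans(points, init, beta)
    return {n: r[0] for n, r in zip(names, resp)}


if __name__ == "__main__":
    raw = tree_membership(embed_raw)
    logged = tree_membership(embed_log)
    for name in PLANTS:
        print(f"{name:10s} raw={raw[name]:.3f} log={logged[name]:.3f}")
```

With these made-up numbers the hazel lands on the "shrub" side when height is in raw metres and on the "tree" side when height is log-scaled, although nothing about the plant or the algorithm changed. The stiffness `beta`, the initial means, and the choice of which plants to include are further knobs of the same kind.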
To whatever extent I've argued successfully for claims 1,2,3, I think I've justified my claim that your pouncing on people who talk about category boundaries as slightly arbitrary is unfair. But above I made a narrower accusation: that there's something motte-and-bailey-ish about what you're doing. I'm not sure it's exactly motte-and-bailey, but the idea is that the motte is something like "ideally, category boundaries would be drawn so as to optimize some measure of accurate prediction and/or efficient communication" (arguably true for some definition of "ideally") and the bailey is something like "anyone who talks about flexibility in drawing category boundaries, but doesn't specifically insist that they be drawn so as to optimize such a measure, is in a state of sin".
Upthread I implicitly accused you of thinking that there's only ever a single optimal place for a given boundary. I no longer think you think that (even in the sense of having it as bailey). It may already be obvious what (I now think) the source of my error was: it seemed like you were objecting any time anyone endorsed the principle that category boundaries can vary. I no longer think that's quite what's going on, but I do think you're objecting to more than your more nuanced analyses of category boundaries (e.g., in UCAOFD) justify even if what you say therein is 100% correct.
But (1) identifying baileys is an inexact art; perhaps I'm still some way off optimal, and (2) perhaps I'm entirely wrong in thinking that you're motte-and-baileying; perhaps a deep enough understanding of what I've claimed to be the motte actually justifies your criticisms of e.g. eukaryote's use of the phrase "a little arbitrary". If you reckon so, I'd love to understand how.
This is more of a contrast, but this line of thinking could be used to remedy the idea that dolphins are fishes. That is, the branch of the tree of life "fish" is a different concept than "thing that swims to survive".
In this sense "fishes" don't inherently breathe or have gills. A whole lot of properties would probably be "a degree of freedom", while the phylogenetics probably has a lot of "accidental" properties.
My hypothesis after 30 seconds of thinking was that trees evolve independently because height = good for competing for sunlight, while grasses must specialize a ton to 'afford' passing up on the height advantage. So once a grass is established somewhere it might be hard for an up-and-coming-almost-grass species to nudge out of its niche. Maybe this is related?
I could imagine lots of plants getting stuck in a local maximum of fitness where they are still pretty tree-like but would need to simultaneously lose some tree features and gain C4 photosynthesis in order to succeed as grasses, so the gap to jump in adaptation-space is too large.
Height is also useful for reducing impact of fires, herbivores, some parasites, etc.; and gives you substantially better volume-of-airflow-over-leaves which can be helpful - a flat sheet of leaf-material would underperform substantially for respiration, even before considering the variable angle of sunlight for photosynthesis.
With some handwaving, we seem to agree that "the absence of trees becoming grass-like indicates that there's no nice/large path in evolution-trajectory-space which is continuously competitive" and I'm gesturing towards the known-to-be-difficult C3/C4 distinction as a potentially-relevant feature of that space.
Note that while our non-expert speculation might turn up interesting relevant considerations, the space is very complicated and high-dimensional, and I at least have very little data or subject matter expertise. I therefore expect my analysis to be wrong, though I do enjoy and learn from doing it.
First off, thanks for the reminder that thingspace can map very differently depending on which dimensions you choose to filter on! It's difficult to really grok that idea in a sufficiently general way, I've noticed, and I feel like this was much more surprising than it should have been. I think reframing "tree" and "fish" as strategies may end up being an important takeaway.
Question: Apples not (botanically) a fruit how? Are they not the seed-bearing mature ovum of a plant? I feel like I missed something there.
Accessibility note: I totally might have started with the same colors you chose for your tree diagram! To my eye, they scream "woody thing" and "leafy thing" and "something like both". But also, the yellow and the brown are nearly indistinguishable on my monitor with the blue-reducer turned on, and all three hues sit in the part of color-space that gets kinda muddy to folks with certain kinds of reduced-color vision. Possible adjustments: you could add a shape component to each node (e.g. rounded corners, lozenge, square corners, hexagons), use different border styles (e.g. thin, thick, dotted, double-lined), and/or choose colors with very different values if you want to keep those (admittedly information-rich) hues (e.g. pastel green (maybe with dark text), walnut brown (maybe with white text), mossy green (maybe the text has a border to make it stand out)). The goal is to be able to distinguish the differences easily in a grayscale rendering of the image.
Thank you so much! Re: question: Well, they're not "normal" fruits, at least - they're accessory fruits. I don't know much else about the botanical definitions other than that.
Also, the accessibility point is very much appreciated. I've updated the graphic to take that into account - would love your thoughts on the improved one? Either way, I very much appreciate both the raising-the-issue and the suggestions on improvements!
Not a "real" fruit because the flesh is a product of some tissue adjacent to the ovum instead of within it. That sounds oddly nit-picky to me, even for scientists. Do you think this might be an important distinction for some non-taxes reason, or are botanists just really pedantic sometimes?
Well done on the new graphic! It's much easier to read now: I like the choice to use the darkest color and heaviest border for the "Definitely a tree" category, since that makes them pop out. When I look at it in greyscale (camera filter on my phone), the "Kind of a tree" green and "Definitely not a tree" orange are pretty close in value, but the borders make them easy to differentiate. Given that the goal was ostensibly to highlight the distribution of true trees, I think that's entirely appropriate. And when I turn on my laptop's blue blocker, I still have no problem seeing the difference between the categories.
When I showed the new graphic to my family, Partner suddenly started examining it and making connections. ("🧐 Look how closely related tea is to pitcher plants!") And the 5yo was even trying to make sense of it! Neither of them seemed interested yesterday, so I'm declaring success!
When I came across these facts, upon a little wider reading I had a similar additional mind-blowing moment around the whole set of circumstances of the 'alternation of generations' (https://en.m.wikipedia.org/wiki/Alternation_of_generations) exhibited by plants, fungi and a few other groups. For me, this exploded my conception of what reproduction strategies can look like (and my conception was probably already not even that narrow by most standards). Wait til you read about seed development and ploidy! https://en.m.wikipedia.org/wiki/Seed
I really liked the content, but I found some of the style (`Sit down!' etc) really off-putting, which is why I only actually read the post on my 3rd attempt. Obviously you're welcome to write in whatever style you want, and probably lots of other people really like it, I just thought it might be useful to mention that a non-empty set of people find it off-putting.
Super valid, I appreciate the feedback! For my own future reference, if you have an answer - was it more the general kind of casual/eclectic style, the "antagonistic" bits like what you quoted, or something else?
Definitely the antagonistic bits - I enjoyed the casual style! Really just the line ‘Sit down. Sit down. Shut up. Listen. You don’t know nothing yet’ I found quite off-putting - even though in hindsight you were correct!
I really enjoyed this post! Look wistfully at pictures of Welwitschia, indeed! I got to see some in person a few years ago when we went to the Kirstenbosch Botanical Gardens in Cape Town, and my wife was very forbearing with my gaping at the unassuming piles of green straps.
If you're interested in learning more about what the plant developmental toolbox looks like and how it's been deployed throughout plant evolution, I'd recommend David Beerling's Making Eden. It's a pop-science book but pitched at the upper end of that range. Merlin Sheldrake's Entangled Life is also fun if you want an intro to the crazy stuff that happens in the fungal kingdom.
Cladistics is useful not only for biology but also for analyzing things like cultures. If your whole experience comes from a particular clade then you are probably missing most of the picture. Systematically exploring an evolutionary tree is a simple way to gain broader perspective.
This blew my mind! A couple nitpicks on the chart:
- you said strawberry has a tree ancestor but none of its ancestors on the chart are trees
- pineapple is colored as "definitely a tree," but it's a bush
A really enjoyable and informative read. I noticed on your graphic that the starting point for the entire graphic is the very first land plant, and it's a moss. The very first evolutionary step from that point has to do with vascularity. This makes me wonder what was before the first land plant? Some sort of water plant obviously but I'm curious what the major evolutionary step from the ocean to land was - what it means to be 'moss'.
My second question is more open ended I guess, but thinking about all the various strategies life on Earth has developed to create such diversity in the plant world, and seeing as how many of them are quite similar across species, if you consider the vascularity of the post-moss plants as similar to the vascular system of animals, and the tightly packed wooden structure of trees as similar to bones of the skeleton, and the reproductive systems as similar as well, what do you see as constituting the "brain" or nervous system of a plant, if anything? I know it's a weird question.
And this is not exactly a deep observation, but it seems like the discussion of classification of plants - and also the post you linked to - relates to rationality and maybe to AI, in that these are all examples of things which are hard to define in a true/false way, and this is at the heart of rationality from what I'm gathering. I think I want to do a bit more research about apples as well after reading this.
That common-ancestors-being-totally-different thing is wonderful. But really, wood is just wood, it's not even "the" "interesting" feature compared to, say, the placement of the next-season(s) buds (which is an instrumental thing in itself, but at least a higher-order instrumental thing.)
I'd love to see an ELI5 explanation of how the MADS-box genes coordinate making wood in an Arabidopsis plant; that has to be some serious challenge for them.
So: There are no unique “tree” genes. It’s just a different expression of genes that plants already use. Analogously, you can make a cake with flour, sugar, eggs, sugar, butter, and vanilla. You can also make frosting with sugar, butter, and vanilla – a subset of the ingredients you already have, but in different ratios and use
Actually, you can even make a frosting with flour, sugar, eggs, sugar (uh… you listed ‘sugar’ twice), butter, and vanilla—i.e., the exact same ingredients!
(The flour and the butter make a roux, the eggs and the sugar make a zabaglione base, then you mix both things together, add vanilla, and bam! Meanwhile, a cake made from the given ingredients would be a chiffon cake.)
EDIT: But I would not recommend using this type of frosting with this type of cake.
I got a feel of this growing sunflowers last year. They got quite big, significantly taller than me. They died in autumn as herbaceous plants do, but as I cut them down I noticed the part where the stem meets the root was seriously woody. Like, I could have cut out a small piece and convinced someone it was a piece of wood.