Suspiciously balanced evidence 2020-02-12T17:04:20.516Z
"Future of Go" summit with AlphaGo 2017-04-10T11:10:40.249Z
Buying happiness 2016-06-16T17:08:53.802Z
AlphaGo versus Lee Sedol 2016-03-09T12:22:53.237Z
[LINK] "The current state of machine intelligence" 2015-12-16T15:22:26.596Z
Scott Aaronson: Common knowledge and Aumann's agreement theorem 2015-08-17T08:41:45.179Z
Group Rationality Diary, March 22 to April 4 2015-03-23T12:17:27.193Z
Group Rationality Diary, March 1-21 2015-03-06T15:29:01.325Z
Open thread, September 15-21, 2014 2014-09-15T12:24:53.165Z
Proportional Giving 2014-03-02T21:09:07.597Z
A few remarks about mass-downvoting 2014-02-13T17:06:43.216Z
[Link] False memories of fabricated political events 2013-02-10T22:25:15.535Z
[LINK] Breaking the illusion of understanding 2012-10-26T23:09:25.790Z
The Problem of Thinking Too Much [LINK] 2012-04-27T14:31:26.552Z
General textbook comparison thread 2011-08-26T13:27:35.095Z
Harry Potter and the Methods of Rationality discussion thread, part 4 2010-10-07T21:12:58.038Z
The uniquely awful example of theism 2009-04-10T00:30:08.149Z
Voting etiquette 2009-04-05T14:28:31.031Z
Open Thread: April 2009 2009-04-03T13:57:49.099Z


Comment by gjm on How my school gamed the stats · 2021-02-24T20:47:58.890Z · LW · GW

To whatever extent that's my fault, I'm sorry too. :-)

Comment by gjm on How my school gamed the stats · 2021-02-24T10:28:59.224Z · LW · GW

I don't actively "want to continue", in that it seems to me that the whole content of this discussion is you saying or implying that I've badly misrepresented how affluent my area is, and me pointing out in various ways that that isn't so.

However, your last paragraph seems once again like an accusation of inconsistency, so let me clarify.

"Upper middle class" means different things in different places. In the US, "class" is largely (but not wholly) about wealth. In the UK, "class" is largely (but not wholly) about social background. These are less different than that makes them sound because the relevant differences in social background are mostly driven by the wealth of one's forebears, and in both societies there is a strong correlation between that and one's own wealth.

The US has a more wealth-based notion of class and is also richer. So being "upper middle class" in the US means a level of wealth that would make you quite rich in the UK.

The UK has a more social-background-based notion of class, which in particular is strongly influenced by the existence of a (statistically very small) aristocratic class. So "upper class" in the UK means a smaller, more-elite fraction of the population than in the US, and "upper middle class" is pulled in the same direction. So being "upper middle class" in the UK typically (but not always, because of the wealth/background distinction) means being at a distinctly higher percentile of wealth than it does in the US.

The combined effect of these things is to put the typical "upper middle class" person or family at something like the same level of wealth in the two countries, although there's plenty of fuzziness and variability.

(Perhaps I have by now made it clear that I do have some idea how social class works in the UK, enough so that you might believe me if I tell you that (1) I am definitely lower-upper-middle-class and (2) my household income is somewhere around the 95th percentile.)

So, issue 1 is that you're wanting to call my neighbourhood "upper middle class" even though the people here don't fit either the UK or the US notion of "upper middle class", because you think it matches the definition you'd get if you applied the US-based notion directly to the UK despite the substantial differences between the two societies.

This would be (annoying but) excusable given that your main point was to suggest that my daughter's school may be more atypical than I was claiming. But there's more.

That percentile-ranking tool is not ranking individuals, it is ranking areas of the country. Areas vary less than individuals do, and (e.g.) an 80th-percentile area is not composed mostly of 80th-percentile individuals. On the other hand, terms like "upper middle class" are descriptions of individuals and families, and only secondarily of areas. If calling an area "upper middle class" (or "upper class", or whatever) means anything, it should mean an area whose people are mostly of the class in question.

Almost no areas are "upper class" or even "upper middle class".

(Perhaps an analogy may help. Suppose you have an area where the Jewish population is at the 95th percentile of areas in the country. Would you call it "very Jewish"? You probably shouldn't, because I bet that 95th-percentile population is still <5%. I submit that "upper middle class" is like "Jewish" in this respect.)

In a typical 80th-percentile area, most people are middle-middle-class. That's unusual; in a more typical area a large fraction will be lower on the socioeconomic scale. How might you describe such an area? Well, maybe as a "nice middle-class area", for instance. Which happens to be exactly the term I used.

"But £36k is distinctly higher than a typical middle-middle-class salary in the UK." It's not all that much higher; median UK household income is £30k. Second, the £36k figure is a mean, not a median. Mean incomes are always higher than median incomes. UK mean household income turns out to be about £37k, if I've done my calculations correctly. Both the area mean of ~£36k and the national mean of ~£37k are described as "equivalised" and I don't know whether that means the same thing in both cases, but the point is that this area is in fact about as well-off as the UK as a whole. No contradiction with the 80th-percentile thing; most areas are (when you calculate in terms of means) a bit poorer than average and a few are substantially richer.

So, issue 2 is that you're taking terms that describe individuals and applying them to areas in a profoundly misleading way. You can call an 80th-percentile individual "upper middle class" if you like, though actually most classifications wouldn't call them that either in the UK or in the US; but an 80th-percentile area is still not an "upper middle class" area. That's not how the words work.

Comment by gjm on How my school gamed the stats · 2021-02-23T22:33:48.855Z · LW · GW

I said the school was "in a nice middle-class area". You replied (way way back upthread):

Note that only around 3% of UK residents have PhDs - so I strongly suspect that what you're calling "middle-class" is closer to the top 5% of the population, or what sociologists would say is the very upper part of the upper middle class.

Which only makes sense to me if it's saying that what I described as middle-class is actually "the very upper part of the upper middle class", and what I described as middle-class is not "the best-off parents of pupils at the school" but the area the school was in. (For greater precision I should really have been talking about the people at the school rather than the area as such, but I take it that was always understood, and in fact I think the school's pupils are pretty representative of its catchment area.)

And of course I agree that the school isn't average. That's why I said, way back in my original comment,

It's in a nice middle-class area with, e.g., quite a substantial fraction of parents having PhDs, and she's one of the most able students, all of which of course makes it more likely that the school has an easier time of it and that what she sees doesn't include the worst the school has to offer. But then you said that your school was in a nice middle-class area and you were mostly in top classes, so the two seem broadly comparable.

Emphasis added here to make it absolutely clear that right at the outset I explicitly noted the things you're now suggesting I was somehow trying to hide or deny. Note the last sentence: I wasn't responding to a report about the failings of an average school by saying "but my nice middle-class school is different", I was responding to a report about the failings of a nice middle-class school by saying "but my nice middle-class school seems to be different from your nice middle-class school".

I also want to push back a bit again about "the middle of the upper middle class" and "very high-socioeconomic-status parents". I think it is flatly untrue that the area I'm in is "in the middle of the upper middle class" by either UK or US definitions, and I think it's at best debatable whether "a sizeable portion of the children at the school largely have very high-socioeconomic-status parents". Let me quote you some bits of Wikipedia, as indicative of typical usage of the term "upper middle class".

The upper middle-class are traditionally educated at independent schools, preferably one of the "major" or "minor" "public schools" which themselves often have pedigrees going back for hundreds of years and charge fees of as much as £33,000 per year per pupil (as of 2014).

Very few people in the village where I live send their children to public schools, or even to other independent schools. (And obviously the parents of children at my daughter's school don't.)

Although such categorisations are not precise, popular contemporary examples of upper-middle-class people may include Boris Johnson, Catherine, Duchess of Cambridge, David Cameron, and Matthew Pinsent (athlete).

I don't know anything much about Matthew Pinsent, but again: if there's anyone here in the same social class as Boris Johnson, David Cameron, or the Duchess of Cambridge, it's news to me. Maybe there are some lurking somewhere (though I doubt it), but the idea that those could be typical of the population around here is absurd.

What about US notions of the upper middle class? They don't fit much better. Social class in the US is more about money than it is in the UK. Here's Wikipedia again:

Sociologists Dennis Gilbert, William Thompson and Joseph Hickey estimate the upper middle class to constitute roughly 15% of the population. [...] In 2020, the threshold for entering the top 15% of American household incomes is $166,000

I'm sure there are people around here whose annual household income is over $166k, but they're a small minority. (I think my family is one of the better-off ones around here; our household income is comfortably below $166k.)

So: no, our neighbourhood is pretty comfortably off, but it's not anywhere near "the middle of the upper middle class" whether you mean the British or the US upper middle class. (It would be near the bottom of the upper middle class, if you translated the "top 15%" criterion from the US to the UK, but note that that gives you a group of people markedly less elevated than either the people called "upper middle class" in the UK or the people called "upper middle class" in the US.)

As for "very high-socioeconomic-status parents": I guess it depends on what you mean by "very high". This is an area rich in engineers (broadly defined), who tend to be higher in income than in social status (i.e., if you pick a bunch of people earning say a very comfortable £100k/year, there'll be engineers and doctors and lawyers and quants and senior managers and the like, and the engineers will on the whole be the lowest-status of them). And, while I don't know anyone's income other than my own, my impression is that e.g. the fraction earning over ~£80k/year, which according to the UK government's tables is about the 95th percentile, is not much bigger than 5%. In fact, my best guess is that it's somewhat below 5%. But let's suppose that fully 10% of the parents at my daughter's school are top-5% -- which, again, I bet is not true. Is that "very high"? I personally wouldn't say so, and in any event I don't think "maaaaybe as many as 10% of the parents are top-5%" is a stronger statement than the one I already made at the very outset about this being a nice middle-class school where quite a few parents have PhDs.

(I'm aware that the tone of this comment is a bit defensive. I'm sorry if that makes it disagreeable to read. For what it's worth, the way it feels to me right now is that I have gone out of my way to be honest about ways in which the experience I'm reporting may not be representative, and have been met with Isolated Demands For Rigor and implications that I have a wildly inaccurate idea of my own level of privilege and that of others around me, and that even after the person making these accusations has made a concrete prediction that turned out to be wrong while mine was spot-on there's still this constant suggestion that surely I'm trying to mislead everyone about how nice-and-middle-class my area is, and it's a little bit tiring.)

Comment by gjm on Kelly isn't (just) about logarithmic utility · 2021-02-23T16:34:02.553Z · LW · GW

Of those, I think I would vote for one of the fractional-Kelly ones -- providing arguments for something (fractional Kelly) rather than merely against naive Kelly. (Suppose I think one should "Kelly bet on everything", and you write an article saying that Kelly is too volatile, and I read it and am convinced. Now I know that what I've been doing is wrong, but I don't know what I should be doing instead. On the other hand, if instead I read an article saying that fractional Kelly is better in specific circumstances for specific reasons, then it probably also gives some indication of how to choose the fraction one uses, and now I have not merely an awareness that my existing strategy is bad but a concrete better strategy to use in future.)

Comment by gjm on Kelly isn't (just) about logarithmic utility · 2021-02-23T16:27:09.227Z · LW · GW

There's something odd about the tone of this piece. It begins by saying three times (three times!) "Kelly isn't about logarithmic utility. Kelly is about repeated bets." Then, later, it says "Yes, Kelly is maximising log utility. No, it doesn't matter which way you think about this".

I think you have to choose between "You can think about Kelly in terms of maximizing utility = log(wealth), and that's OK" and "Thinking about it that way is sufficiently not-OK that I wrote an article that says not to do it in the title and then reiterates three times in a row that you shouldn't do it".

I do agree that there are excellent arguments for Kelly-betting that assume very little about your utility function. (E.g., if you are going to make a long series of identical bets at given odds then almost certainly the Kelly bet leaves you better off at the end than any different bet.) And I do agree that there's something a bit odd about focusing on details of the formula rather than the underlying principles. (Though I don't think "odd" is the same as "wrong"; if you ever expect to be making Kelly bets, or non-Kelly bets informed by the Kelly criterion, it's not a bad idea to have a memorable form of the formula in your head, as well as the higher-level concept that maximizing expected log wealth is a good idea.) But none of that seems to explain how cross you seem to be about it at the start of the article.
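The repeated-bets argument is easy to try out numerically. The following is only a minimal sketch in Python under assumed numbers that are not from the article: a 60% chance of winning at even odds, 10,000 bets, and two arbitrary rival fractions (0.05 and 0.5). It tracks log wealth rather than wealth so that nothing overflows or underflows.

```python
import math
import random


def final_log_wealth(fraction, outcomes, odds=1.0):
    """Log of final wealth (starting from 1) when betting a fixed fraction each round.

    outcomes is a sequence of booleans (True = win); a win pays `odds` times the stake.
    """
    log_wealth = 0.0
    for won in outcomes:
        if won:
            log_wealth += math.log(1.0 + odds * fraction)
        else:
            log_wealth += math.log(1.0 - fraction)
    return log_wealth


# Illustrative numbers (assumptions, not taken from the discussion).
P_WIN = 0.6
N_BETS = 10_000
KELLY = P_WIN - (1.0 - P_WIN) / 1.0  # Kelly fraction at even odds, about 0.2

rng = random.Random(0)
outcomes = [rng.random() < P_WIN for _ in range(N_BETS)]

# Every strategy faces the same sequence of wins and losses.
results = {f: final_log_wealth(f, outcomes) for f in (0.05, KELLY, 0.5)}
for f, lw in results.items():
    print(f"fraction {f:.2f}: log(final wealth) = {lw:.1f}")
```

With these numbers the over-bettor (0.5) needs to win more than about 68% of the time to beat Kelly, and the under-bettor (0.05) needs the win rate to fall below about 56%, both of which are many standard deviations away from 60% over 10,000 bets. That is the sense of "almost certainly" here, and it makes no reference to anyone's utility function.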

Comment by gjm on How my school gamed the stats · 2021-02-23T14:14:09.947Z · LW · GW

Some other relevant numbers: mean household income in my village (and the area around it that's part of the same area, as used by that tool) is about £36k before, and about £33k after, the cost-of-living adjustment. Those are means; presumably the median is lower.

Again, that makes it a better-off-than-average area, but note that £36k is not by any reasonable standard a middle-upper-middle-class household income. So yes, this is definitely a nice area, but no, it's not the case that everyone here is very well off or very high-status.

Comment by gjm on How my school gamed the stats · 2021-02-23T12:22:04.255Z · LW · GW

I checked. Annoyingly, the tool you linked to only tells you which 10%-sized block of percentiles the area is in. It says 70-80 before, and 80-90 after, adjusting for housing costs. (If you're trying to measure social status, upper-middle-class-ness, etc., then I claim you should actually use the figures before adjusting for housing costs.)

That's the village where I live and where the school is located, but it takes pupils from other places too; the neighbouring village that I think provides the largest number of other pupils is in the 70-80 percentile by both metrics.

The same page has a thing that gives finer-grained percentiles but only for the before-housing-costs figure (which, again, I think is actually the more relevant here). My village gets about 82%, the other one I mentioned gets about 78%.

I think all of this exactly matches my original description: very nice middle-class area but not "the very upper part of the upper middle class".

I'm not sure exactly what you mean by "still pretty well correlated"; 97% on one -> 90% on the other isn't so different from what my toy model says, and 97% on one -> ~80% on the other (which I think is better supported by the evidence) is pretty much exactly what my toy model says.

Comment by gjm on What are the most powerful lotuses? · 2021-02-23T11:54:57.460Z · LW · GW

First of all, this isn't only about literal drug addictions; other examples mentioned in the subthread are eating too much sugar (which surely has some addiction-like qualities but I wouldn't bet heavily that it works just the same way) and advertising (where the people who might be deterred by taxation are the advertisers, whose relationship to advertising is surely not one of addiction). Even for drug addiction, the relevant taxes might reduce profits for the sellers who are generally not addicted, whether or not they affect the behaviour of the buyers.

Second, I dispute that we "know" that increasing incentives not to take some addictive drug consistently does nothing to reduce addictive-drug-taking. Maybe I'm just not aware of some relevant evidence; would you like to give me a pointer to, say, the two things that you think are the strongest evidence for that proposition? (As opposed to weaker ones like "sometimes if you deter people from using one drug they switch to another", which I'm sure is true.)

For the avoidance of doubt, I entirely agree that governments should be trying much harder to reduce the harm done by drug addiction and trying much less hard to punish people for getting addicted to drugs. I'm just not at all convinced by one specific argument you made, namely that taxing "lotuses" can't reduce how much they're used because it decreases the government's incentive to do it.

Comment by gjm on How my school gamed the stats · 2021-02-23T01:03:40.775Z · LW · GW

It's certainly possible that I'm underestimating the level of privilege here. But I guarantee that the area is not at all "the very upper part of the upper middle class". In particular, I know a lot of the parents-with-PhDs, and I'm pretty sure none of us is upper-upper-middle-class by any reasonable definition. To whatever extent the school sounds startlingly privileged rather than merely distinctly more than averagely privileged, I think it's more likely that my estimates of the parents are skewed than that the school is really super-duper-elite. (For instance: I'm an academic sort of person, the people I know will tend to be academic sorts of people, and so it would be very unsurprising if I overestimated how many parents have PhDs. Also, I am consistently trying to overestimate rather than underestimate, because I want to be honest about the fact that this school is serving a pretty "good" population.)

I don't think I agree with your second paragraph. I have a super-handwavy model in my head according to which populations like "Cambridge graduates", while of course both well-educated and high-status, are less high-status than you would expect from e.g. seeing where they sit in the distribution of education and assuming they're in the same place in the distribution of social status. Let me try to make it a bit more concrete and see whether I still disagree with you.

Effect #1: education and status are correlated but not at all the same thing. Toy model: status = education + otherstuff, both are uniform(0,1). If status <= 1 then status quantile = status^2/2; if status >= 1 then status quantile = 1 - (2-status)^2/2. If your education quantile (= your education) is 0.9 then your status is uniform between 0.9 and 1.9, so your average status quantile is (if I've got the calculations right) about 0.78. If your education quantile is 0.97 then your average status quantile is about 0.82. High, but not that high.
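For anyone who wants to check the arithmetic in that toy model, here is a small Python sketch. The function names are mine and purely illustrative; the model itself (status = education + otherstuff, both uniform on (0,1)) is the one described above. It averages the status quantile over "otherstuff" with a simple midpoint rule.

```python
def status_quantile(status):
    """CDF of the sum of two independent uniform(0,1) variables."""
    if status <= 1.0:
        return status ** 2 / 2.0
    return 1.0 - (2.0 - status) ** 2 / 2.0


def mean_status_quantile(education, steps=10_000):
    """Average status quantile for a given education, integrating over 'otherstuff'."""
    total = 0.0
    for i in range(steps):
        other = (i + 0.5) / steps  # midpoint of the i-th slice of uniform(0,1)
        total += status_quantile(education + other)
    return total / steps


for e in (0.9, 0.97):
    print(e, round(mean_status_quantile(e), 2))
```

This should reproduce the figures of about 0.78 and 0.82 quoted above.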

Effect #2: when you condition on somewhat extreme values of one variable, its correlations with other correlated variables tend to go away. Toy model: same as above. If I've done my calculations right, the (Pearson) correlation coefficient between education and status in the whole population is 1/sqrt(2); in the population with education >= 0.9 it's 1/sqrt(101).
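The two correlation figures can be checked by simulation. This is a rough Monte Carlo sketch in Python, not a derivation; the sample size, the seed and the function names are arbitrary choices of mine. Restricting to education >= 0.9 is done by sampling education uniformly from [0.9, 1], which is what conditioning does in this toy model.

```python
import math
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


def simulated_correlation(min_education, n=200_000, seed=0):
    """Correlation of education and status among people with education >= min_education."""
    rng = random.Random(seed)
    education = [rng.uniform(min_education, 1.0) for _ in range(n)]
    status = [e + rng.random() for e in education]  # status = education + otherstuff
    return pearson(education, status)


whole = simulated_correlation(0.0)
top = simulated_correlation(0.9)
print(f"whole population: {whole:.3f} (exact 1/sqrt(2) = {1 / math.sqrt(2):.3f})")
print(f"education >= 0.9: {top:.3f} (exact 1/sqrt(101) = {1 / math.sqrt(101):.3f})")
```

The exact values follow from the variances: in the restricted population education has variance 0.01/12 and otherstuff still has variance 1/12, so the correlation is sqrt(0.01/1.01) = 1/sqrt(101).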

Effect #3 (I'm much less sure about this one): education and status are correlated because of a bunch of causal links (e->s, s->e, other->{e,s}), and in a population like the Cambridge-area one you get more people who are highly educated for less-status-linked reasons. I won't bother with a toy model because it's obvious how this works, and the questionable bit is whether I'm right about the local population. I'm not at all sure I am.

Effect #4: while education and status correlate, I'm not so sure even of the sign of the correlation at the extremes. Very able people with ambitions that aren't strictly academic don't (I think) usually do PhDs. I wouldn't be surprised to find that wealth and status are higher on average for people whose academic performance would have got them a PhD place if they'd wanted one but who preferred to do something else, than for people who actually did a PhD. I wouldn't be surprised to find it the other way around, either.

Comment by gjm on The Comprehension Curve · 2021-02-23T00:17:08.137Z · LW · GW

When I was a child learning to read, most of the things I was reading were pretty easy to understand. I was clever and a good learner, and my natural way of reading is very quick, about 1000wpm for reasonably straightforward material. But I am no longer a child, and many of the things I want to read are not nearly so easy to understand (mostly because I want to read textbooks and technical papers and dense novels and so forth; probably also because at 50 I am less clever than I was at, say, 20). The techniques I acquired naturally when learning to read are a bad match for much of my actual reading, and if I want to understand things I need to go out of my way to slow things down.

I suspect the story in the paragraph above could be told with equal truth by many people around here.

As well as providing some confirmation for what AAB says about comprehension rate versus reading rate, I think this suggests that the idea of thinking of speed-reading techniques as "eye technique" that are likely to be helpful to everyone might be too optimistic. I would guess that the way I read quickly is similar to the way in which "speed readers" read quickly, mechanically at least, but unfortunately it gives me not only the ability to read quickly but also the habit of reading quickly. If I sit down to read something, the pattern in which my eyes naturally move is one that works well for stuff I can assimilate easily, and doesn't work so well for stuff I have to think harder about as I read. I suspect that in order for "eye technique" to be useful it has to become habitual, and I suspect that "eye technique" that's useful for reading quickly is systematically anti-helpful for reading slowly and deliberately. (But, for the avoidance of doubt, all of this is conjecture. I haven't studied "speed reading" techniques, I haven't compared them with how I read, I don't know whether if I went about it the right way I could make my fast-reading habits more optional, I don't know whether other people using similar techniques would form the same habits as mine, etc., etc., etc.)

Comment by gjm on What are the most powerful lotuses? · 2021-02-21T23:53:12.277Z · LW · GW

If you (= the government) tax something, then you (i.e., the government) have little incentive to decrease it, but the people doing it have a clear financial incentive to decrease it.

So a government that puts a big tax on whatever-it-is gives up its own ability (or at least motivation) to decrease the thing directly, while giving the people who actually do it an incentive to do so. It's reasonable, at least in some cases, to hope that those people are better placed to decrease the thing than the government is.

Comment by gjm on How my school gamed the stats · 2021-02-21T17:40:33.865Z · LW · GW

Well, the "substantial fraction" I have in mind isn't all that large. Certainly not more than 20% of pupils having >= one parent with a PhD ~= 10% of parents having PhDs, which would be ~3x the general population, suggesting that the population here might be comparable to the top 1/3 of the population as a whole. Kinda. Which is, as I say, certainly a nice middle-class area, but not much like taking the top 5% of the population.

Class terminology is used differently by different people; the Wikipedia page for "upper middle class", which is probably reasonably representative, says that "[t]he upper middle class in Britain traditionally consists of the educated professionals who were born into higher-income backgrounds, such as legal professionals, executives, and surgeons"; interpreting "higher-income" fairly broadly, that might be 10% of the school's pupils. So no, not by any means "the very upper part of the upper middle class".

(Also, we're near Cambridge, which means a lot of people with good academic qualifications; so in this population, good academic qualifications will be less evidence of e.g. wealth and social status than in the population at large.)

Comment by gjm on The Kelly Criterion in 3D · 2021-02-21T17:28:35.263Z · LW · GW

Well, near zero utility ~ log(wealth) would mean infinite negative utility for zero wealth. That seems obviously false -- I would hate to lose all my possessions but I wouldn't consider it infinitely bad and I can think of other things that I would hate more. (So near zero I think reality is less nonlinear than the log(bankroll) assumption treats it as being.) In reality, of course, your real wealth basically never goes all the way to zero because pretty much everyone has nonzero earning power or benevolence-of-friends or national-safety-net, and in any case when you're contemplating Kelly-style bets I think it's common to use something smaller than the total value of all your possessions as the bankroll in the calculation.

Comment by gjm on The Prototypical Negotiation Game · 2021-02-21T00:52:04.342Z · LW · GW

I think "empire" and "empirical" have less to do with one another than one would guess, but ultimately there is (probably) a connection.

"empire" and "emperor" of course comes from Latin "imperium" and "imperator", from "imperare" to command. (Harry Potter fans may wish to know that the first person singular indicative active form is "impero", not "imperio" :-).)

"empirical" comes from Greek "empeiria" meaning "experience".

So I guess the question is whether "imperare" is related to "empeiria" somehow. Well, yes and no. It appears that "imperare" is from "in" (preposition) + "parare" (to prepare), whereas "empeiria" is from "en" (preposition) + "peira" (a trial or attempt). Greek "en" is the equivalent of Latin "in", so the initial bits are indeed the same.

But what about "parare" and "peira"? Those are words with quite different meanings. But they are both thought to come from PIE *per-, which has a number of meanings (note: all this stuff is conjecture, but not my conjecture). Oldest seems to be something like "first" or "front" -- we get "first" from this, and "pre-", and all sorts of other things. That seems to have given rise to "go through", "carry forth", etc., which is where "paro" comes from. And that seems to have given rise to "try", "dare", "risk", etc., which is where "peira" comes from.

So the (partly conjectural) tree looks like this. Conjectured bits are in square brackets.

  • [PIE *per-, meaning things like "first" and "front"]
    • [PIE *per-, meaning things like "bring out"]
      • Latin paro, meaning "prepare"
        • Latin impero, meaning "command"
          • Latin imperium and imperator
            • English empire and emperor
      • [PIE *per-, meaning things like "try"]
        • Greek peira, meaning "trial" and "attempt"
          • Greek empeiria, meaning "experience"
            • Greek empeirikos, meaning "empirical"
              • English empirical

If you believe the PIE reconstructions then there is a common origin. But as far as actually known words go, the oldest we've got is "paro", prepare, and "peira", attempt, and it's not obvious prima facie that those are actually related.

Comment by gjm on How my school gamed the stats · 2021-02-20T22:48:53.235Z · LW · GW

Just to provide a contrary datapoint: My daughter is in (UK state) secondary school right now, and nothing she has told us about her experiences there suggests that any of the awful behaviour you describe (of pupils, teachers, or school management) has been normal or anything like normal at her school at any time while she's been there.

It's in a nice middle-class area with, e.g., quite a substantial fraction of parents having PhDs, and she's one of the most able students, all of which of course makes it more likely that the school has an easier time of it and that what she sees doesn't include the worst the school has to offer. But then you said that your school was in a nice middle-class area and you were mostly in top classes, so the two seem broadly comparable.

My own experience of (UK state secondary) school was more like hers than yours, but as (1) that was back in the 1980s and (2) I was in a part of the country that had grammar schools and my school was an exceptionally strong one academically, that's of limited relevance to what most schools are like now.

My wife's experience of (UK state secondary) school was also more like hers than yours; again, that was back in the 1980s, and again she was at an unusually strong school (not a grammar school, though as it happens its name had "Grammar School" in it for hysterical raisins).

[UK state secondary school = US public high school]

Comment by gjm on The Kelly Criterion in 3D · 2021-02-20T14:52:15.866Z · LW · GW

Kelly does kinda take nonlinearity of money into account, in the following sense.

Suppose your utility increases logarithmically with bankroll. (I think it's widely thought that actually it grows a bit slower than that, but logarithmically will do.) Suppose you make a bet that with probability p wins you a fraction x of your bankroll and with probability 1-p loses you a fraction bx. You get to choose x but not p or b. Then your expected utility on making the bet is $p \log(1+x) + (1-p) \log(1-bx)$ whose derivative w.r.t. x is $\frac{p}{1+x} - \frac{b(1-p)}{1-bx}$. You get max expected utility when $\frac{p}{1+x} = \frac{b(1-p)}{1-bx}$ or equivalently when $bx = p - b(1-p)$ which is exactly the Kelly bet. So betting Kelly maximizes your (short-term) expected utility, if your utility grows logarithmically with bankroll.
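As a sanity check on that calculation, here is a short Python sketch that compares the closed-form maximiser with a brute-force search. It assumes the setup just described (win a fraction x with probability p, lose a fraction bx otherwise) and measures utility as log of bankroll relative to the starting bankroll. The function names and the test values p = 0.6, b = 1 are illustrative assumptions.

```python
import math


def expected_log_utility(x, p, b):
    """Expected log-bankroll change for a bet that wins x with prob. p and loses b*x otherwise."""
    return p * math.log(1.0 + x) + (1.0 - p) * math.log(1.0 - b * x)


def kelly_x(p, b):
    """Closed-form maximiser, from setting p/(1+x) - b(1-p)/(1-bx) to zero."""
    return (p - b * (1.0 - p)) / b


def grid_argmax(p, b, steps=20_000):
    """Brute-force maximiser over x in [0, 1/b), as an independent check."""
    best_x, best_u = 0.0, expected_log_utility(0.0, p, b)
    for i in range(1, steps):
        x = (i / steps) / b  # stay strictly below 1/b so log(1 - b*x) is defined
        u = expected_log_utility(x, p, b)
        if u > best_u:
            best_x, best_u = x, u
    return best_x


p, b = 0.6, 1.0  # illustrative values: 60% chance at even odds
print(kelly_x(p, b), grid_argmax(p, b))
```

Note that x here is the fraction won; the fraction actually put at risk is b*x = p - b(1-p), which is the usual Kelly stake with net odds of 1/b.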

Comment by gjm on The Kelly Criterion in 3D · 2021-02-20T14:38:51.783Z · LW · GW

I agree, and I'd put it in slightly different terms again.

Your edge is $p - q$, how much bigger your own probability estimate $p$ is than the one $q$ implied by the odds you're getting. The maximum possible edge is $1 - q$, what your edge would be if you knew you would win. (This also equals your counterparty's probability that you lose.) Kelly says: the fraction of your funds to bet equals the fraction of the maximum possible edge that you've got.

Equivalently: you bet nothing if you've got no edge, bet all your funds if you know you're going to win, and interpolate linearly in between. (That's exactly lsusr's observation, of course.)
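In code, the "fraction of the maximum possible edge" rule might look like the sketch below. The symbols are my own labels rather than anything fixed by the post: p is your probability of winning and q is the probability implied by the odds on offer. Flooring the result at zero when the edge is negative is also an assumption of mine; the rule above only covers the range from no edge to certainty.

```python
def kelly_fraction(p, q):
    """Fraction of funds to bet: your edge (p - q) over the maximum possible edge (1 - q)."""
    if not (0.0 <= q < 1.0 and 0.0 <= p <= 1.0):
        raise ValueError("need 0 <= q < 1 and 0 <= p <= 1")
    return max(0.0, (p - q) / (1.0 - q))  # assumption: never bet with a negative edge


def kelly_fraction_from_odds(p, net_odds):
    """Textbook form for comparison: p - (1 - p) / net_odds, floored at zero."""
    return max(0.0, p - (1.0 - p) / net_odds)


# Net odds of 3 to 1 imply a probability of q = 1 / (1 + 3) = 0.25.
print(kelly_fraction(0.7, 0.25), kelly_fraction_from_odds(0.7, 3.0))
```

The two functions agree because odds of o to 1 imply q = 1/(1+o), and substituting that into (p - q)/(1 - q) gives p - (1 - p)/o.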

Comment by gjm on Open & Welcome Thread – February 2021 · 2021-02-18T23:30:34.310Z · LW · GW

A hamburger.

Comment by gjm on Rafael Harth's Shortform · 2021-02-15T21:37:31.546Z · LW · GW

If you allow series that are infinite in both directions, then you have a new problem which is that multiplication may no longer be possible: the sums involved need not converge. And there's also the issue already noted, that some things that don't look like they equal zero may in some sense have to be zero. (Meaning "absolute" zero = (...,0,0,0,...) rather than the thing you originally called zero which should maybe be called something like  instead.)

What's the best we could hope for? Something like this. Write R for , i.e., all formal potentially-double-ended Laurent series. There's an addition operation defined on the whole thing, and a multiplicative operation defined on some subset of pairs of its elements, namely those for which the relevant sums converge (or maybe are "summable" in some weaker sense). There are two problems: (1) some products aren't defined, and (2) at least with some ways of defining them, there are some zero-divisors -- e.g., (x-1) times the sum of all powers of x, as discussed above. (I remark that if your original purpose is to be able to divide by zero, perhaps you shouldn't be too troubled by the presence of zero-divisors; contrapositively, that if they trouble you, perhaps you shouldn't have wanted to divide by zero in the first place.)

We might hope to deal with issue 1 by restricting to some subset A of R, chosen so that all the sums that occur when multiplying elements of A are "well enough behaved"; if issue 2 persists after doing that, maybe we might hope to deal with that by taking a quotient of A -- i.e., treating some of its elements as being equal to one another.

Some versions of this strategy definitely succeed, and correspond to things just_browsing already mentioned above. For instance, let A consist of everything in R with only finitely many negative powers of x, the Laurent series already mentioned; this is a field. Or let it consist of everything that's the series expansion of a rational function of x; this is also a field. This latter is, I think, the nearest you can get to "finite or periodic". The periodic elements are the ones whose denominator has degree at most 1. Degree <= 2 brings in arithmetico-periodic elements -- things that go, say, 1,1,2,2,3,3,4,4, etc. I'm pretty sure that degree <=d in the denominator is the same as coefficients being ultimately (periodic + polynomial of degree < d). And this is what you get if you say you want to include both 1 and x, and to be closed under addition, subtraction, multiplication, and division.
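For anyone who wants to play with the "series expansion of a rational function" version, here is a small sketch that computes the coefficients by long division. The function name and the example denominators are my own choices for illustration; polynomials are coefficient lists, lowest degree first.

```python
from fractions import Fraction

def series_coefficients(numerator, denominator, n_terms):
    """First n_terms power-series coefficients of numerator(x) / denominator(x).

    Both polynomials are coefficient lists, lowest degree first.
    denominator[0] must be nonzero.
    """
    num = [Fraction(c) for c in numerator]
    den = [Fraction(c) for c in denominator]
    coeffs = []
    for n in range(n_terms):
        # Match the x^n coefficient on both sides of numerator = denominator * series.
        total = num[n] if n < len(num) else Fraction(0)
        for k in range(1, min(n, len(den) - 1) + 1):
            total -= den[k] * coeffs[n - k]
        coeffs.append(total / den[0])
    return coeffs

# 1/(1-x): constant, hence periodic.
print([int(c) for c in series_coefficients([1], [1, -1], 8)])
# 1/(1-x)^2: polynomial growth.
print([int(c) for c in series_coefficients([1], [1, -2, 1], 8)])
# 1/((1-x)(1-x^2)) = 1/(1 - x - x^2 + x^3): the 1,1,2,2,3,3,4,4 pattern.
print([int(c) for c in series_coefficients([1], [1, -1, -1, 1], 8)])
```

Exact rational arithmetic is used so that denominators with a leading coefficient other than 1 still give exact answers.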

Maybe that's already all you need. If not, perhaps the next question is: is there any version of this that gives you a field and that allows, at least, some series that are infinite in both directions? Well, by considering inverses of (1-x)^k we can get sequences that grow "rightward" as fast as any polynomial. So if we want the sums inside our products to converge, we're going to need our sequences to shrink faster-than-polynomially as we move "leftward". So here's an attempt. Let A consist of formal double-ended Laurent series  such that for  we have  for some , and for  we have  for some . Clearly the sum or difference of two of these has the same properties. What about products? Well, if we multiply together  to get  then . The terms with  are bounded in absolute value by some constant times  where  gets its value from  and  gets its value from ; so the sum of these terms is bounded by some constant times  which in turn is a constant times . Similarly for the terms with ; the terms with  both of the same sign are bounded by a constant times  when they're negative and by a constant times  when they're positive. So, unless I screwed up, products always "work" in the sense that the sums involved converge and produce a series that's in A. Do we have any zero-divisors? Eh, I don't think so, but it's not instantly obvious.

Here's a revised version that I think does make it obvious that we don't have zero-divisors. Instead of requiring that for  we have  for some , require that to hold for all . Once again our products always exist and still lie in A. But now it's also true that for small enough , the formal series themselves converge to well-behaved functions of t. In particular, there can't be zero-divisors.

I'm not sure any of this really helps much in your quest to divide by zero, though :-).

Comment by gjm on Your Cheerful Price · 2021-02-13T16:01:50.769Z · LW · GW

I think it's more complicated than that.

For some people, some things will have no Cheerful Price in the sense defined in this article, because a CP is meant to be a price at which no part of their mind is on balance unhappy about doing the thing, and if there's some bit of you that really really doesn't want to have sex with me[1] and that bit of you isn't interested in money, then no amount of money will remove the ouchiness of the transaction.

That might produce the same "how dare you? some things are too sacred to be bought and sold" reaction as if I'd asked you your CP for baking me a cake, but I don't think it's the same underlying cause. You may be absolutely on board with the idea that people exchange things for other things, even with friends, there may even be other people you would trade sex for money with because you don't particularly want to have sex with them but wouldn't hate it if you did, but you may still, specifically, find the prospect of having sex with me aversive[1] enough not to be OK with being asked your price.

But there are surely also people for whom sex is just a particularly-strong example of something Too Sacred To Trade Explicitly, and who might respond with "actually, I'd have been happy to have sex with you if you'd just asked me the right way, but now that you've made it transactional the idea repels me".

[1] It's OK. I don't actually want to have sex with you either. :-)

Comment by gjm on Quadratic, not logarithmic · 2021-02-11T03:00:22.593Z · LW · GW

But if infected you may infect several other people, and indirectly the number of people infected who wouldn't have been if you'd been more careful may be very large.

Comment by gjm on We got what's needed for COVID-19 vaccination completely wrong · 2021-02-10T17:21:34.881Z · LW · GW

What do you mean by "suppressing"?

I know of one instance of attempted suppression: Stöcker gave his alleged vaccine to a bunch of people and someone brought a legal action against him for running an unregistered clinical trial.

If that's what you mean: it seems to me that if someone breaks the law, then taking the legal steps that are generally taken against people who break the law isn't obviously properly described as "suppressing his efforts", even if he claims he did it for the Greater Good. (It sounds as if you reckon he didn't break the law because he's a doctor; maybe you're right -- I don't know anything about German law -- but it seems to me unlikely that anyone would bother making laws against unregistered clinical trials if you could work around them simply by having a doctor administer the drugs, since surely that happens in pretty much all clinical trials anyway, and I've got to assume that whoever brought the case against Stöcker has some idea what the law says.)

If instead you mean that the German agencies didn't actively work with Stöcker to bring his alleged vaccine to proper clinical trials, mass production, etc., then again it seems to me that this doesn't deserve the name of "suppression". I bet there are thousands of people claiming to have better ways to deal with COVID-19, and I bet they're all writing to institutions like the PEI demanding that things be done their way. What algorithm are you proposing they should have executed that would have told them to put time and effort into working with Stöcker but not with dozens of cranks? And how are you so sure that Stöcker isn't a crank himself? Again, so far as I can tell we have only his word for it that his alleged vaccine is safe and effective, even on the small sample of people he allegedly tested it on.

Basically it's saying that if we see the most informed people doing X then that's no evidence that doing X is a good thing to do.

If "basically" means "not", I guess. I don't see much connection between what I actually said and what you say I "basically" said.

If you mean that in addition to the RadVac folks' general argument that their alleged vaccine should be safe and effective, the fact that they have chosen to actually take it is a little bit more evidence, then I guess I agree: but (obviously, I think?) it's not much further evidence. (It tells me that they probably find their own arguments convincing, which isn't exactly a surprise.) And I don't see any obvious reason why I should consider the RadVac folks "the most informed people".

Comment by gjm on We got what's needed for COVID-19 vaccination completely wrong · 2021-02-10T11:26:13.519Z · LW · GW

The evidence that the adenovirus-based and mRNA-based vaccines are safe and effective is large-scale trials with tens of thousands of participants.

The evidence that Stöcker's peptide-based vaccine is safe and effective is Stöcker's own testimony that he and a few other people used it, didn't get sick, and have the relevant antibodies. [EDITED to add:] Sorry, "a few" isn't fair because he claims to have done a further experiment with 64 people. ~70 people is definitely better evidence than ~5 people, but I don't see any reason to trust Stöcker enough to be confident that what he says about those ~70 people is actually true.

The evidence that RadVac is safe and effective is a general argument that it should be safe and might be effective, and so far as I can tell nothing else at all.

It might be true that the peptide approach is better, but I'd want to see much better evidence before calling for a "Truth and Reconciliation Committee" to look into the alleged awful misdeeds of the people who promoted other approaches.

Comment by gjm on The 10,000-Hour Rule is a myth · 2021-02-01T14:12:34.899Z · LW · GW

On the other hand, spending say 2k hours on getting really-good-but-not-world-class on the piano may well be time well spent for any serious musician, because being a good pianist is really useful.

Comment by gjm on Wordtune review · 2021-01-26T11:52:36.086Z · LW · GW

Not very surprisingly, I felt that every single one of the Wordtune versions was strictly worse than your original that it took as input.

(That doesn't necessarily mean it's useless. It might improve worse-written text. Some of its rewrites might be usable as the basis for something better. Sometimes you might want badly-written text of a kind it's good at creating; e.g., it looks like asking it to be "formal" produces some of the same pathologies humans sometimes produce when trying to write formally, so if you needed examples of that kind of bad writing you could maybe get some with Wordtune. Not that there's any shortage of examples elsewhere.)

Comment by gjm on Lessons I've Learned from Self-Teaching · 2021-01-24T17:14:44.105Z · LW · GW

I haven't thought it through, but I suspect the theorems you want to prove about integrals are probably easier to prove with the definition TurnTrout gave than with that one. 

Comment by gjm on Sunzi's《Methods of War》- The Army's Form · 2021-01-23T14:48:21.973Z · LW · GW

Nitpick: Are you missing a "not" in "Noticing the sun and moon does indicate a keen eye"?

Comment by gjm on How do you measure conformity? · 2021-01-18T21:35:50.104Z · LW · GW

"They laughed at Galileo. They laughed at Einstein. But they also laughed at Bozo the Clown."

It's true that being shunned or attacked is a sign that you've likely done something nonconforming and transgressive. But some things are nonconforming and transgressive just because they're stupid or obnoxious.

I think what this means is that trying to measure how non-conformist you are, and hoping that "the more non-conformist the better", is a mistake. Being unlike other people isn't a good goal, in itself, because most ways of being unlike other people are not improvements. What you need to identify is ways of doing better, specifically, and measuring whether they laugh at you / shun you / stone you won't do that, because they'll do that for being worse as well as for being better.

Comment by gjm on D&D.Sci II Evaluation and Ruleset · 2021-01-18T03:36:07.894Z · LW · GW

I don't think the trap was horrible and unfair. Rule One of data science is: always look at your freakin' data rather than blindly feeding it into the sausage-making machine and hoping you'll be able to eat what comes out.

Comment by gjm on Discussion on the choice of concepts · 2021-01-14T19:45:44.063Z · LW · GW

"Since you're troubled by the other possibly unwanted associations of the word stupid, how about we just agree to say that toasters aren't highly intelligent? It doesn't really matter whether you say that that's because toasters aren't the sort of thing one can call intelligent, or that it's because you could call them intelligent if they were but they aren't; either way we can agree that toasters are not highly intelligent agents, and that's what matters."

"Oh, yeah, that works."

"Great. Let's move on."

(Of course, in many arguments about how one should define things there isn't a sufficiently convenient circumlocution, either because there isn't a good one at all or because it's super-important to have a handy short term or because the question is exactly about how one particular term should be used.)

Comment by gjm on D&D.Sci II: The Sorceror's Personal Shopper · 2021-01-13T13:03:21.812Z · LW · GW

Meta: there's one word in that comment that's kinda spoilery and you should maybe spoilerize it.

Comment by gjm on D&D.Sci II: The Sorceror's Personal Shopper · 2021-01-12T17:02:56.133Z · LW · GW

Proposed buy (no explanations but may still be spoilery; there is a lot I still don't understand so I suspect one can do better):

 WH o Ju, Pl o Pl, Ha o Ca, Pe o Ho. I expect a little under 130 mana, for a cost of 144gp.

Explanations (definitely spoilery):

 Yellow-glowing things get 18-21 mana; I haven't found patterns beyond that. Green-glowing things get 2-40 mana, always an even number; I haven't found patterns beyond that. Red-glowing things get 2^a 3^b 5^c mana; other than the fact that somehow we never get >96 even though we separately get 64, 27, 5, I haven't found patterns beyond that. Blue-glowing things get highly variable mana, also favouring small prime factors though 7 occurs; for these (and only these) the thaumometer gives plainly useful information, yielding the true mana gain +-1 except that items you wear yield a number too high by 22. So the two cheaper yellow items are pretty good value, as are the highest-thaumperature blue ones even though one of them is overrated. We should get at least 18+18 for the yellow ones and at least 34+54 for the blue ones.


I suspect there may be more going on than I yet understand with the red and green items, for which at present I don't think I know anything useful. And maybe the finer details of the yellow and blue ones are predictable too.
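For the red-glowing pattern, a check of the "2^a 3^b 5^c" form is easy to script. This is only a sketch of the test itself, with a made-up function name; it uses no values from the actual dataset beyond the bound of 96 mentioned above.

```python
def is_five_smooth(n):
    """True if n has the form 2^a * 3^b * 5^c with a, b, c >= 0."""
    if n < 1:
        return False
    for prime in (2, 3, 5):
        while n % prime == 0:
            n //= prime  # strip out every factor of this prime
    return n == 1        # nothing left means no other prime factors

# All values of that form up to the largest one observed (96).
candidates = [n for n in range(1, 97) if is_five_smooth(n)]
print(candidates)
```

Running a column of observed mana values through `is_five_smooth` is a quick way to confirm or break the pattern.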

Comment by gjm on D&D.Sci II: The Sorceror's Personal Shopper · 2021-01-12T16:06:46.276Z · LW · GW

Then you are missing out. I have only a partial understanding of the phenomena so far, but I already have a set of four items that I think should pretty reliably get at least 120 mana for a total price below 150gp.

Comment by gjm on D&D.Sci II: The Sorceror's Personal Shopper · 2021-01-12T15:27:06.157Z · LW · GW

Wakalix the Wizard. Slogan: "Wakalix Maketh It Goe!"

Comment by gjm on Condition-directedness · 2021-01-08T15:02:55.038Z · LW · GW

Nitpick about that first paragraph: that sort of backward chaining is pretty common in chess, actually. Near the end of a game you're very often envisaging a particular state of affairs and planning how to get there. Not necessarily the exact final state, but something like "I need his king in this corner or that corner, and I need it to arrive there when my knight is here or here so that my bishop can do that and deliver checkmate". Even in the middle of the game you may have intermediate goals like "move my pawn safely from e7 to e5" or "drive the knight away from e4 without weakening my king position".

It feels as if actually there's a continuum from "do things that make the situation better in some generalized sense" to "do things that make the situation better in particular ways that seem like they're likely to be useful" to "do things that make the situation better in particular ways I can see likely uses for" to "do things that make the situation better in particular ways that I definitely have concrete uses for" to "do things that bring about a broad class of later states that I like" to "do things that bring about a very specific range of later states that I like" to "do things that bring about a single specific outcome that I need".

Comment by gjm on [deleted post] 2021-01-08T09:58:15.072Z

I think it genuinely doesn't make sense to say that  reflects our prior expectation of ; the acolyte is correct. What  reflects is our prior on ; that regularization term corresponds exactly to a prior that makes  (multivariate) normally distributed with mean zero and covariance  times the identity (i.e., components independent and each component having variance ).

Comment by gjm on [deleted post] 2021-01-08T09:54:53.152Z

You've dropped a factor of  about half-way through your calculation. And then you've multiplied by  between two lines separated by "="; the idea is that both sides are zero so it kinda-sorta makes sense but it's super-misleading. If you restore the factor of  then your last equation ends up as .

But even this is wrong, I'm afraid. You can't multiply by  there at all. There is no  is not (except by coincidence, and in an ML application if this coincidence happens then you don't have anything like enough data) a square matrix and in general it has no inverse.

There are problems earlier in the derivation, too, which I think are encouraged by some of your nonstandard notation. E.g., you write  rather than  or , and this has fooled you into writing down something wrong for what you write as . That's also nonstandard notation; it's defensible but again it makes it easy to get things wrong by mixing up left and right multiplications. Let's do it with more standard and explicit notation, which will make it harder to make mistakes:

The  is constant and its derivative is zero. The terms linear in  are one another's transposes and readily yield . The second quadratic term is just  whose  is . The first quadratic term is similarly  which equals  whose  is .

So what ends up being zero is the th component of  and if you like you can write . But again you need to be very clear about what you mean by that;  means "the  such that to first order " and so actually the Right Thing to use for the "derivative" is the transpose of what I wrote down above.

Finishing off the correct derivation, we have

 so  so .

Comment by gjm on [deleted post] 2021-01-08T09:04:59.350Z

I'm afraid this is badly wrong.

You can't go from  to . Not even if you were allowed to assume that .

Concrete example: suppose we have two  values, we have  and , and  and . Then the first equation says  which simplifies to , whereas the second equation says  which simplifies to . These are not the same equation and do not have the same consequences.

If you attempt to perform linear regression using the equations you have derived, you will get the wrong answer.

Linear regression is reducible to matrix arithmetic, but the correct equations are slightly more complicated; the thing you need to invert is not the matrix $X$ (which in general is not square and has no inverse) but $X^T X$.
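To make that concrete, here is a tiny pure-Python sketch of least squares via the normal equations. The symbols are assumptions for this example: X is the design matrix (one row per observation), y the targets and w the coefficients, and the data are invented so that the answer is easy to check by hand.

```python
def transpose(m):
    """Transpose a matrix stored as a list of rows."""
    return [list(row) for row in zip(*m)]

def matmul(a, b):
    """Plain matrix product of two lists-of-rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def solve_2x2(m, v):
    """Solve m w = v for a 2x2 matrix m by Cramer's rule."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("X^T X is singular: the columns of X are not independent")
    return [(d * v[0] - b * v[1]) / det, (a * v[1] - c * v[0]) / det]

# Hypothetical data: 4 observations, columns = (intercept, feature); y = 1 + 2 * feature.
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
y = [1.0, 3.0, 5.0, 7.0]

Xt = transpose(X)      # 2 x 4
XtX = matmul(Xt, X)    # 2 x 2: square, unlike X itself, which is 4 x 2
Xty = [sum(a * b for a, b in zip(row, y)) for row in Xt]
w = solve_2x2(XtX, Xty)  # solves the normal equations (X^T X) w = X^T y
print(w)  # intercept and slope
```

Here X has 4 rows and 2 columns, so there is nothing to invert; the 2-by-2 product of its transpose with itself is what gets solved.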

Comment by gjm on Open & Welcome Thread - January 2021 · 2021-01-04T13:39:06.321Z · LW · GW

The first two links are broken for me but Amazon seems to have Beksinski 1-4:

Out of curiosity, what's the art criticism project? The other two things seem very different in kind from this one, on the face of it.

Comment by gjm on Gauging the conscious experience of LessWrong · 2020-12-23T17:25:19.394Z · LW · GW

I endorse all these, and also all of roryokane's that you haven't taken issue with, except that to me "bright" doesn't just mean high notes but also notes whose timbre includes a lot of energy at high frequencies. Also, "soft" is ambiguous between something like Rory's meaning and simply "quiet". (It maybe also suggests to me the opposite of "bright".)

Comment by gjm on Gauging the conscious experience of LessWrong · 2020-12-23T14:05:06.818Z · LW · GW

I have #1 but not #2.

My visual acuity is unusually good (or at least it used to be; 50-year-old eyes don't work as well as 25-year-old ones). I wonder whether there's a noise/resolution tradeoff here.

(Note: by "acuity" I mean something like "resolution after aberrations are corrected"; it's not the same thing as whether you need glasses/contacts, it's how well you see with whatever optical corrections you require.)

Comment by gjm on Gauging the conscious experience of LessWrong · 2020-12-23T13:56:20.264Z · LW · GW

Perhaps people are answering with figures for vividity relative to expectations in some sense.

[EDITED to add:] A specific sense of "relative to expectations" that I think is probably the main thing that's going on here: Vision is an extremely high-bandwidth sense, hearing decidedly less so. If there's limited bandwidth / processing available for synthesizing imaginary/remembered sense experiences, then we should expect visual experiences to fall further short than auditory ones.

(Of course that's an oversimplification. E.g., just how high that bandwidth would need to be may depend on what "level of processing" the imagined/remembered experiences get fed in at; we may be constructing fine details on demand and not notice; etc.)

Comment by gjm on Ideal Chess - drop chess perfected · 2020-12-21T03:31:02.556Z · LW · GW

I'm not keen on giving the king long-range moves either. It's possible that if you don't debuff the other pieces, nothing much like crazyhouse will fail to be terrifying in something like the way crazyhouse is.

I think you're right that small increases in mobility can make a big difference to how hard to checkmate the king is. But the particular increase in mobility you've proposed doesn't change anything at all while the king is on the back rank, and doesn't change much while it's on the second rank. In normal chess, and I think also in crazyhouse, the kings typically stay on the back rank for their own protection, at least until the endgame. Maaaaybe with the extra mobility the kings won't need to stay hidden to the same extent, but I think being exposed is still going to be dangerous even for the enhanced king.

[Meta: I notice that in both of the threads where I've discussed things with you, all my comments have been downvoted. It looks as if the same has happened to other people who have posted comments disagreeing with things you've said. I find that, rationally or not, this makes me a bit reluctant to attempt to discuss anything with you. I suspect the effect on others may be the same. If you are indeed downvoting every comment you don't agree with, you may wish to consider whether the effects of doing so are the ones you want. For the avoidance of doubt, I'm neither complaining nor objecting; a policy of downvoting things one disagrees with is perfectly permissible, although perhaps a bit rude. Also for the avoidance of doubt, I am not claiming that you are downvoting every disagreeing comment; it looks that way but of course there are other possible explanations.]

Comment by gjm on No, Newspeak Won’t Make You Stupid · 2020-12-19T02:59:59.793Z · LW · GW

On objection 2: Your expectation of finding the same pattern whatever two languages you compare is not evidence that in fact it holds. For the avoidance of doubt, I do in fact expect a weak form of the pattern to hold near-universally: languages with, say, fewer bits needed to specify each phoneme will tend to be spoken with those phonemes occurring more rapidly. But you're making a substantially stronger claim -- that this will occur to just the extent required to hold the rate of information transfer constant, and that this will apply when there are vocabulary differences as well as when there are phonological differences -- and it seems to me that when making so strong a claim you really ought to provide more evidence. (Or else present it as a conjecture rather than a factual claim.)

On objection 4: ah, so in fact I should have said not that you did the calculation only for one pair of languages, but that in fact you didn't do the full calculation for any pair of languages!

On objection 6: I'm not sure I understand your objection to my objection :-). If "ganru" and "nuakiagu" actually express the same notions as "momentum" and "derivative" then all that means is that Sona does in fact have those words (I don't see that the fact that they are constructed from simpler parts is significant), and then I don't see what a comparison between Sona and English tells us about differences in vocabulary. (Of course it may only "have those words" for Sona speakers who have learned some physics and mathematics, but that's true of English too: to people without the relevant technical knowledge, "momentum" is the name of a splinter group in a UK political party and "derivative" means "copied from other works" or "one of those weird finance things".)

If you were only ever claiming that a language that lacks certain terms can be extended by adding new words that mean the same as those terms do ... well, sure, I agree, but I thought you were saying something much stronger than that.

Much the same goes for "body-flight". If that means (approximately?) the same thing as "momentum" then what that means is that ballet dancers have discovered some of the same ideas as physicists, and like physicists have coined a word for it. Again, the fact that it's a word made out of smaller meaningful parts doesn't seem relevant to me. Unless you're suggesting, here or in the case of Sona, that just from those parts you can work out the full meaning, so that if you use a smaller language then you never need to learn about momentum because you can just slam together "body" + "flight" and get the right concept by magic. But I bet you aren't suggesting that, because it seems to me very obviously not true.

What? My previous comment was exactly an argument to show that. [that my "obvious naive explanation" explains nothing -- gjm]

Well, as I said, I don't see anything in that comment that looks to me like an argument showing anything of the sort. Of course it's entirely possible that you did in fact make an argument, perhaps a very strong one, with that conclusion, and I just failed to grasp it. (The way it looks to me is that you made some non sequiturs.) If you want to persuade me, then I think you will need to make your argument clearer and more explicit. (Of course you are in no way obliged to make that effort.)

You first had to learn the word ‘momentum’, and second, you had to store this sound and its meaning in the brain. [this is answering my question of what extra effort I need to expend on account of saying "momentum" rather than using a circumlocution -- gjm]

OK, sure. But those are both one-off extra efforts, and (so it seems to me) this effort is amply repaid by the reduced effort every time I need to use the concept thereafter. (Compared, again, to a hypothetical situation in which I don't have a word for "momentum". It seems a bit as if you're now shifting to what seems to me an entirely different claim, namely that we would do better with a different word for "momentum", one whose origins are more transparent. That might be true but if it has anything to do with what you were originally saying, I don't see what.)

So it looks to me as if when I learned the word "momentum" I was paying an immediate price for a larger future benefit. This absolutely is a thing human brains do all the time.

Not every instance of learning a new word will end up actually being a benefit on net. (To take another technical example, once upon a time I learned the meaning of "regular" when applied to a topological space. So far as I can recall, I have never once had a need to use that term or the concept it names, and I probably never will again.) But it seems to me that much vocabulary-learning has positive expected net benefit.

Again, it's possible that the case of technical terminology is misleading; learning the meaning of "luxuriant" isn't much like learning the meaning of "momentum" and its benefits are different. So, while I think it's very obvious that "momentum" pays its way, I wouldn't make nearly so strong a claim for "luxuriant". But I think that, to put it mildly, it is not obvious that having more words has insufficient non-signalling value to explain the fact that lexicons grow, and I am never impressed by "I am not convinced that X is practically useful, therefore X must really be all about signalling", which is what it seems to me your argument comes down to: there are just too many guesses and gaps in the chain from "languages with richer phonology tend to be spoken slower", with which I do agree, to "languages with richer phonology are spoken exactly slower enough to cancel out the difference in information rate" and then to "the same applies to languages with richer vocabulary" and then to "and the same goes for thinking as for communication" and then to "and this doesn't merely hold on average, it holds for every specific case" -- which is the point at which I think you'd have a reasonable basis for claiming that there must be some "non-functional" explanation for large vocabularies, though not necessarily for identifying signalling as the specific best explanation.

Comment by gjm on No, Newspeak Won’t Make You Stupid · 2020-12-18T17:19:24.088Z · LW · GW

I still don't understand what in my first paragraph you were objecting to. (Sorry to belabour the point, but it seems like I'm missing something and I would prefer to avoid miscommunication if possible.) It seems like you think I was saying that MikkW was not using words fancier than strictly necessary throughout, but I wasn't: I was saying the opposite, which is the same thing you were saying.

The only other claims in that first paragraph were (1) that most of MikkW's usage of fancy words (and indeed most of MikkW's OP) was not primarily signalling and (2) that his explanation of vocabulary in terms of signalling was primarily signalling. Those are both disputable, but I don't think you said anything to dispute them.

So I'm still confused about what, specifically, you were objecting to in what I wrote. What am I missing?

Comment by gjm on No, Newspeak Won’t Make You Stupid · 2020-12-18T17:14:55.764Z · LW · GW

I don't at all deny that felicitous word choices may serve a signalling purpose. But e.g. if OJ's lawyers correctly guessed that the jury would find their rhyming nonsense persuasive because of the rhyme, then it seems to me that they used those word choices for a non-signalling purpose. Maybe in some sense they used brain mechanisms whose underlying function is signalling-related, but I don't think that's relevant here; unless I misunderstood, the point of your argument about signalling was something like "we have all these words only for the sake of signalling; this is a sufficient explanation for their use in our language; so it will do us no harm for non-signalling purposes to throw them out", and if some use of those words co-opts our signalling machinery for other purposes then the conclusion no longer holds.

I don't really know how to address a bare unevidenced unargued-for claim that something is "nothing but signalling", but it seems to me that the arts in general are not pure signalling. The fact that many people listen to music even when they're doing it through earphones and no one else can tell what they're listening to is some evidence of that, although of course it's not conclusive (and I don't see how anything could be).

"From the inside" it certainly seems that plenty of poetry is moving, plenty of paintings are pleasant to look at in ways that resemble (e.g.) the ways some actual landscapes, people, etc., are pleasant to look at, and so forth. I would say that that's already enough to show that these arts aren't only signalling. Depending on exactly what the proposition "X is nothing but signaling" means to you, you may well disagree; if so, and if you can give me an idea of what kind of evidence could possibly convince you otherwise, then I'm willing to argue the case :-).

Comment by gjm on No, Newspeak Won’t Make You Stupid · 2020-12-18T10:32:42.681Z · LW · GW

As I just mentioned in a reply to someone else, I don't find your argument about the "cadence of information" convincing.

First, you speak of "the fact that the computer that is the human brain is capable of processing up to X = 60 bits of speech per second, but no more", to which I have to say: [citation needed]. So, objection 1: you haven't justified the claim that there is a fixed number of bits per second of language processing in the brain, nor have you made it clear what bits per second you are counting (the number of bits needed to specify what's being said, given what level of sophistication in prediction?).

Then, you say something with which I do agree: if your language is "denser" then you will tend to speak it slower.

But the argument in the OP goes way further with this than is justified by the evidence it cites. Objection 2: you compare exactly one pair of languages, Mandarin and Hawaiian; as it happens, my guess is that in broad terms the same pattern holds quite generally, but you really need more evidence. Objection 3: the difference you look at between these languages is in phonology, not in vocabulary; it's not obvious that the same goes for both. (Suppose some potential bottleneck in speech decoding "knows" only about sounds and not meanings; then it will impose a tradeoff between richness of phonology and communication speed, but not between richness of vocabulary and communication speed.) Objection 4: in your comparison, the "richer" language is still faster. Objection 5: you consider only spoken language; there might be similar effects for writing and typing, but it's not clear that they're the same. Objection 6: the comparison you offer as evidence is between average data rates, but the likely effect of losing bits of vocabulary is to make particular things harder to express; if communicating simple things becomes faster and communicating complex things becomes slower, this may not show up in such comparisons but could be a big deal. And, most critically, objection 7: all of this is about communication between people rather than thinking within one person's brain, and it's very far from clear that the tradeoffs are anything like the same. (When I am thinking in words, the speed at which I think is not at all the same as the speed at which I speak.)

When you learn physics, economics, art history, analytic philosophy, or whatever, a part of your learning consists of specialized words. If you don't have a word for "momentum", that doesn't stop you talking about momentum, but it makes it clumsier, and it makes higher-level thinking about related topics much clumsier. (Consider e.g. the statement "position and momentum are conjugate variables" in classical dynamics or quantum mechanics.) Imagine trying to do mathematics or physics or economics without a word for "derivative". If I am talking to someone about the behaviour of a system of particles, my use of words like "momentum" and "derivative" is not signalling sophistication, it's a more or less necessary part of expressing what needs to be said. I'm sure I could somehow get my reasoning across with a greatly impoverished vocabulary, but it would require a lot of mental effort that I don't normally need.

Most fancy words aren't technical terms of that sort, of course, and for the avoidance of doubt I am not claiming that if I had to say "much smaller" instead of "greatly impoverished" I'd be handicapped in the same sort of way as if I had to say "mass times velocity" and "change per unit time at infinitely small scales" instead of "momentum" and "derivative". (... Writing that has brought to my attention another way in which circumlocution may hamper communication and thought. "Momentum" is not always just mass times velocity; e.g., in Hamiltonian dynamics one has "generalized momenta", of which a familiar special case is "angular momentum", and using the term "momentum" for all these things is itself a valuable thing. But if you pick an "elementary" circumlocution like "mass times velocity" then it doesn't apply to the more advanced cases, and if you pick a "sophisticated" one like "variable conjugate to position in Hamiltonian dynamics" then, even leaving aside the fact that "conjugate" and "Hamiltonian" and "dynamics" are themselves fancy words, you get something that will make no sense to people who haven't studied the more advanced parts of the theory. Having a word we can use for the simpler concepts and generalize for the harder ones is super-valuable.)

Sorry, I interrupted myself. Again, I'm not saying that all fancy vocabulary has the same sort of value that technical terms do. Technical terms are just a particularly clear illustration of how enriching your vocabulary can make a genuine, and large, difference to how effectively you can think and communicate in a particular domain. (Also, I think some things that are not now technical terms entered the language as technical terms; if so, note that this is a mechanism of language growth that is clearly not driven by signalling.)

I do not agree that the "obvious naive explanation" explains nothing, and so far as I can see you've offered no argument to support that criticism; rather, you've said that it makes a prediction that you know will be false, namely that "you will put in extra effort to achieve the exact same results". I think that's all wrong, and I think at least some of it is wrong even if we stipulate that you're right about constant information cadence and about using circumlocutions being no loss. What extra effort do I put in when I say "momentum" instead of some circumlocution? What extra effort do I put in when I say "communication" instead of some circumlocution? I think that in both cases the "extra effort" is on the side of the circumlocutor.

I haven't yet read Hanson & Simler, but I have read a fair bit of other Hanson and I am aware that, crudely caricatured, he claims that everything is signalling. Again, I suspect that Hanson's talk about signalling is not all about signalling, and that he does it in part because it's an idea that sounds cynically sophisticated and hence high-status. It may be that in TEITB they present compelling evidence and arguments that would change my mind; for the most part, what I've read of Hanson (which is mostly blog posts, so may not be trying to be as rigorous as he could be) doesn't do that, but takes it for granted that if you can posit a kinda-plausible-sounding signalling-based explanation for something then it must be right.

Comment by gjm on No, Newspeak Won’t Make You Stupid · 2020-12-18T09:46:29.333Z · LW · GW

I'm not sure what your first paragraph is disagreeing with me about. Specifically, when you say "In many cases a 2nd Ten Hundred pick would do", do you mean to imply that the reason for MikkW's choice was signalling? I don't see any particular reason to believe that. For any particular word choice, the actual explanation may simply be something like "well, there are five different words that would do and I picked one more or less at random", but of course that doesn't do much to explain how the language grows.

Your third paragraph appears to me to be making much the same point as I was: fancy words are useful for communication because clarity and conciseness matter.

One thing you make more explicit than I did, which may be worth making more explicit still: language isn't just for communication with other people, but also for thinking with, and thinking may be easier with a richer vocabulary. This is of course exactly the claim contradicted by MikkW's title, but I don't think the article does much to justify that contradiction: MikkW first claims that "cadence of information is constant", but supports it with one example comparison, where the difference is in phonology not vocabulary, and where the "poorer" language is in fact still slower -- which doesn't seem to me to offer much reassurance -- and then assumes that the same goes for thinking as for communication, which also seems entirely unjustified to me: I can speed up my speaking if my language is "naively" less dense, but I don't think at the same speed as I speak, and it's not at all obvious that the same speedup opportunities are available there.

[EDITED to fix miscapitalization of "MikkW", which I carelessly copied from the parent comment without checking.]

Comment by gjm on No, Newspeak Won’t Make You Stupid · 2020-12-18T01:22:30.317Z · LW · GW

Although your post uses plenty of sophisticated words, the only part of it that I thought was mostly there to signal sophistication was the part where you said that the reason why people use sophisticated words is mostly to signal sophistication.

A perfectly sufficient explanation for a lot of use of sophisticated words, it seems to me, is the obvious naive one. Sometimes a fancier word expresses a useful idea clearly and concisely, and the alternative would be circumlocution; perhaps, as you suggest, just talking faster could make up for that, but (1) I am not convinced and (2) that might make a smaller language function OK, but no one is choosing between a smaller language spoken faster and a larger one spoken slower, they are choosing what to say on a given occasion and they will speak at about the same rate whether they're using fancy words or not. And sometimes a fancier word has a sound that's better for your purposes for some reason; for an extreme case, consider poetry, where poets will sometimes use quite obscure words because they need a particular metre or rhyme or other sonic effect. Of course some poets sometimes may be signalling sophistication too.

Comment by gjm on Ideal Chess - drop chess perfected · 2020-12-18T01:05:29.989Z · LW · GW

My guess is that this does indeed not strengthen the king enough to make drops not perpetually terrifying. It still has only fairly short-range moves, and you've added two more queenlike pieces to the potential attacking forces. I'm not more than say 60% confident of this, though.