You Have About Five Words

post by Raemon · 2019-03-12T20:30:18.806Z · score: 77 (34 votes) · LW · GW · 37 comments

Cross posted from the EA Forum [EA · GW].

Epistemic Status: all numbers are made up and/or sketchily sourced. Post errs on the side of simplistic poetry – take seriously but not literally.

If you want to coordinate with one person on a thing about something nuanced, you can spend as much time as you want talking to them – answering questions in realtime, addressing confusions as you notice them. You can trust them to go off and attempt complex tasks without as much oversight, and you can decide to change your collective plans quickly and nimbly.

You probably speak at around 100 words per minute. That's 6,000 words per hour. If you talk for 3 hours a day, every workday for a year, you can communicate 4.3 million words worth of nuance.
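The arithmetic above can be checked in a few lines. This is a quick sketch; the post doesn't say how many workdays it assumes, so the 240 workdays/year figure below is my assumption (~48 weeks × 5 days), chosen because it reproduces the post's 4.3 million figure.

```python
# Sanity check of the conversational-bandwidth arithmetic.
WORDS_PER_MINUTE = 100
HOURS_PER_DAY = 3
WORKDAYS_PER_YEAR = 240  # assumed: ~48 weeks x 5 workdays

words_per_hour = WORDS_PER_MINUTE * 60
words_per_year = words_per_hour * HOURS_PER_DAY * WORKDAYS_PER_YEAR

print(f"{words_per_hour:,} words/hour")            # 6,000 words/hour
print(f"{words_per_year / 1e6:.2f}M words/year")   # 4.32M words/year
```

With those assumptions the annual figure comes out to 4.32 million words, matching the "4.3 million words worth of nuance" in the text.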

You can have a real conversation with up to 4 people.

(Last year the small organization I work at considered hiring a 5th person. It turned out to be very costly and we decided to wait, and I think the reasons were related to this phenomenon)

If you want to coordinate on something nuanced with, say, 10 people, you realistically can ask them to read a few books' worth of words. A book is maybe 50,000 words, so you have maybe 200,000 words worth of nuance.

Alternately, you can monologue at people, scaling a conversation past the point where people realistically can ask questions. Either way, you need to hope that your books or your monologues happen to address the particular confusions your 10 teammates have.

If you want to coordinate with 100 people, you can ask them to read a few books, but chances are they won't, unless you have directly incentivized it. They might all read a few books' worth of stuff, but they won't all have read the same books. The information that they can coordinate on is more like "several blogposts." If you're trying to coordinate nerds, maybe those blogposts add up to one book, because nerds like to read.

If you want to coordinate 1,000 people... you realistically get one blogpost, or maybe one blogpost worth of jargon that's hopefully self-explanatory enough to be useful.

If you want to coordinate thousands of people...

You have about five words.

This has ramifications for how complicated a coordinated effort you can attempt.

What if you need all that nuance and to coordinate thousands of people? What would it look like if the world was filled with complicated problems that required lots of people to solve?

I guess it'd look like this one.


Comments sorted by top scores.

comment by Benquo · 2019-03-13T15:31:46.560Z · score: 42 (12 votes) · LW(p) · GW(p)

You're massively underestimating the upper bound.

I've interacted a bunch recently with members of a group of about 2 million people who recite a 245-word creed twice daily, and assemble weekly to read from an 80,000 word text such that the whole text gets read annually. This is nowhere near a complete accounting of engagement with verbal canon within the group. Each of these practices is preceded and followed by an additional standardized text of substantial length, and many people study full-time a much larger canonical text claiming to interpret the core text.

They also engage in behavior patterns that, while they don't necessarily reflect detailed engagement by each person with the content of the core text, do reflect a lot of fine-grained responsiveness to the larger interpretive canon.

You might be closer for what can be done very quickly (within a single generation) under current conditions. But a political movement that plenty of people are newly worried about, and which likely has thousands of members, has a 14-word creed.

comment by Raemon · 2019-03-13T20:09:14.539Z · score: 11 (4 votes) · LW(p) · GW(p)

Nod. Social pressure and/or organizational efforts to read a particular thing together (esp. in public where everyone can see that everyone else is reading) does seem like a thing that would work.

It comes with drawbacks, such as: if it turns out you need to change the 80,000-word text because you picked the wrong text or need to amend it, I expect there to be a lot of political drama surrounding that, and the process by which people build momentum towards changing it would probably be subject to the bandwidth limits I'm pointing to [edit: unless the organization has specifically built in tools to alleviate that].

(Reminder that I specifically said "all numbers are made up and/or sketchily sourced". I'm pointing to order of magnitude. I did consider naming this blogpost "you have about five words" or "you have less than seven words". I think it was a somewhat ironic failure of mine that I went with "you have four words" since it degrades less gracefully than "you have about five words.")

comment by Benquo · 2019-03-14T07:33:09.385Z · score: 3 (2 votes) · LW(p) · GW(p)

14 is still half an order of magnitude above 5, and I don't think neo-Nazis are particularly close to the most complex coordination thousands of people can achieve with a standardized set of words.

comment by Raemon · 2019-03-14T21:02:37.450Z · score: 3 (2 votes) · LW(p) · GW(p)

I suppose, but, again, "all numbers are made up" was the first sentence in this post, and half an order of magnitude feels within bounds of "the general point of the essay holds up."

I also don't currently know of anyone writing on LessWrong or EA forum who should have reason to believe they are as coordinated as the neo-Nazis are here. (See elsethread comment on my take on the state of EA coordination, which was the motivation for this post).

(In Romeo's terms, the neo-Nazis are also using a social tech with unfolding complexity, where their actual coordinated action is "recite the pledge every day", which lets them encode additional information. But to get this you need to spend your initial coordinated action on that unfolding action.)

comment by drethelin · 2019-03-13T06:47:32.511Z · score: 25 (11 votes) · LW(p) · GW(p)

Walmart coordinates 2.2 million people directly and millions more indirectly.

Even the boy scouts coordinates 2.7 million.

Religions coordinate, to a greater or lesser extent, far more.

The key to coordination is to not consider yourself as an individual measuring out a ration of words you can force x number of people to read. Most people never read the bible.

comment by ryan_b · 2019-03-13T14:21:19.480Z · score: 21 (9 votes) · LW(p) · GW(p)

These are good examples that drive the point home.

Most people never read the bible.

They don't coordinate based on the nuanced information in it, either. Mostly they coordinate on a few very short statements, like:

Say you are Christian.

Go to church.

A much smaller group of people coordinates on a few more:

Give money to the church.

Run a food drive OR help build houses OR staff a soup kitchen OR ...

The Walmart example seems a little different, because it isn't as though working at Walmart is that different from any other kind of hourly employment. Mostly all employers try to get people to coordinate on a few crucial things:

Show up on time.

Count the money correctly.

Stock the shelves.

Sweep the floor.

And it seems to me there is never a shortage of preachers or employers complaining about people's inability to do even these basic things.

It looks to me like successful coordination on the scale of millions largely amounts to iterating four-word actions.

comment by romeostevensit · 2019-03-14T02:03:16.761Z · score: 15 (6 votes) · LW(p) · GW(p)

Agree, and I'd roll in the incentives more closely. It feels more like:

you have at most space for a few feedback loops

you can improve this by making one of the feedback loops a checklist that makes calls out to other feedback loops

the tighter and more directly incentivized the feedback loop, the more you can pack in

every employer/organization is trying to hire/recruit people who can hold more feedback loops at once and do some unsupervised load balancing between them

you can make some of people's feedback loops managing another person's feedback loops

Now jump to this post

another frame is that instead of thinking about how many bits you can successfully transmit, think about whether the behaviors implied by the bits you transmit can run in loops, whether the loops are supervised or unsupervised and what range of noise they remain stable under.

comment by mr-hire · 2019-03-14T13:05:10.401Z · score: 1 (1 votes) · LW(p) · GW(p)

I didn't make the leap from bits of information to feedback loops, but it makes intuitive sense. Transmitting information that compresses well, by giving people the tools to figure out the information themselves, seems useful.

comment by Raemon · 2019-03-14T20:44:55.812Z · score: 10 (6 votes) · LW(p) · GW(p)

Heh, "read the sequences" clocks in at 3 words.

comment by Raemon · 2019-03-13T20:00:38.121Z · score: 10 (6 votes) · LW(p) · GW(p)

The point is not "rationing out your words" is the correct way to coordinate people. The point is that you need to attend, as part of your coordination strategy, to the fact that most people won't read most of your words. Insofar as your coordination strategy relies on lots of people hearing an idea, the idea needs to degrade gracefully as it loses bandwidth.

Walmart I expect to do most of its coordination via oral tradition. (At the supermarket I worked at, I got one set of cultural onboarding from the store manager, who gave a big speech... which began and ended with a reminder that "the four virtues of the Great Atlantic and Pacific Tea Company are integrity, respect, teamwork and responsibility." Then I learned most of the minutiae of how to run a cash register, do janitorial duties, or be a baker via on-the-job training, from someone who spent several weeks telling me what to do and giving me corrective feedback.)

(Several years later, I have some leftover kinesthetic knowledge of how to run a cash register, and the dangling words "integrity, respect, teamwork, responsibility" in my head, although also I probably only have that because I thought the virtues were sort of funny and wrote a song about it)

comment by catherio · 2019-05-24T01:11:57.561Z · score: 4 (2 votes) · LW(p) · GW(p)

The recent EA Meta Fund announcement linked to this post, highlighting another parallel approach: in addition to picking idea expressions that fail gracefully, prefer transmission methods that preserve nuance.

comment by Dagon · 2019-03-12T23:58:41.730Z · score: 13 (5 votes) · LW(p) · GW(p)

Hierarchies (which provide information-cheap mechanisms for coordination) and associative processes (which get people with shared information closer, so less information exchange is necessary) both would seem to expand the numbers greatly from those you suggest.

There are examples of fairly complicated cooperation across many millions. For example, all the expectations behind credit card usage take many pages of contracts, which implicitly depend on many volumes of law, which implicitly depend on uncountable bits of history and social norms.

comment by Raemon · 2019-03-13T00:50:33.186Z · score: 3 (6 votes) · LW(p) · GW(p)

Yes, but it's important to note that if you haven't purposefully built that hierarchy, you can't rely on it existing. (And, it's still a fairly common problem within an org for communication to break down as it scales – I'd argue that most companies don't end up successfully solving this problem)

The motivating example for this post at-the-time-of-writing was that in the EA sphere, there's a nuanced claim made about "EA being talent constrained", which large numbers of people misinterpreted to mean "we need people who are pretty talented" and not "we need highly specific talents, and the reason EA is talent constrained is that the median EA does not have these talents."

There were nuanced blogposts discussing it, but in the EAsphere, the shared information is capped at roughly "1 book worth of content and jargon, which needs to cover a diverse array of concepts, so any given concept won't necessarily have much nuance", and in this case it appeared to hit the literal four word limit.

comment by Dagon · 2019-03-13T21:05:53.759Z · score: 2 (1 votes) · LW(p) · GW(p)

It might be worth a second post examining the reasons that the standard and well-known coordination mechanisms (force, social pressure, hierarchy, broadcast/mass media, etc.) aren't available for the kind of coordination you think is needed, and what you're considering as replacements (or just accepting that a loosely-committed voluntary group with no direct rewards or sanctions has a cap on effectiveness).

(note: I'm not particularly EA-focused; this is a trap) Or perhaps a description of how "the EA community" can have needs that require such coordination, as opposed to actual projects that clearly need aggregated effort to have impact.

comment by Raemon · 2019-03-13T21:10:47.585Z · score: 2 (1 votes) · LW(p) · GW(p)

I do think that'd be a valuable post (and that sort of thing is going on on the EA forum right now, with people proposing various ways to solve a particular scaling problem). I don't know that I have particularly good ideas there, although I do have some. The point of this post was just "don't be surprised when your message loses nuance if you haven't made special efforts to prevent it from doing so" (or if it gets out-competed by a less nuanced message that was designed to be scalable and/or viral).

I wrote this post in part so that I could more easily reference it later, either once I had concrete ideas about what to do, or when I think someone's strategy is mistaken because they're missing this insight.

comment by Dagon · 2019-03-13T22:10:52.115Z · score: 3 (2 votes) · LW(p) · GW(p)

Fair enough. Interestingly, if I replace "coordinate with" with "communicate a nuanced belief to", my reaction changes radically, in favor of numbers shaped like yours. I'll have to think more about why those concepts are so different.

comment by Raemon · 2019-03-13T22:14:40.609Z · score: 4 (2 votes) · LW(p) · GW(p)

Nod. The claim here is specifically about how much nuance can be relevant to your coordination, not how many people you can coordinate with. (If this failed to come across, that also says something about communicating nuance being hard)

comment by Dagon · 2019-03-14T00:15:44.804Z · score: 4 (2 votes) · LW(p) · GW(p)

I think I was taking "coordination" in the narrow sense of incenting people to do actions toward a relatively straightforward goal that they may or may not share. In that view, nuance is the enemy of coordination, and most of the work is simplifying the instructions so that it's OK that there's not much information transmitted. If the goal is communication, rather than near-term action, you can't avoid the necessity of detail.

comment by Raemon · 2019-03-14T00:32:52.125Z · score: 4 (2 votes) · LW(p) · GW(p)

The whole point is that coordination looks different at different scales.

So, I think I was looking at this through a nonstandard frame (Maybe more nonstandard than I thought). There are two different sets of numbers in this post:

— 4.3 million words worth of nuance

— 200,000 words of nuance

— 50,000 words

— 1 blogpost (1-2k words)

— 4 words

And separately:

— 1-4 people

— 10 people

— 100 people

— 1000 people

— 10,000 people+

While I'm not very confident about any of the numbers, I am more confident in the first set of numbers than the second set.

If I look out into the world, I see clear failures (and successes) of communication strategies that cluster around different strata of communication bandwidth. And in particular, there is clearly some point at which the bandwidth collapses to 3-6 words.

comment by Raemon · 2019-03-13T19:50:11.615Z · score: 12 (6 votes) · LW(p) · GW(p)

So, I think I optimized this piece a bit too much as poetry at the expense of clarity. (I was trying to keep it brief overall, and have the sections' lengths roughly correspond to how much you could expect people to read at each scale.)

Obviously people in the real world do successfully coordinate on things, and this piece doesn't address the various ways you might try to do so. The core claim here is just that if you haven't taken some kind of special effort to ensure your nuanced message will scale, it will probably not scale.

Hierarchies are a way to address the problem. Oral tradition that embeds itself in people's socializing process is a way to address the problem. Smaller groups are a way to address the problem. Social pressure to read a specific thing is a way to address the problem. But each of these addresses it only in particular ways and comes with particular tradeoffs.

comment by Benquo · 2019-03-13T16:46:58.394Z · score: 9 (4 votes) · LW(p) · GW(p)

A productive thing to do here would be to try to reconcile the claim that a large number of people can't reasonably be expected to read more than a few words, and the claim that something like EA or Rationalism is possible at anything like the current scale. These are in obvious tension.

Another claim to reconcile with yours would be a claim that there's anything like law going on, or really anything other than gang warfare.

comment by Raemon · 2019-03-13T20:33:50.022Z · score: 18 (3 votes) · LW(p) · GW(p)

My claim is "a large number of people can't reasonably be expected to read more than a few words in common", which I think is subtly different (in addition to the thing where this post wasn't about ways to address the problem, it was about the default state of the problem in the absence of an explicit coordination mechanism)

If your book-length-treatise reaches 1000 people, probably 10-50 of those people read the book and paid careful attention, 100 people read the book, a couple hundred people skimmed the book, and the rest just absorbed a few key points secondhand.

I think it is in fact a failure of law that the law has grown to the point where a single person can't possibly know it all, and only specialists can know most of it (because this creates an environment where most people don't know what laws they're breaking, which enables certain kinds of abuse).

I think the way EA and LessWrong work is that there's a large body of work people are vaguely expected to read. (In the case of LessWrong, I think the core sequences are around [edit: a million words – I initially was using my cached pageCount rather than wordCount]; not sure how big the overall EA corpus is.) EA and LW are filtered by "nerds who like to read", so you get to be on the higher end of the spectrum of how many people have read how much.

But, it still seems like a few things end up happening:

Important essays definitely lose nuance. "Politics is the Mind-Killer" is one of the common examples of something where the original essay got game-of-telephoned pretty hard by oral culture.

Similarly, EA empirically runs into messaging issues: even though 80k had intentionally tried to downplay the "Earning to Give" recommendation, people still primarily associated 80k with Earning to Give years later. And when they finally successfully switched the message to "EA is talent constrained", that got misconstrued as well.

Empirically, people also successfully rely on a common culture to some degree. My sense is that the people who tend to do serious work, get jobs, and stick around are ones who have read at least a good chunk of the words, and they somewhat filter themselves into groups that have read particular subsets. The fact that there are 1000+ people misunderstanding "Politics is the Mind-Killer" doesn't mean there aren't also 100-200 people who remember the original claim.

(There are probably different clusters of people who have read different clusters of words, i.e. people who have read the sequences, people who have read Doing Good Better, people who have read a smattering of essays from each as well as the old Givewell blogs, etc)

One problem facing EA is that there is not much coordination on which words are the right ones to read. Doing Good Better was written with a goal of being "the thing you gave people as their cultural onboarding tool", AFAICT. But which 80k essays are you supposed to have read? All of them? I dunno, that's a lot and I certainly haven't, and it's not obvious that that's a better use of my time than reading up on machine learning or the AI Alignment Forum or going off to learn new things that aren't part of the core community material.

comment by Vaniver · 2019-03-13T21:46:50.613Z · score: 3 (1 votes) · LW(p) · GW(p)
In the case of LessWrong, I think the core sequences are around 10,000 words, not sure how big the overall EA corpus is.

This feel like a 100x underestimate; The Sequences clocks in at over a million words, I believe, and it's not the case that only 1% of the words are core.

comment by Raemon · 2019-03-13T21:51:55.158Z · score: 2 (1 votes) · LW(p) · GW(p)

Whoops. I was confusing pages with words.

comment by Raemon · 2019-03-13T22:06:59.620Z · score: 2 (1 votes) · LW(p) · GW(p)

(The mental-action I was performing was "observing what seems to actually happen and then grab the numbers that I remembered coinciding with those actions", rather than working backwards from a model of numbers, which may or may not have been a good procedure, but in any case means that being off by a factor of 100 doesn't influence the surrounding text much)

comment by ryan_b · 2019-03-13T14:38:20.078Z · score: 6 (3 votes) · LW(p) · GW(p)

This puts me in mind of the mandatory reading of a narrative memo they use at Amazon, which appears to conform to the 'several blog posts' level of coordination. It is hierarchically enforced, and the people who use it are the senior leadership which has, I assume, a capability distribution heavily weighted towards the top of the scale.

Also relevant is the Things I Learned From Working With a Marketing Advisor [LW · GW] post.

comment by Raemon · 2019-03-14T21:53:54.550Z · score: 5 (2 votes) · LW(p) · GW(p)

I think the actual final limit is something like:

Coordinated actions can't take up more bandwidth than someone's working memory (which is something like 7 chunks, and if you're using all 7 chunks then they don't have any spare chunks to handle weird edge cases).

A lot of coordination (and communication) is about reducing the chunk-size of actions. This is why jargon is useful, habits and training are useful (as well as checklists and forms and bureaucracy), since that can condense an otherwise unworkably long instruction into something people can manage.

"Go to the store and get eggs" comes with a bunch of implicit knowledge about cars or bikes or where the store is and what eggs are, etc.

comment by Yoav Ravid · 2019-03-15T07:59:05.204Z · score: 3 (2 votes) · LW(p) · GW(p)

What is meant by 7 chunks? Seems like that in itself was condensed jargon that I didn't understand :P

comment by Raemon · 2019-03-15T18:11:23.026Z · score: 6 (3 votes) · LW(p) · GW(p)

"Something that your mind thinks of as one unit, even if it's in fact a cluster of things."

"Go to the store" is four words. But "go" actually means "stand up. Walk to the door. Open the door. Walk to your car. Open your car door. Get inside. Take the key out of your pocket. Put the key in the ignition slot..." etc. (Which are in turn actually broken into smaller steps, like "lift your front leg up while adjusting your weight forward".)

But you are capable of taking all of that and chunking it as the concept "go somewhere" (as well as the meta-concept of "go to the place whichever way is most convenient, which might be walking or biking or taking a bus"), although if you have to use a form of transport you are less familiar with, remembering how to do it might take up a lot of working memory slots, leaving you liable to forget other parts of your plan.

comment by Yoav Ravid · 2019-03-15T19:42:45.729Z · score: 1 (1 votes) · LW(p) · GW(p)

So "7 chunks" was used as almost a synonym for "7 words"? I thought that was some cool concept from neuroscience about working memory :)

comment by Raemon · 2019-03-15T20:56:11.481Z · score: 2 (1 votes) · LW(p) · GW(p)

I think the near-synonym nature is more about convergent evolution (i.e. words aim to reflect concepts, and working memory is about handling concepts).

comment by Jacobian · 2019-03-13T15:58:32.025Z · score: 2 (1 votes) · LW(p) · GW(p)

This immediately got me thinking about politics.

How many voters could tell you what Obama's platform was in 2008? But 70,000,000 of them agreed on "Hope and Change". How many could do the same for Trump? But they agreed on "Make America Great Again". McCain, Romney, and Hillary didn't have a four-words-or-less memorable slogan, and so...

comment by Raemon · 2019-03-13T19:54:05.627Z · score: 2 (1 votes) · LW(p) · GW(p)

I'm actually two levels of surprised here. I'd have naively expected McCain, Romney, and Hillary to have competent enough staffers to make sure they had a slogan, and sort of passively assumed they had one. It'd be surprising if they didn't have one, and if they did have one, surprising that I hadn't heard it. (I hung out in blue tribe spaces, so it's not that weird that I'd have failed to hear McCain's or Romney's.)

Quick googling says that Hillary's team thought about 84 slogans before settling on "Stronger Together", which I don't remember hearing. (I think instead I heard a bunch of anti-Trump slogans like "Love Trumps Hate", which maybe just outcompeted it?)

comment by philh · 2019-03-18T07:58:21.794Z · score: 2 (1 votes) · LW(p) · GW(p)

I had been under the impression that Hillary's was "I'm with her"? But I think I mostly heard that in the context of people saying it was a bad slogan.

comment by TristanTrim · 2019-08-26T16:01:24.757Z · score: 1 (1 votes) · LW(p) · GW(p)

I like this direction of thought, and I suspect it is true as a general rule, but it ignores the incentive people have for correctly receiving the information, and the structure through which the information is disseminated. Both factors (and probably others I haven't thought of) would increase or decrease how much information could be transferred.

comment by Yoav Ravid · 2019-03-13T14:37:29.740Z · score: 1 (1 votes) · LW(p) · GW(p)

So, an action coordination website [? · GW] should be able to phrase actions in four words?

This idea seems interesting; I'd love to see it formulated more fully.

Do shorter kickstarter descriptions get funded more?

Do protest events on Facebook which have a shorter description get more attendees?

It probably also depends on personality – if you want to coordinate people who are high in conscientiousness, you may need more words; for low conscientiousness, fewer words. And if you want both, then you need to give a clear 4-word heading and a bunch of nuance below.

comment by Raemon · 2019-03-13T22:08:38.879Z · score: 2 (1 votes) · LW(p) · GW(p)

I don't think this directly bears on how to build an action coordination website, other than that, in lieu of such a site, you should expect action coordination to succeed at the 4-word level of complexity. I haven't thought as much about how to account for this when actually trying to build a coordination platform.

But I do think that kickstarters tend to succeed more if the 4-word version of them is intuitively appealing.