The low Information Density of Eliezer Yudkowsky & LessWrong

post by Felix Olszewski (quick-maths) · 2024-12-30T19:43:59.355Z · LW · GW · 8 comments

TLDR:
I think Eliezer Yudkowsky & many posts on LessWrong fail to keep things concise and to the point.

Actual post:

I think the content from Eliezer Yudkowsky & on LessWrong in general is unnecessarily wordy.

A counterexample, where Eliezer Yudkowsky actually managed to get to the point concisely, is this TED Talk, where he had the external constraint of keeping it to 10 minutes:

An example of a concise post in a forum can be found here, from 2013:

https://news.ycombinator.com/item?id=5248289

Examples of posts that showcase the wordiness of LessWrong:

Why I think this matters:

I think the long books Eliezer Yudkowsky has published were not the optimal way to convey the threat of AGI to humanity.
I am not saying that his books should not have been written. They are good for some people: those who have the time & motivation to read long books.
I think a more concise version would reach more people and would be more effective, e.g. something similar to the TED Talk YouTube video.
I think most people just google stuff & then read the shortest summaries quickly or watch shorter YouTube videos.
I do think there are benefits to keeping things very elaborate, e.g. it makes sure people really do not misunderstand your point on a crucial matter.
But I think this is not a pure tradeoff situation. I think Eliezer Yudkowsky & LessWrong can often be more concise while still getting the exact same points across.

8 comments

Comments sorted by top scores.

comment by CstineSublime · 2024-12-31T01:01:17.228Z · LW(p) · GW(p)

I was under the impression it was a deliberate decision, as the aphorism of Empedocles goes:

What needs saying needs saying twice

Related is what Horace wrote

It is when I struggle to be brief that I become obscure

Now, in case you didn't realize, I'm going meta by repeating similar sentiments over and over. So I'll refer to Professor of Negotiation Strategy Deepak Malhotra, who advises would-be negotiators:

Don't leave it to chance that they interpret what you're saying

Pithy, concise, brief statements lack context. This increases the chances they will be misinterpreted. Consistent misinterpretation is not optimal. You can remedy this, as Eliezer does, by repetition: stating the same thing over and over again. You can give multiple examples of the same sentiment with slight variations. Each adds more context and narrows the band of possible interpretations.

This is not a matter of Kolmogorov complexity. The issue isn't whether it can be compressed and recreated by a theoretically optimal un-compressing machine. The audience is not a theoretically optimal un-compressor.

Have you ever been misinterpreted? How did you deal with it? If you were discussing a topic you thought was extremely important and for which interpretations that veered from your intentions could be very counter-productive, would you try to be as pithy and concise as possible or would you try to minimize and narrow the possible misinterpretations? How would you do that?

comment by cousin_it · 2024-12-30T23:35:19.549Z · LW(p) · GW(p)

I think for a certain time and demographic (which included me then), the wordiness and imagery actually helped. But we were all younger then, maybe smarter, and definitely more open. It doesn't work as much on me now.

Anyway, I'm not sure it needs to be rewritten today. The threat has become easier to see. Lots of people already ask themselves what jobs they'll have, what skills children should learn, how most people will live - given that we already treat our poor and homeless pretty badly. It's not the whole threat, but it's a lower bound threat that feels alarming enough.

comment by winstonBosan · 2024-12-30T22:09:38.224Z · LW(p) · GW(p)

While I agree that we don't live at the Pareto frontier of conciseness, explainability, etc., those are some odd examples to use to support your thesis. And the comparison to the HackerNews post is likely using the wrong reference class.

Two of the three examples are heavily downvoted. Whether that's for untruthful content, stylistic reasons (length, tone, etc.), or memetic reasons (Eliezer ~ prophet), those posts are hardly the poster children of what LessWrong can do or even is.

As for Vanessa Kosoy's piece, the last third was filled with quotations and they had given the "stop here if you don't want my comments" warning. It is also otherwise filled with references to many historically important posts and concepts that require at least a quick refresher to bring the reader up to speed. I suppose Vanessa could have assumed that her readers would be familiar with all those arguments and the nuances in different positions, but that was not her goal.

The specific example used from HackerNews is likely a HackerNews Ask, a format more comparable to the shortform and quick-take formats on LessWrong. Full-fledged posts here and full-fledged posts on HackerNews are actually very comparable. (See below for some data)

Replies from: winstonBosan

↑ comment by winstonBosan · 2024-12-31T01:59:50.881Z · LW(p) · GW(p)

Update - HackerNews posts today and LessWrong posts today are very similar in length. That doesn't mean they do an equal job at being concise - maybe LessWrongers say precious little for the length of their treatises. But deriving the sophistication of the posts is left as an exercise for the readers and is beyond my paygrade:

HackerNews - avg. 2876.125 words. For the current top 10 posts, excluding the two without a word count.^[1]

LessWrong - avg. 2581.2 words. For the top ten posts in the last 24 hrs. (God damn it Zvi)

A few problems with this 5-minute method of comparison:

Not Weighted: A better way to do this would be to compare some kind of karma-weighted score. After all, the people with high karma are the ones we as a community see as really embodying the spirit. Same with HackerNews.
Not Representative: I only took the most popular posts on HackerNews today. There is no reason to think these posts represent what HackerNews has been over the last decade. Similarly, the posts on LessWrong in the last 24 hours are few and also not a very representative cohort.
Non-systematic way to throw out outliers: There was a Project Gutenberg book on HackerNews today. It felt wrong to include the book and I feel justified in its exclusion. But this should be done more systematically.
A lot of discussion and culture building is in the comments, I didn't include that: Ditto
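
To make the "Not Weighted" point concrete, here is a minimal sketch of a score-weighted average. It assumes HN points can stand in for a karma weight, which is only an illustration and not how either site ranks anything; the names `hn_posts`, `plain_mean` and `weighted_mean` are made up for this example. The (points, word count) pairs are the eight HackerNews posts from this comment that have a word count.

```python
# Hypothetical illustration: HN points are used as a stand-in for a karma weight.
# Each pair is (points, word_count) for a HackerNews post listed in this comment.
# The book and the code base are omitted because they have no word count.
hn_posts = [
    (26, 1443),   # Google Sheets webapps
    (178, 1116),  # Dumping Memory to Bypass BitLocker
    (52, 2432),   # Game Boy Advance game in Zig
    (41, 6246),   # Beyond Gradient Averaging
    (19, 1408),   # Lightstorm
    (51, 2656),   # Jack Elam and the Fly
    (38, 3778),   # LineageOS 22
    (137, 3930),  # Automating Factorio Balancers
]


def plain_mean(pairs):
    """Unweighted mean of the word counts."""
    return sum(words for _, words in pairs) / len(pairs)


def weighted_mean(pairs):
    """Mean of the word counts, each weighted by its points."""
    total_points = sum(points for points, _ in pairs)
    return sum(points * words for points, words in pairs) / total_points


if __name__ == "__main__":
    print(f"unweighted:      {plain_mean(hn_posts):.1f} words")
    print(f"points-weighted: {weighted_mean(hn_posts):.1f} words")
```

With these eight posts the weighted figure comes out a little lower than the unweighted one (roughly 2699 versus 2876 words), so on this tiny sample the weighting does not change the picture much. The same function would work for LessWrong if karma per post were collected, which it was not here.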

Markdown table below in case I made a mistake:

| # | Post (source) | Points, submitter, age, comments | Word count |
|---|---|---|---|
| 1 | A Course of Pure Mathematics – G. H. Hardy (1921) [pdf] (gutenberg.org) | 107 points by bikenaga, 4 hours ago, 23 comments | N/A (it is a book) |
| 2 | I keep turning my Google Sheets into phone-friendly webapps, and I can't stop (arstechnica.com) | 26 points by cpeterso, 1 hour ago, 2 comments | 1443 |
| 3 | Dumping Memory to Bypass BitLocker on Windows 11 (noinitrd.github.io) | 178 points by supermatou, 6 hours ago, 120 comments | 1116 |
| 4 | I Wrote a Game Boy Advance Game in Zig (jonot.me) | 52 points by tehnub, 3 hours ago, 10 comments | 2432 |
| 5 | Beyond Gradient Averaging in Parallel Optimization (arxiv.org) | 41 points by shinryudbz, 3 hours ago, 8 comments | 6246 |
| 6 | Lightstorm: Minimalistic Ruby Compiler (llvm.org) | 19 points by eutropia, 2 hours ago, no comments | 1408 |
| 7 | Jack Elam and the Fly in 'Once Upon a Time in the West' (au.dk) | 51 points by chimpanzee, 4 hours ago, 9 comments | 2656 |
| 8 | LineageOS 22 Released (lineageos.org) | 38 points by timschumi, 1 hour ago, 8 comments | 3778 |
| 9 | I made a tiny library for switches and sum types in Lua (github.com/alurm) | 12 points by alurm, 2 hours ago, no comments | N/A (code base) |
| 10 | Learning Solver Design: Automating Factorio Balancers (gianlucaventurini.com) | 137 points by kolui, 8 hours ago, 21 comments | 3930 |
| | Average | | 2876.125 |

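| LessWrong post | Word count |
|---|---|
| DanielTan | 1479 |
| AnnaSal | 1565 |
| Habryka | 1562 |
| Zvi | 10212 |
| chanind&Till | 4479 |
| alexey | 148 |
| This Post | 281 |
| Bostock | 1200 |
| | N/A (can't access) |
| Vishakha | 79 |
| xpostah | 4807 |
| Average | 2581.2 |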
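
For anyone who wants to recheck the two averages, a small sketch follows. It assumes entries marked N/A are simply dropped before averaging, which is what the reported figures imply; `mean_of_known` is a made-up helper name, and `None` is used here to represent an N/A entry.

```python
from statistics import mean

# Word counts as listed in this comment; None marks an entry with no count (N/A).
hn_word_counts = [None, 1443, 1116, 2432, 6246, 1408, 2656, 3778, None, 3930]
lw_word_counts = [1479, 1565, 1562, 10212, 4479, 148, 281, 1200, None, 79, 4807]


def mean_of_known(counts):
    """Average the entries that have a word count, skipping None (N/A)."""
    known = [c for c in counts if c is not None]
    return mean(known)


if __name__ == "__main__":
    print("HackerNews:", mean_of_known(hn_word_counts))
    print("LessWrong: ", mean_of_known(lw_word_counts))
```

Both reported averages check out under that assumption: the HackerNews figure is the mean of 8 posts and the LessWrong figure is the mean of 10. Note how much the single 10212-word post pulls the LessWrong mean up; a median would be less sensitive to it.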

^[1] See Paul Graham's comment for their ranking algorithm: https://news.ycombinator.com/item?id=1781013.

comment by Archimedes · 2024-12-31T00:31:38.860Z · LW(p) · GW(p)

I think your assessment may be largely correct but I do think it's worth considering how things are not always nicely compressible [LW · GW].

comment by Viliam · 2025-01-15T10:08:23.847Z · LW(p) · GW(p)

I'll try to respect your preference for brevity ;)

a shorter version would be very useful -- yes, fully agree
- at least there is readthesequences.com without the comments (10x as much text as the articles)
- there were summaries at LW wiki, but those were too short; we need something medium-sized
there are some good reasons why Eliezer wrote a long text
- there wasn't rationalist community yet, lines had to be drawn to separate it from many existing adjacent communities (atheists, skeptics, libertarians, sci-fi fans, self-help, contrarians, academia...)
- emotional, near-mode appeal -- why should we even care about "being rational"?
- popular bad memes/patterns (mysterious answers, applause lights, "trust the science"...)

tl;dr -- writing for an already existing rationalist(-ish) community is different from writing in order to create a rationalist community

comment by transhumanist_atom_understander · 2024-12-31T01:04:14.735Z · LW(p) · GW(p)

I'm in the process of turning the ideas in a stack of my notebooks into what I hope will be a short paper, which is just one illustration of what I think was the real trade-off, which is between conciseness and time spent writing. Or for another, see the polished 20-page papers on logical decision theory. Though it's not the same, they cover much of the same ground as the older expositions of timeless decision theory and updateless decision theory. There was a long period where these kinds of decision theories were only available through posts, and then through Eliezer Yudkowsky's long TDT paper. That period could not have been skipped, and could only have been shortened in the sense that the same work could have been done faster at the expense of other work. See also this exchange on Twitter, though they're not talking about being concise specifically:

Miles Brundage: I can’t speak to the details of those experiments but I at least read a much higher fraction of your paper outputs than your blog post outputs. Possibly I’m a minority here but I am certainly not the only one.

Eliezer Yudkowsky: Yeah, I tried it and it was way too fucking time-expensive. My guess is that 100x the output you like less ends up having the larger impact on the world.

comment by Felix Olszewski (quick-maths) · 2025-01-02T12:56:34.384Z · LW(p) · GW(p)

This is another example of Eliezer Yudkowsky not being aware that his low information density keeps his theories from reaching a broader audience:

https://youtube.com/shorts/v5GYeKg4YRE?si=KwSairgyr1N7p497
