The low Information Density of Eliezer Yudkowsky & LessWrong

post by Felix Olszewski (quick-maths) · 2024-12-30T19:43:59.355Z · LW · GW · 7 comments


TLDR:
I think Eliezer Yudkowsky & many posts on LessWrong fail to keep things concise and to the point.

 

Actual post:

I think the content from Eliezer Yudkowsky & on LessWrong in general is unnecessarily wordy.

 

A counterexample, where Eliezer Yudkowsky actually managed to get to the point concisely, is this TED Talk, where he had the external constraint of keeping it to 10 minutes:

 

An example of a concise post in a forum can be found here, from 2013:

Examples of posts that showcase the wordiness of LessWrong:


Why I think this matters:

7 comments

Comments sorted by top scores.

comment by CstineSublime · 2024-12-31T01:01:17.228Z · LW(p) · GW(p)

I was under the impression it was a deliberate decision, as the aphorism of Empedocles goes:

What needs saying needs saying twice

Related is what Horace wrote
 

It is when I struggle to be brief that I become obscure

Now, in case you didn't realize, I'm going meta by repeating similar sentiments over and over. So I'll refer to Professor of Negotiation Strategy Deepak Malhotra, who advises would-be negotiators:
 

Don't leave it to chance that they interpret what you're saying

Pithy, concise, brief statements lack context. This increases the chances they will be misinterpreted. Consistent misinterpretation is not optimal. You can remedy this, as Eliezer does, by repetition: stating the same thing over and over again. You can give multiple examples of the same sentiment with slight variations. Each adds more context and narrows the band of possible interpretations.

This is not a matter of Kolmogorov complexity. The issue isn't whether it can be compressed and recreated by a theoretically optimal un-compressing machine. The audience is not a theoretically optimal un-compressor.

Have you ever been misinterpreted? How did you deal with it? If you were discussing a topic you thought was extremely important and for which interpretations that veered from your intentions could be very counter-productive, would you try to be as pithy and concise as possible or would you try to minimize and narrow the possible misinterpretations? How would you do that?

comment by cousin_it · 2024-12-30T23:35:19.549Z · LW(p) · GW(p)

I think for a certain time and demographic (which included me then), the wordiness and imagery actually helped. But we were all younger then, maybe smarter, and definitely more open. It doesn't work as much on me now.

Anyway, I'm not sure it needs to be rewritten today. The threat has become easier to see. Lots of people already ask themselves what jobs they'll have, what skills children should learn, how most people will live - given that we already treat our poor and homeless pretty badly. It's not the whole threat, but it's a lower bound threat that feels alarming enough.

comment by winstonBosan · 2024-12-30T22:09:38.224Z · LW(p) · GW(p)

While I agree that we don't live at the Pareto frontier of conciseness, explainability, etc., those are some odd examples to use to support your thesis. And the comparison to the HackerNews post is likely using the wrong reference class.

Two of the three examples are heavily downvoted. Whether that's because of untruthful content, style (length, tone, etc.), or memetic reasons (Eliezer ~ prophet), those posts are hardly the poster children of what LessWrong can do or even is.

As for Vanessa Kosoy's piece, the last third was filled with quotations, and they had given the "stop here if you don't want my comments" warning. It is also otherwise filled with references to many historically important posts and concepts that require at least a quick refresher to catch the reader up to speed. I suppose Vanessa could have assumed that her readers would be familiar with all those arguments and the nuances in different positions, but that was not her goal.

The specific example used from HackerNews is likely a HackerNews Ask, a format more comparable to the shortform and quick-take formats on LessWrong. Full-fledged posts here and full-fledged posts on HackerNews are actually very comparable. (See below for some data)

Replies from: winstonBosan
comment by winstonBosan · 2024-12-31T01:59:50.881Z · LW(p) · GW(p)

Update - HackerNews posts today and LessWrong posts today are very similar in length. That doesn't mean they do an equal job at being concise - maybe LessWrongers say precious little for the length of their treatises. But deriving the sophistication of the posts is left as an exercise for the readers and beyond my paygrade:

HackerNews - avg. 2876.125 words, for the current top 10 posts.[1]

LessWrong - avg. 2581.2 words, for the top ten posts in the last 24 hrs. (God damn it Zvi)

A few problems with this 5-minute method of comparison:

  • Not Weighted: A better way to do this would be comparing some kind of karma-weighted score. After all, the people with high karma are the ones we as a community see as really embodying the spirit. Same with HackerNews.
  • Not Representative: I only took the most popular posts on HackerNews today. There is no reason to think today's posts represent what HackerNews has been over the last decade. Similarly, the posts on LessWrong in the last 24 hours are few and also not a very representative cohort.
  • Non-systematic way to throw out outliers: There was a Project Gutenberg book on HackerNews today. It felt wrong to include the book and I feel justified in its exclusion. But this should be done more systematically.
  • A lot of discussion and culture building is in the comments, I didn't include that: Ditto

Markdown table below in case I made a mistake:

| # | HackerNews post (source) | Word count | Submission details |
|---|---|---|---|
| 1 | A Course of Pure Mathematics – G. H. Hardy (1921) [pdf] (gutenberg.org) | N/A - It is a book | 107 points by bikenaga 4 hours ago, 23 comments |
| 2 | I keep turning my Google Sheets into phone-friendly webapps, and I can't stop (arstechnica.com) | 1443 | 26 points by cpeterso 1 hour ago, 2 comments |
| 3 | Dumping Memory to Bypass BitLocker on Windows 11 (noinitrd.github.io) | 1116 | 178 points by supermatou 6 hours ago, 120 comments |
| 4 | I Wrote a Game Boy Advance Game in Zig (jonot.me) | 2432 | 52 points by tehnub 3 hours ago, 10 comments |
| 5 | Beyond Gradient Averaging in Parallel Optimization (arxiv.org) | 6246 | 41 points by shinryudbz 3 hours ago, 8 comments |
| 6 | Lightstorm: Minimalistic Ruby Compiler (llvm.org) | 1408 | 19 points by eutropia 2 hours ago, no comments yet |
| 7 | Jack Elam and the Fly in 'Once Upon a Time in the West' (au.dk) | 2656 | 51 points by chimpanzee 4 hours ago, 9 comments |
| 8 | LineageOS 22 Released (lineageos.org) | 3778 | 38 points by timschumi 1 hour ago, 8 comments |
| 9 | I made a tiny library for switches and sum types in Lua (github.com/alurm) | N/A - Code base | 12 points by alurm 2 hours ago, no comments yet |
| 10 | Learning Solver Design: Automating Factorio Balancers (gianlucaventurini.com) | 3930 | 137 points by kolui 8 hours ago, 21 comments |
| | Avg -> | 2876.125 | |

| LessWrong post | Word count |
|---|---|
| DanielTan | 1479 |
| AnnaSal | 1565 |
| Habryka | 1562 |
| Zvi | 10212 |
| chanind&Till | 4479 |
| alexey | 148 |
| This Post | 281 |
| Bostock | 1200 |
| | N/A <- Can't access |
| Vishakha | 79 |
| xpostah | 4807 |
| Avg -> | 2581.2 |
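
For anyone who wants to check the arithmetic, here is a minimal Python sketch that recomputes the two averages from the word counts listed above. The variable and function names are made up for illustration. The median is an extra, not something used in the comparison above: it is included only as one possible, more systematic way to dampen outliers such as the 10212-word post, under the assumption that a robust statistic is acceptable here.

```python
from statistics import median

# Word counts copied from the listing above; entries marked N/A are left out.
hackernews_words = [1443, 1116, 2432, 6246, 1408, 2656, 3778, 3930]
lesswrong_words = [1479, 1565, 1562, 10212, 4479, 148, 281, 1200, 79, 4807]


def mean(counts):
    """Plain arithmetic mean, the statistic used in the comparison above."""
    return sum(counts) / len(counts)


def summarize(label, counts):
    """Return a one-line summary: sample size, mean, and median."""
    return (
        f"{label}: n={len(counts)}, "
        f"mean={mean(counts)}, median={median(counts)}"
    )


if __name__ == "__main__":
    print(summarize("HackerNews", hackernews_words))
    print(summarize("LessWrong", lesswrong_words))
```

The means come out to 2876.125 and 2581.2, matching the figures quoted. Note that the HackerNews average covers 8 posts, since the book and the code base have no word count.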
  1. ^

    See Paul Graham comment for their ranking algos: https://news.ycombinator.com/item?id=1781013.

comment by transhumanist_atom_understander · 2024-12-31T01:04:14.735Z · LW(p) · GW(p)

I'm in the process of turning the ideas in a stack of my notebooks into what I hope will be a short paper, which is just one illustration of what I think was the real trade-off, which is between conciseness and time spent writing. Or for another, see the polished 20-page papers on logical decision theory. Though it's not the same, they cover much of the same ground as the older expositions of timeless decision theory and updateless decision theory. There was a long period where these kinds of decision theories were only available through posts, and then through Eliezer Yudkowsky's long TDT paper. That period could not have been skipped, and could only have been shortened in the sense that the same work could have been done faster at the expense of other work. See also this exchange on Twitter, though they're not talking about being concise specifically:

Miles Brundage: I can’t speak to the details of those experiments but I at least read a much higher fraction of your paper outputs than your blog post outputs. Possibly I’m a minority here but I am certainly not the only one.

Eliezer Yudkowsky: Yeah, I tried it and it was way too fucking time-expensive. My guess is that 100x the output you like less ends up having the larger impact on the world.

comment by Archimedes · 2024-12-31T00:31:38.860Z · LW(p) · GW(p)

I think your assessment may be largely correct but I do think it's worth considering how things are not always nicely compressible [LW · GW].

comment by Felix Olszewski (quick-maths) · 2025-01-02T12:56:34.384Z · LW(p) · GW(p)

This is another example of Eliezer Yudkowsky not being aware that his low information density keeps his theories from reaching a broader audience:

https://youtube.com/shorts/v5GYeKg4YRE?si=KwSairgyr1N7p497