The LessWrong 2022 Review

post by habryka (habryka4) · 2023-12-05T04:00:00.000Z · LW · GW · 43 comments

Contents

    Getting Started
    No books this year, sorry folks
  How does the review work?
    Phase 1: Preliminary Voting
      Writing a short review
      Why preliminary voting? Why two voting phases?
      How is preliminary voting calculated?
    Phase 2: Reviews
    Phase 3: Final Voting
43 comments

The snow is falling, the carols are starting, and we all know it's time for our favorite winter holiday tradition. It's LessWrong review time!

Each year we come together and review posts that are at least one year old. That means for the next two months we are reviewing all posts from 2022.

While our everyday lives are filled with fads and chasing the sweet taste of karma and social approval, the LessWrong review is the time to take a step back and ask ourselves "did this actually help me think better?", "did this actually turn out to be valuable?" and "which things withstood further and extended scrutiny?".

We've done this 4 times so far (2018 [LW · GW], 2019 [LW · GW], 2020 [LW · GW], 2021 [LW · GW]).

The full technical details of how the Annual Review works are in the final section [LW · GW] of this post, but it's basically the same as the past few years. There are three phases:

  1. Preliminary Voting Phase (2 weeks, Dec 4–17): We identify posts especially worthy of consideration in the review by casting preliminary votes. Posts with 2 preliminary votes move into the Discussion Phase.
  2. Discussion Phase (4 weeks, Dec 17–Jan 14): We review and debate posts. Posts that receive at least one written review move to the final voting phase.
  3. Final Voting (2 weeks, Jan 14–Jan 28): We do a full voting pass, using quadratic voting. The outcome determines the Annual Review results.

For more of the philosophy of the Annual Review, see the previous announcement posts here [LW · GW], here [LW · GW], here [LW · GW], and here [LW · GW].

Getting Started

At the top of any post eligible for the review, you will see this:

These will be your preliminary votes for the 2022 review. Posts need to get at least 2 preliminary votes (positive or negative) in order to move to the next phase of the review.

To start perusing posts, I recommend going to the All 2022 Posts page [? · GW], or the View Your Past Upvotes [? · GW] page. Note: only users with accounts registered before January 2022 are eligible to vote.

No books this year, sorry folks

For 2018, 2019, and 2020 we printed books of the results of the review. We have sold many thousands of them, I am very proud of them, and many people have told me that these are among their favorite things that they own:

2018: A Map that Reflects the Territory (Amazon)
2019: The Engines of Cognition (Amazon)
2020: The Carving of Reality (Amazon)

Sadly, there won't be a book this year (and also not of the 2021 review). The effort involved in making them is hard to justify with increasing demands from many of our other projects (as well as reduced funding, since if you take into account the 4-5 staff months these cost to make each year, we net lost money on these).

I am thinking about other ways to create an easy-to-reference artifact that captures the results of this year's and last year's review. I think the minimum I want to do is to create a good ebook and maybe an audio version using our machine narration (or doing human narration). Additional suggestions are welcome.

We are going to be doing a Christmas sale of all of the previous years' books in the next few days, and hopefully before Christmas we will also have a good ebook (and maybe even an audiobook version) available of last year's review results.

How does the review work?

Phase 1: Preliminary Voting

To nominate a post, cast a preliminary vote for it. Eligible voters will see this UI:

If you think a post was an important intellectual contribution, you can cast a vote indicating roughly how important it was. For some rough guidance:

You can vote at the top of a post page, or anywhere the post appears in a list (like the All Posts page [? · GW], or the new View Your Past Upvotes [? · GW] page).

Posts that get at least one positive vote go to the Voting Dashboard, where other users can vote on them. You’re encouraged to give at least a rough vote based on what you remember from last year. It's okay (encouraged!) to change your mind later.

Writing a short review

If you feel a post was important, you’re also encouraged to write up at least a short review of it saying what stands out about the post and why it matters. (You’re welcome to write multiple reviews of a post, if you want to start by jotting down your quick impressions, and later review it in more detail)

Posts with at least one review get sorted to the top of the list of posts to vote on [? · GW], so if you'd like a post to get more attention it's helpful to review it.

Why preliminary voting? Why two voting phases?

Each year, more posts get written on LessWrong. The first Review of 2018 considered 1,500 posts. In 2021, there were 4,250. Processing that many posts is a lot of work. 

Preliminary voting is designed to help handle the increased number of posts. Instead of simply nominating posts, we start directly with a vote. Those preliminary votes will then be published, and only posts that at least two people voted on go to the next round.
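As a rough illustration of the filter described above, the sketch below counts non-zero preliminary votes per post and keeps the posts that reach two. It is not LessWrong's actual code: the ballot layout and the names `advances_to_review` and `MIN_VOTES` are hypothetical, and it assumes (following the Getting Started section) that negative votes count toward the threshold.

```python
from collections import Counter

MIN_VOTES = 2  # threshold stated in the post; the constant name is hypothetical


def advances_to_review(ballots: list[tuple[str, str, int]]) -> set[str]:
    """Return post ids with at least MIN_VOTES non-zero preliminary votes.

    Each ballot is (voter_id, post_id, vote_strength). Assumes each voter
    appears at most once per post and that a strength of 0 means "no vote".
    """
    counts = Counter(post for _voter, post, strength in ballots if strength != 0)
    return {post for post, n in counts.items() if n >= MIN_VOTES}


ballots = [
    ("alice", "post-a", 4),
    ("bob", "post-a", -1),   # a negative vote still counts toward the threshold
    ("carol", "post-b", 9),  # only one voter, so post-b does not advance
    ("dave", "post-c", 0),   # a neutral vote is ignored
]
print(advances_to_review(ballots))  # {'post-a'}
```

Counting only positive votes instead would be a one-line change to the filter (`strength > 0`).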

In the review phase this allows individual site members to notice if something seems particularly inaccurate in its placing. If you think a post was inaccurately ranked low, you can write a positive review arguing it should be higher, which other people can take into account for the final vote. Posts which received lots of middling votes can get deprioritized in the review phase, allowing us to focus on the conversations that are most likely to matter for the final result.

How is preliminary voting calculated?

You can cast an unlimited number of votes, but after a certain threshold, the greater the total score of your votes, the less influential each of your votes will be. On the back end, we use a modified quadratic voting system [LW · GW], which allocates a fixed number of points across your votes based on how strong they are.

Fine details: A vote of 1 costs 1 point. A vote of 4 costs 10 points. A vote of 9 costs 45 points. If you spend more than 500 points, your votes start to become proportionally weaker.
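The three example costs above (1 → 1, 4 → 10, 9 → 45) are consistent with a triangular-number cost of n(n+1)/2 points for a vote of strength n. The sketch below works under that inference. It also assumes that negative votes cost the same as positive ones, and that "proportionally weaker" means every vote is scaled by `budget / spent` once the 500-point budget is exceeded. Both assumptions and all function names are hypothetical, so the real back end may differ.

```python
BUDGET = 500  # point threshold stated in the post


def vote_cost(strength: int) -> int:
    """Triangular-number cost inferred from the post's examples (assumption)."""
    n = abs(strength)  # assumes negative votes cost the same as positive ones
    return n * (n + 1) // 2


def total_cost(votes: dict[str, int]) -> int:
    """Total points spent across all of one voter's votes."""
    return sum(vote_cost(v) for v in votes.values())


def effective_votes(votes: dict[str, int], budget: int = BUDGET) -> dict[str, float]:
    """Scale every vote down linearly once the voter overspends (assumed rule)."""
    spent = total_cost(votes)
    scale = 1.0 if spent <= budget else budget / spent
    return {post: strength * scale for post, strength in votes.items()}


print(vote_cost(1), vote_cost(4), vote_cost(9))  # 1 10 45

# Twelve votes of strength 9 cost 12 * 45 = 540 points, which is over budget,
# so under the assumed rule each vote is weakened by a factor of 500 / 540.
heavy_voter = {f"post-{i}": 9 for i in range(12)}
print(total_cost(heavy_voter))  # 540
```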

Phase 2: Reviews

The second phase is a month long, and focuses entirely on writing reviews. Reviews are special comments that evaluate a post. Good questions to answer in a review include:

Phase 3: Final Voting

Posts that receive at least one review move on to the Final Voting Phase.

The UI will require voters to at least briefly skim reviews before finalizing their vote for each post, so arguments about each post can be considered. 

As in previous years, we'll publish the voting results for users with 1000+ karma, as well as all users. The LessWrong moderation team will take the voting results as a strong indicator of which posts to include in the Best of 2022 sequence.
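One way to picture the two published result sets is as the same tally run twice, once over all voters and once restricted to voters with 1000+ karma. The sketch below assumes a post's score is the plain sum of vote strengths. That scoring rule, the data layout, and the names `tally` and `KARMA_CUTOFF` are illustrative assumptions, not the site's actual implementation.

```python
from collections import defaultdict

KARMA_CUTOFF = 1000  # "1000+ karma" from the post; the constant name is hypothetical


def tally(ballots, voter_karma, min_karma=0):
    """Sum vote strengths per post, counting only voters with karma >= min_karma."""
    totals = defaultdict(int)
    for voter, post, strength in ballots:
        if voter_karma.get(voter, 0) >= min_karma:
            totals[post] += strength
    return dict(totals)


ballots = [
    ("alice", "post-a", 4),
    ("bob", "post-a", -1),
    ("bob", "post-b", 9),
]
voter_karma = {"alice": 2500, "bob": 300}

all_users = tally(ballots, voter_karma)                 # every voter counts
high_karma = tally(ballots, voter_karma, KARMA_CUTOFF)  # only 1000+ karma voters
print(all_users)   # {'post-a': 3, 'post-b': 9}
print(high_karma)  # {'post-a': 4}
```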

To get started, you can View Your Past Upvotes [? · GW] and start voting on some posts.

43 comments

Comments sorted by top scores.

comment by Multicore (KaynanK) · 2023-12-05T15:59:18.697Z · LW(p) · GW(p)

Predict the winners at

comment by Neel Nanda (neel-nanda-1) · 2023-12-05T19:42:41.881Z · LW(p) · GW(p)

Is research published elsewhere but cross-posted here eligible? E.g. I think that Toy Models of Superposition was one of the best papers of last year, and it was cross-posted to LessWrong (https://www.lesswrong.com/posts/CTh74TaWgvRiXnkS6/toy-models-of-superposition [LW · GW]) and came out of the overall alignment space, but isn't exactly a LessWrong post per se.

(notably, my grokking work and causal scrubbing were mech interp research that WAS published on LessWrong first and foremost)

Replies from: habryka4
comment by habryka (habryka4) · 2023-12-05T20:10:07.851Z · LW(p) · GW(p)

Yep, cross-posted works are eligible! 

If something was created in 2022 but not crossposted, I am also happy to backdate it to include it in the review. 

comment by Alex_Altair · 2023-12-05T20:59:58.569Z · LW(p) · GW(p)

I'm curious what you would estimate the cost of producing the books to be. That is, how much would someone have to donate to pay for Lightcone to produce the books?

Replies from: Raemon
comment by Raemon · 2023-12-05T22:21:20.119Z · LW(p) · GW(p)

It’s historically been a couple months of salary time + a bunch of intermittent work over the course of the year. I think it’s at least $20k and plausibly like $40k. Plus the actual team time not being able to be spent on other things. (The books get sold at cost so this money is a cost to the org)

We tried hiring a bookmaker last year which didn’t work out. The hiring process was also pretty costly.

I think the actual cost is more like ‘do the headhunting to find someone who’d do a great job’.

Replies from: Raemon, Benito
comment by Raemon · 2023-12-05T23:25:35.110Z · LW(p) · GW(p)

The hardest part here is ensuring that whoever we hire can actually work self-directedly, without constant management. We've spent 3 years trying to make books efficiently and not succeeded yet, which I think is making us more risk-averse to trying again (although I do have some ideas on how to do it)

I think if someone who had previously made a particularly great HPMOR, Slate Star Codex or Sequences custom book, has good project-management skills, overall good aesthetic taste, and is proficient with both AI art and reworking essay diagrams that were low res to be printable resolution...

...I'd at least personally be pretty interested in hiring that person if they seemed to clearly demonstrate all the skills.

comment by Ben Pace (Benito) · 2023-12-05T23:15:32.221Z · LW(p) · GW(p)

Briefly registering disagreement: my first thought was an order of magnitude higher than yours. 

Brief sketch of my reasoning: Losing a staff member for 1-2 months really cuts out our ability to maintain the infrastructure we have responsibility for (like Lighthaven and Lightspeed grants and LW) while running at the organizational top priority — right now that's dialogues — and we're already stretched thin with only 2 people working on the top-priority full-time who don't have any side commitments (plus 2 other people working on it as their main focus but with side commitments). I've not got a definite sense of how we'd rearrange, but I can see worlds where it would cut our focus on the top priority by as much as 30% during that period, and that's not just the cost measured in the staff member's time, but reduces the value of everyone's time in a big way.

Replies from: Raemon
comment by Raemon · 2023-12-05T23:18:15.340Z · LW(p) · GW(p)

Oh yeah I'm pretty easily sold on "Actually it's just more like $200k" for reasons you cite, although it gets into more intangibles that are harder to quantify. ($200k seems more likely to be "our Cheerful Price [LW · GW]", but I suspect if we got a $40k donation we'd consider it more strongly anyway, in part because it was an indication someone thought it was that valuable)

Replies from: tomcatfish, adam-jermyn
comment by Alex Vermillion (tomcatfish) · 2023-12-15T22:54:30.763Z · LW(p) · GW(p)

(Expressing confusion here, not frustration or another "negative" emotion)

This number doesn't seem to make any sense. You suggest making an ebook, and that should be most of the heavy effort handled if you can ever get to a point of reusing a previous year's printing process. It's not really clear to me how it can take that much time and/or effort.

I'm only bringing this up because the books were pretty cool and me buying a set actually convinced some non-LW-reading folks to buy some, so it seemed a pretty neat outreach opportunity, if we can ever find an HTML->ebook->book pipeline.

Also please please please don't make a machine-read audiobook, it makes the writing look less valued than not making an audiobook at all.

Replies from: habryka4
comment by habryka (habryka4) · 2023-12-16T00:02:52.311Z · LW(p) · GW(p)

Look, I also really thought this. And then we did it three times and each time it took hundreds (and sometimes over a thousand) hours. I also had my inside-view violated, but I updated towards the outside-view after trying this three times and each time finding it to be a quite massive endeavor with a lot of details.

Also please please please don't make a machine-read audiobook, it makes the writing look less valued than not making an audiobook at all.

No worries, if we make an audiobook we would be collaborating with T3Audio on making a human narration. 

Replies from: tomcatfish
comment by Alex Vermillion (tomcatfish) · 2023-12-16T05:19:46.753Z · LW(p) · GW(p)

Huh, I believe you did this and I believe you got the result, but I just have no model for what the heck is going on. It happens sometimes I guess, but damn I cannot grasp this.

Replies from: Benito
comment by Ben Pace (Benito) · 2024-01-01T22:45:35.046Z · LW(p) · GW(p)

I'm not entirely certain of the cause of it taking so much work. I will say that meeting the standard of "beautiful, professional book" requires all of the details to be okay. Here's a quickly-generated list of possible details that can go wrong:

  • A resized image with blurry/unreadable text in it
  • Some misaligned text in the running header
  • Some misaligned text on the outer cover
  • Some of the text's color being the wrong shade of gray/black
  • Mis-spelling someone's username
  • Having the text no longer quite accurately describe the new versions of the images (never mind the work involved in re-making all the images to fit the reduced-for-cost color-scheme of the printed book)
  • Image color coming out differently in print relative to its appearance in photoshop/indesign
  • Fixing critical typos
  • Figuring out how to deal with text that only makes sense if you can click on the hyperlink
  • Ensuring there are no duplicated paragraphs or short amounts of text that hang over on a bare page on their own
  • Math/LaTeX needs to not look horrendous. Perhaps you need to make an image for it, and then you must ensure that it's the same size font as the rest of the text.

A lot of stuff has to be re-checked every time you make a change (e.g. "We've reduced the margin between the text and the outside of the page by a quarter of an inch in order to reduce the total number of pages and decrease cost. This means we need to do another visual check of ~1000 pages to make sure nothing broke.") 

There's a lot of low-level details that I need to get right so that it correctly fits in the category of 'beautiful item made with love' rather than 'cheap amazon self-made book'. I think a book where we spent half the time on the details could end up being quite disappointing on net.

Replies from: tomcatfish
comment by Alex Vermillion (tomcatfish) · 2024-01-02T00:08:43.456Z · LW(p) · GW(p)

I'd like to express pretty large appreciation for the answer; this changes what things I, personally, was planning to do wrt the finished product. Thank you

Replies from: Benito
comment by Ben Pace (Benito) · 2024-01-02T00:12:51.608Z · LW(p) · GW(p)

You're quite welcome.

I am curious what this refers to?

this changes what things I, personally, was planning to do wrt the finished product.

Replies from: tomcatfish
comment by Alex Vermillion (tomcatfish) · 2024-01-02T02:49:22.800Z · LW(p) · GW(p)

I was waiting to see what you guys turned out as an ebook or sequence and trying to see if I could take it to a printer for a personal copy.

Now I understand that the difficulty is a layer earlier and it's worth figuring out how to "make an ebook for printing", not "print an ebook".

comment by Adam Jermyn (adam-jermyn) · 2023-12-06T04:51:33.505Z · LW(p) · GW(p)

I'm guessing that the sales numbers aren't high enough to make $200k if sold at plausible markups?

Replies from: Raemon
comment by Raemon · 2023-12-06T04:53:12.305Z · LW(p) · GW(p)

The sales are at cost and don’t make money on net.

Replies from: moridinamael, Raemon, FireStormOOO
comment by moridinamael · 2023-12-13T19:00:30.224Z · LW(p) · GW(p)

Well, there’s your problem!

comment by Raemon · 2023-12-06T04:54:14.893Z · LW(p) · GW(p)

(It’s hard to price 4-book sets at this scale of printing at a price that makes sense)

comment by FireStormOOO · 2024-01-18T03:48:24.905Z · LW(p) · GW(p)

If you're selling them at unit cost you aren't selling at cost, you're straightforwardly selling at a loss.  That's definitely not what I'm thinking of when someone tells me they're selling at cost.

Replies from: habryka4
comment by habryka (habryka4) · 2024-01-18T05:34:25.181Z · LW(p) · GW(p)

(We're not selling them at marginal/unit cost, we were selling them so that roughly a whole print run breaks even, including some budget for labor-time/opportunity-cost, but less than people's full salaries for that period)

Replies from: FireStormOOO
comment by FireStormOOO · 2024-01-18T06:42:04.826Z · LW(p) · GW(p)

Ah, gotcha.  I had gotten the other impression from the thread in aggregate.

comment by Vanessa Kosoy (vanessa-kosoy) · 2023-12-05T08:46:48.755Z · LW(p) · GW(p)

The LessWrong moderation team will take the voting results as a strong indicator of which posts to include in the Best of 2022 sequence.

Will there also be a Best of 2021 sequence at some point?

Replies from: habryka4
comment by habryka (habryka4) · 2023-12-05T09:08:41.146Z · LW(p) · GW(p)

Yep, I am working on it right now! 

comment by Yitz (yitz) · 2023-12-06T19:44:50.065Z · LW(p) · GW(p)

Can I write a retrospective review of my own post(s)?

Replies from: habryka4
comment by habryka (habryka4) · 2023-12-06T19:54:34.655Z · LW(p) · GW(p)

Yep! Self-reviews are encouraged.

Replies from: MondSemmel
comment by MondSemmel · 2023-12-13T09:47:44.938Z · LW(p) · GW(p)

Self-reviews and postmortems are great! Even a caricature of a self-review provides valuable information: "Look at my great/terrible take from last year. I've changed my mind about nothing/everything since." And of course the actual self-reviews are much more useful than that.

comment by Nicholas / Heather Kross (NicholasKross) · 2024-01-17T23:24:02.865Z · LW(p) · GW(p)

if you take into account the 4-5 staff months these cost to make each year, we net lost money on these

For the record, if each book-set had cost $40 or even $50, I still would have bought them, right on release, every time. (This was before my financial situation improved, and before the present USD inflation.)

I can't speak for everyone's financial situation, though. But I (personally) mentally categorize these as "community-endorsement luxury-type goods", since all the posts are already online anyway.

The rationality community is unusually good about not selling ingroup-merch when it doesn't need or want to. These book sets are the perfect exceptions.

comment by ryan_greenblatt · 2023-12-08T00:17:22.803Z · LW(p) · GW(p)

Preliminary Voting Phase (2 weeks, Dec 4 — 17): We identify posts especially worthy of consideration in the review casting preliminary votes. Posts with 2 preliminary votes move into the Discussion Phase

[...]

These will be your preliminary votes for the 2022 review. Posts need to get at least 2 preliminary votes (positive or negative) in order to move to the next phase of the review.

Suppose I think a post is misleading or bad. Is the intended approach here that I negative vote in the preliminary phase and then (possibly) write a negative review?

Replies from: Raemon, habryka4
comment by Raemon · 2023-12-08T02:53:24.159Z · LW(p) · GW(p)

Posts also need at least 2 positive votes to get into the Review Phase, so you can wait to see if it seems overrated before putting the effort into a negative review. (although if you think it was already overhyped and want to correct the record anyway, that sounds fine too)

Replies from: Benito
comment by Ben Pace (Benito) · 2023-12-08T03:07:50.884Z · LW(p) · GW(p)

I discussed this with Oli and he argued that negative votes were also typically strong evidence that the post was worth reviewing. I was persuaded by the argument that it means someone went out of their way to say "there is something actively bad about this post and I really think it should not get a high score", and that probably means that a review of what's bad about the post would be worthwhile to read (that I would learn something interesting or valuable from it).

Replies from: Raemon
comment by Raemon · 2023-12-08T03:15:20.169Z · LW(p) · GW(p)

I certainly buy that as an argument, but don’t know that it’s obviously worth prioritizing before checking that anyone actively cared about it positively. Lots of posts are bad, you can’t cover all of them.

Replies from: Benito
comment by Ben Pace (Benito) · 2023-12-08T03:18:40.442Z · LW(p) · GW(p)

I have the experience when voting in the review that I don't vote on most posts, and my negative votes only go on importantly bad posts. Empirically I don't expect people will downvote anywhere near all of ~4500 posts written in 2022, and I think the 10-20 that people will downvote have a ~100x chance relative to baseline of being worth reviewing. (Perhaps 100x is a bit strong but 30x seems reasonable to me.)

Replies from: Yoav Ravid
comment by Yoav Ravid · 2023-12-08T05:44:30.274Z · LW(p) · GW(p)

I think that's a good policy and making it explicit (by writing something about it in the announcement posts) would be even better. Then when people know that a downvote in the preliminary voting phase means "This is bad and we should pay attention to that", they'll be more likely to use downvotes that way.

Replies from: Raemon
comment by Raemon · 2023-12-08T08:55:51.129Z · LW(p) · GW(p)

(Flagging it’s still technically required to get 2 positive votes to proceed to review phase)

Replies from: habryka4
comment by habryka (habryka4) · 2023-12-09T09:10:41.396Z · LW(p) · GW(p)

(I am currently planning to change that before the start of the review phase, unless it turns out to be hard for some reason)

Replies from: Raemon
comment by Raemon · 2023-12-09T18:06:45.838Z · LW(p) · GW(p)

It should be easy, although I am worried that will result in the review phase being even more overwhelming than it usually is, and the benefit doesn't seem that great to me.

Replies from: habryka4
comment by habryka (habryka4) · 2023-12-09T18:35:56.306Z · LW(p) · GW(p)

I would be really surprised if it added more than 4 posts to the review, and I am confident that at least 2 of those posts would seem really important for you and me to indeed be reviewed.

Replies from: Raemon
comment by Raemon · 2023-12-09T18:38:19.475Z · LW(p) · GW(p)

It currently adds 128 posts (query I just ran was "has at least 1 positive vote" vs "has at least 2 positive votes". I'm not 100% sure what query you were planning)

I don't really get why it's worth reviewing bad posts that don't have at least 2 people who think they're good. (I buy that posts that at least some people think were good but are controversial might be interesting/important)

Replies from: habryka4
comment by habryka (habryka4) · 2023-12-09T21:45:16.082Z · LW(p) · GW(p)

Sorry, but that's a totally different query that has nothing to do with what I said? What I said is let's add posts with at least two votes, whether positive or negative? So you should compare "has at least 2 positive votes" to "has at least 2 non-zero votes".

Negative votes have a ton of information in them! It means someone thought it was worth spending points sending an active signal that they thought the post was bad. For example, if everyone who votes thinks the Waluigi post, or the Simulators post, is bad, it would be terrible for us to not review them, given the importance they had in the discourse nevertheless. 

comment by habryka (habryka4) · 2023-12-08T00:35:54.383Z · LW(p) · GW(p)

Yep! That's what I would do.

comment by Review Bot · 2024-02-14T06:49:11.438Z · LW(p) · GW(p)

The LessWrong Review [? · GW] runs every year to select the posts that have most stood the test of time. This post is not yet eligible for review, but will be at the end of 2024. The top fifty or so posts are featured prominently on the site throughout the year.

Hopefully, the review is better than karma at judging enduring value. If we have accurate prediction markets on the review results, maybe we can have better incentives on LessWrong today. Will this post make the top fifty?

comment by Alex Vermillion (tomcatfish) · 2023-12-15T22:56:53.806Z · LW(p) · GW(p)

Just flagging as a thing to consider that several of my favorite posts this year were actually shortish sequences, and that if they make it, it might be worth figuring out something nice to do for them. I suppose if they partially make it, that's a question too.