The LessWrong 2018 Review

post by Raemon · 2019-11-21T02:50:58.262Z · LW · GW · 87 comments

Contents

  Improving the Idea Pipeline
  Goals
      Longterm incentives, feedback and rewards
      A highly curated "Best of 2018" sequence / book
      Common knowledge about the LW community's collective epistemic state regarding controversial posts
  Review Process
    Nomination Phase
      1 week (Nov 20th – Dec 1st)
    Review Phase
      4 weeks (Dec 1st – Dec 31st)
    Voting Phase
      1 week (Jan 1st – Jan 7th)
    Books and Rewards
      Public Writeup / Aggregation
      Best of 2018 Book / Sequence
      Prizes

LessWrong is currently doing a major review of 2018, looking back at old posts and considering which of them have stood the test of time. It has three phases: a Nomination Phase, a Review Phase, and a Voting Phase.

Authors will have a chance to edit posts in response to feedback, and then the moderation team will compile the best posts into a physical book and LessWrong sequence, with $2000 in prizes given out to the top 3-5 posts and up to $2000 given out to people who write the best reviews.

This is the first week of the LessWrong 2018 Review – an experiment in improving the LessWrong Community's longterm feedback and reward cycle.

This post begins by exploring the motivations for this project (first at a high level of abstraction, then getting into some more concrete goals), before diving into the details of the process.

Improving the Idea Pipeline

In his LW 2.0 Strategic Overview, habryka noted:

We need to build on each other’s intellectual contributions, archive important content, and avoid primarily being news-driven.

We need to improve the signal-to-noise ratio for the average reader, and only broadcast the most important writing

[...]

Modern science is plagued by severe problems, but of humanity’s institutions it has perhaps the strongest record of being able to build successfully on its previous ideas. 

The physics community has this system where the new ideas get put into journals, and then eventually if they’re important, and true, they get turned into textbooks, which are then read by the upcoming generation of physicists, who then write new papers based on the findings in the textbooks. All good scientific fields have good textbooks, and your undergrad years are largely spent reading them.

Over the past couple years, much of my focus has been on the early-stages of LessWrong's idea pipeline – creating affordance for off-the-cuff conversation, brainstorming, and exploration of paradigms that are still under development (with features like shortform and moderation tools).

But, the beginning of the idea-pipeline is, well, not the end.

I've written a couple times about what the later stages of the idea-pipeline might look like. My best guess is still something like this:

I want LessWrong to encourage extremely high quality intellectual labor. I think the best way to go about this is through escalating positive rewards, rather than strong initial filters.

Right now our highest reward is getting into the curated section, which... just isn't actually that high a bar. We only curate posts if we think they are making a good point. But if we set the curated bar at "extremely well written and extremely epistemically rigorous and extremely useful", we would basically never be able to curate anything.

My current guess is that there should be a "higher than curated" level, and that the general expectation should be that posts should only be put in that section after getting reviewed, scrutinized, and most likely rewritten at least once. 

I still have a lot of uncertainty about the right way to go about a review process, and various members of the LW team have somewhat different takes on it.

I've heard lots of complaints about mainstream science peer review: that reviewing is often a thankless task; the quality of review varies dramatically, and is often entangled with weird political games.

Meanwhile: LessWrong posts cover a variety of topics – some empirical, some philosophical. In many cases it's hard to directly evaluate their truth or usefulness. LessWrong team members had differing opinions on what sort of evaluation is most useful or practical.

I'm not sure if the best process is more open/public (harnessing the wisdom of crowds) or private (relying on the judgment of a small number of thinkers). The current approach involves a mix of both.

What I'm most confident in is that the review should focus on older posts. 

New posts often feel exciting, but a year later, looking back, you can ask whether a post has actually become a helpful intellectual tool. (I'm also excited for the idea that, in future years, the process could also include reconsidering previously-reviewed posts, if there's been something like a "replication crisis" in the intervening time.)

Regardless, I consider the LessWrong Review process to be an experiment, which will likely evolve in the coming years. 


Goals

Before delving into the process, I wanted to go over the high level goals for the project:

1. Improve our longterm incentives, feedback, and rewards for authors

2. Create a highly curated "Best of 2018" sequence / physical book

3. Create common knowledge about the LW community's collective epistemic state regarding controversial posts

Longterm incentives, feedback and rewards

Right now, authors on LessWrong are rewarded essentially by comments, voting, and other people citing their work. This is fine, as things go, but has a few issues:

The aim of the Review is to address those concerns by: 

A highly curated "Best of 2018" sequence / book

Many users don't participate in the day-to-day discussion on LessWrong, but want to easily find the best content. 

To those users, a "Best Of" sequence that includes not only posts that seemed exciting at the time, but also distilled reviews and followup, seems like a good value proposition. Meanwhile, it helps move the site away from being a time-sensitive newsfeed.

Common knowledge about the LW community's collective epistemic state regarding controversial posts

Some posts are highly upvoted because everyone agrees they're true and important. Other posts are upvoted because they're more like exciting hypotheses. There's a lot of disagreement about which claims are actually true, but that disagreement is crudely measured in comments from a vocal minority.

The end of the review process includes a straightforward vote on which posts seem (in retrospect) useful, and which seem "epistemically sound". This is not the end of the conversation about which posts are making true claims that carve reality at its joints, but my hope is for it to ground that discussion in a clearer group-epistemic state.


Review Process

Nomination Phase 

1 week (Nov 20th – Dec 1st)

Review Phase 

4 weeks (Dec 1st – Dec 31st)

Voting Phase

1 week (Jan 1st – Jan 7th)

Posts that got at least one review proceed to the voting phase. The details of this are still being fleshed out, but the current plan is:

Books and Rewards

Public Writeup / Aggregation

Soon afterwards (hopefully within a week), the votes will all be publicly available. A few different aggregate statistics will be available, including the raw average, and potentially some attempt at a "karma-weighted average."
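To make the two aggregate statistics concrete, here is a minimal sketch of how a raw average and one possible "karma-weighted average" might be computed. The details of voting were still being worked out, so everything here is an assumption for illustration: the `Vote` record, its field names, and especially the logarithmic weighting are hypothetical, not the scheme the LessWrong team used.

```python
import math
from dataclasses import dataclass


@dataclass
class Vote:
    # Both fields are hypothetical names chosen for this sketch.
    voter_karma: int  # the voter's site karma
    score: float      # the score this voter gave the post


def raw_average(votes):
    """Unweighted mean of all scores for one post."""
    if not votes:
        raise ValueError("need at least one vote")
    return sum(v.score for v in votes) / len(votes)


def karma_weighted_average(votes):
    """Mean of scores weighted by 1 + log10(karma).

    The log weighting is an assumption: it lets high-karma voters
    count for more without letting them dominate completely.
    """
    if not votes:
        raise ValueError("need at least one vote")
    weights = [1 + math.log10(max(v.voter_karma, 1)) for v in votes]
    total = sum(w * v.score for w, v in zip(weights, votes))
    return total / sum(weights)


votes = [Vote(voter_karma=10, score=1.0), Vote(voter_karma=1000, score=4.0)]
print(raw_average(votes))
print(karma_weighted_average(votes))
```

With these two example votes the weighted figure lands closer to the high-karma voter's score than the raw average does, which is the whole point of the weighting. Whether that is desirable is exactly the kind of question the public writeup is meant to surface.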

Best of 2018 Book / Sequence

Sometime later, the LessWrong moderation team will put together a physical book (and online sequence) of the best posts and most valuable reviews.

This will involve a lot of editor discretion – the team will essentially take the public review process and use it as input for the construction of a book and sequence. 

I have a lot of uncertainty about the shape of the book. I'm guessing it'd include anywhere from 10-50 posts, along with particularly good reviews of those posts, and some additional commentary from the LW team.

Note: This may involve some custom editing to handle things like hyperlinks, which may work differently in printed media than online blogposts. This will involve some back-and-forth with the authors.

Prizes

87 comments

Comments sorted by top scores.

comment by clone of saturn · 2019-11-22T08:18:51.951Z · LW(p) · GW(p)

I've added support for this on GreaterWrong; you can view nominated posts here [? · GW] and all 2018 posts here [? · GW].

comment by Raemon · 2019-11-22T20:44:29.233Z · LW(p) · GW(p)

Thanks!

comment by Vaniver · 2019-11-22T19:43:51.304Z · LW(p) · GW(p)

What is the plan for incorporation of comments into the book?

I'm guessing for most posts they'll just be omitted, and it'll be fine (or perhaps some curated selection of comments will make it into the book). But I notice that Unreal's Circling [LW · GW] seems to be a historically relevant post that I would only want to endorse if it came along with a substantial fraction of the discussion in the comments (in a way that would dramatically lengthen its section, possibly 'taking over' the book).

comment by JenniferRM · 2019-12-08T08:01:09.255Z · LW(p) · GW(p)

I hunted your comment down here and upvoted it strongly.

I basically only write comments, and when I write "comments for the ages" that I feel proud of, I consider it a good sign if they (1) get many upvotes (especially votes that arrive after lots of competing sibling comments already exist) and (2) do not get any responses (except "Wow! Good! Thanks!" kind of stuff).

Looking at "first level comments" to worthwhile OPs according to a measure like this might provide some interesting and reasonably brief postscripts.

Applying the same basic measure to posts themselves, if an OP gets a large number of direct replies that are highly upvoted that OP may not be dense with relatively useful and/or flawless content. (Though there are probably exceptions that could be detected by thoughtful curating... for example, if the OP is a request for ideas [LW · GW] then a lot of highly voted comments are kinda the point.)

comment by Raemon · 2019-11-22T20:47:29.302Z · LW(p) · GW(p)

A lot of the details are up in the air – over the next week I plan to write out a lot of my thoughts and open questions about the review process, and how it should feed into the overall end product.

One option is to include a curated selection of comments from the post. Another is to sort of leave that up to reviewers, to distill those comments down into a more succinct encapsulation of them. In some cases it might be that the commenters "got it right the first time", and basically wrote a fine "review-like comment" back in 2018, and there should be some way of marking an old comment as a review, retroactively.

A middle ground might be something like "in addition to summarizing key points from the previous discussion, reviewers can point to particular comments that seem worth including".

In the end, the editors will make some judgment calls about how much fits – we definitely wouldn't include the entire comment section of Circling. My guess is that the upper bound of "amount of comments and/or reviews from a given post to include" is roughly the same as "the upper bound for a post." (In some cases posts are quite long, but maybe expect the median comments/reviews-length to be comparable to the median post length)

comment by leggi · 2019-12-08T12:19:11.183Z · LW(p) · GW(p)

I would like to see some comments considered for inclusion - those that expand on a post in some way (the circling post is a good example).

Also, I read Slack [LW · GW]. I liked it, and then G Gordon Worley's comment brought a new dimension to the concept and expanded my 'knowledge base' about things I've not really thought about.

comment by habryka (habryka4) · 2019-11-22T20:01:08.031Z · LW(p) · GW(p)

We have a draft book that tried to do this for some posts on LessWrong. If you ping Ben you can probably take a look at it if you want. 

comment by Kaj_Sotala · 2019-12-08T22:26:39.704Z · LW(p) · GW(p)

Occasionally I think about writing a review, but then feel like I'm too confused to do so.

Some of my open questions:

  • I'm unsure of what to write. The post says that "A good frame of reference for the reviews are shorter versions of LessWrong or SlatestarCodex book reviews (which do a combination of epistemic spot checks, summarizing, and contextualizing)", but this feels like weird advice for reviewing a blog post, which is much shorter than a book. Especially the "summarizing" bit - for most posts the content is already too short for further summarizing to make sense. This guideline confuses me more than it helps.
  • If I just ignore the guideline and think about what would make sense to me, it would be... something like my longer nomination comments. But I already posted those as nominations. Should I re-post some of them as reviews? That seems silly.
  • I don't know which posts I should review. I won't have the chance to review all of them, so I should pick just a few. But which ones? The post says "Posts that got at least one review proceed to the voting phase", which makes it sound like reviews are like nominations / votes; a post won't be included unless it gets at least one vote. That creates an incentive for me not to review posts I don't like, since even a critical review might cause it to get to the voting stage. So I should probably focus on reviewing the posts that I like. That conclusion does not seem like it's what was intended, though.
  • Also, I'm not sure of how to review posts that I didn't like. The posts that got to this stage are generally decent quality, and I don't have major criticisms of them. If I don't think that something should be included in a collection of best posts, then my reason is generally "I didn't seem to have gotten any lasting value out of it". But someone else did, or else it would not have been nominated. There's no point in me posting a review saying "I didn't get lasting value out of this, but of course someone else might have".
comment by Raemon · 2019-12-08T23:05:28.135Z · LW(p) · GW(p)

There should be a post coming up soon that goes into more examples of how to do Reviews. It's a bit of a tough question because different posts benefit from different types of reviews.

A thing that I think is commonly useful is asking "what are the actual claims this post is making", and listing them succinctly, and writing up some thoughts about how we could actually empirically check if those claims are true. (Even if we don't actually run the experiment, I think operationalizing what observations we'd expect in the world is helpful for evaluating when/why/whether the post is valid)

comment by Raemon · 2019-12-08T23:08:58.466Z · LW(p) · GW(p)

One of the key ideas here is that I'd like posts to have gotten someone to "look into the dark [LW · GW]". If the post wasn't as useful as it seemed, how would we know? If 10 years from now you no longer endorsed the post, why might that be?

comment by Raemon · 2019-12-09T23:58:04.310Z · LW(p) · GW(p)

Here's a review of mine [LW(p) · GW(p)] that I think is pretty representative of the sort of review that I, personally, am most excited about.

comment by Raemon · 2019-11-21T03:11:34.933Z · LW(p) · GW(p)

Perhaps worth noting (ironically)

I just went to begin looking over the 2018 posts, thinking about my own nominations. I was immediately hit with a bit of paralysis of "aaah but I don't even know what standards to employ here – I feel like I want to take a long time to think about all the posts I might want to nominate and how they compare and how they fit into the big picture" (plus, a bit of Pat Modesto whispering in my ear saying "who are youuuu to decide what posts are good!?")

And, well, if I'm experiencing that, it seems like others might be as well.

So, wanted to explicitly note: I think this process will be more fruitful (as well as more fun) if it's more like an evolving conversation than a bunch of people silently thinking independently. A lot of the value is in getting old posts back into the public spotlight in a concentrated way.

So, I'd err on the side of going ahead and nominating things that seem good – you can retract the nomination later if you feel like it was a mistake. You can also start with a relatively brief nomination-endorsement-message that gives the rough gist of why a post was valuable, and later follow it up with a more extensive message when you have time.

comment by Raemon · 2019-11-22T21:28:05.989Z · LW(p) · GW(p)

Update: Posts need at least 2 nominations to proceed to the Review Phase.

I initially left this requirement as a somewhat vague "sufficient nominations", because I wasn't sure how many people would be engaging with the process and how thoroughly. I'm less worried about that now, and meanwhile I think there's a fairly substantial shift from "at least one person liked this and took time to say so" to "at least two people liked it."

(The goal is still to have the Review Phase include 50-100 posts, which could potentially mean the nomination-requirement goes even higher, although I think that's unlikely)

Meanwhile, it's still the case that:

  • It's okay to leave short nominations (that might link to an existing nomination-comment saying 'what they said')
  • It's still useful to have as many concrete details about how the post has been useful, as possible. (i.e. order of magnitude of number of conversations/projects/decisions that the post was useful to, and roughly estimating the magnitude of how useful it was)

(Also note: in another hour or two some new UI will launch that tells you the number of nominations each post has on the nominations page, which should make the process a bit easier)
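For concreteness, the nomination rule above could be sketched roughly as follows. This is only an illustration under stated assumptions: the function name, the flat list of post IDs (one entry per nomination, each from a distinct nominator), and the "raise the bar until at most 100 posts remain" loop are all hypothetical, not how the site actually implements it.

```python
from collections import Counter


def posts_for_review(nominations, min_nominations=2, max_posts=100):
    """Return (threshold, posts) for the Review Phase.

    `nominations` is a list of post IDs, one entry per nomination
    (assumed to come from distinct nominators). The threshold starts
    at `min_nominations` and is raised only if too many posts qualify.
    """
    counts = Counter(nominations)
    threshold = min_nominations
    while True:
        selected = sorted(p for p, c in counts.items() if c >= threshold)
        if len(selected) <= max_posts:
            return threshold, selected
        threshold += 1


nominations = ["post-a", "post-a", "post-b", "post-c", "post-c", "post-c"]
print(posts_for_review(nominations))
```

In this example "post-b" has a single nomination and does not proceed, while "post-a" and "post-c" do. The threshold would only climb above 2 if more than `max_posts` posts cleared it.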

comment by John_Maxwell (John_Maxwell_IV) · 2019-11-24T05:04:13.829Z · LW(p) · GW(p)

Is there some way I can see all the posts I upvoted in 2018 so I can figure out which I think are worthy of nomination?

Compiling the results into a physical book. I find there's something... literally weighty about having your work in printed form. And because it's much harder to edit books than blogposts, the printing gives authors an extra incentive to clean up their past work or improve the pedagogy.

Physical books are also often read in a different mental mode, with a longer attention span, etc. You could also sell it as a Kindle book to get the same effect. Smashwords is a service that lets you upload a book once and sell it on many different platforms.

The end of the review process includes a straightforward vote on which posts seem (in retrospect) useful, and which seem "epistemically sound". This is not the end of the conversation about which posts are making true claims that carve reality at its joints, but my hope is for it to ground that discussion in a clearer group-epistemic state.

Is the idea to only include in the review those posts which are almost universally regarded as "epistemically sound"?

comment by Raemon · 2019-11-24T07:13:47.786Z · LW(p) · GW(p)

Is there some way I can see all the posts I upvoted in 2018 so I can figure out which I think are worthy of nomination?

Not currently – I agree that'd be a good feature, although there's probably a few other comparably good features worth building to improve the nomination UI experience and not sure if I'd get to them all this year.

Is the idea to only include in the review those posts which are almost universally regarded as "epistemically sound"?

I'm not sure exactly, but I'd at least want clearer epistemic flags on things. (I can imagine a case where there are some posts that seem clearly important, but still have some questionable claims, and the author hasn't had time to update them. In that case, one option might be to include the work as-is, but follow it up with some commentary, either from a reviewer during the Review Phase or from a member of the moderation team).

comment by Kaj_Sotala · 2019-11-21T12:06:00.106Z · LW(p) · GW(p)

A couple of comments on nomination UI:

  • On mobile at least, the "nominate" pop-up has a "submit" button but not a "cancel" button, which is a little inconvenient in cases where I realize I'd like to go back to check some detail about the post that I'm nominating before I nominate it
  • It would be nice if existing nominations would have a button saying "endorse this nomination" or something, so if I essentially just agree with an existing nomination and don't have anything to add, I have an easy way to add another vote to it. Making a top-level comment saying just "what Raemon's nomination said" feels weird, since the social convention is that comments which only endorse another comment are posted as responses to the comment that they are endorsing.
comment by Raemon · 2019-11-21T17:44:26.699Z · LW(p) · GW(p)

Yeah, I think something in that space makes sense for endorsing nominations.

comment by Raemon · 2019-11-21T20:41:22.701Z · LW(p) · GW(p)

Hmm, upon further reflection – I agree that better UI here would be helpful, but am wary of investing too much time into a UI element that will only be used for a week. (If we do this again next year, I'd definitely want to invest more in UI, but I put sizable probability on 'if we do it next year the whole process may be different')

So, for the immediate future: my suggestion would be to just make a nomination that says "What Alice said, [link]". Especially because you might want your nomination to include "What Alice said [here], and what Bob said [here]", and it's a bit tricky to figure out how to count multiple endorsements.

(This is a bit weird as a commenting experience but I think pretty fine for now – in any case you have my blessing to do a slightly weird commenting thing)

comment by Raemon · 2019-12-03T22:42:12.783Z · LW(p) · GW(p)

Quick update: the Review UI is almost ready but has a few kinks to work out before we merge it into production. Apologies for the delay.

comment by Raemon · 2019-11-23T23:17:31.146Z · LW(p) · GW(p)

Some open questions and meandering thoughts on 'What exactly do we want out of the Review Phase?'

There's a few different goals one might have for Review. I think ideally I'd like all of them, but I'm not sure how much bandwidth people will have.

I see two broad ontologies for "what I want reviewers to do"

Ontology A – What information do we want?

Different posts call for different types of evaluation. 

A post that makes a bunch of empirical claims should have at least some of those claims epistemic-spot-checked [LW · GW]. 

A post that proposes ontologies and categories should (probably?) have people exploring what other ontologies and categories we might have proposed, and consider information-theoretic-guidelines we should be applying to our considerations of categories. (I think Zack's response to Decoupling vs Contextualization was a good example of this)

A post that proposes a technique or skill should probably check "how many people have actually tried this technique? Did it work? How confident should we be?"

A post that's operating in a weird paradigm should have people asking "Does this paradigm make any sense?"

Ontology B – What role(s) are reviewers enacting?

There's a few different hats you might want a reviewer to fill, such as:

  • Gatekeeping — Is this post something that upholds the standards LessWrong is aiming for? Why or why not?
  • Helpful-to-author — how can this post be improved? If you were just acting as exobrain for the author (not attempting to wage any overton window fights), what advice would you give them to help them think, or explain themselves better?
  • Context-exploration — how does this post fit into the broad ecosystem of ideas? How are you most interested in applying this post? What followup work would you like to see in this space? 

By default, these roles are probably going to get muddled together, or different reviewers might focus on them differently. Critch suggested (based on some personal experience) that there were benefits to giving reviewers multiple text boxes that more cleanly separated those roles.

(Meanwhile, I myself have found that when I send feedback questionnaires, I get more information if I ask people a variety of specific questions rather than giving them a single blank textbox, and think a similar principle might apply here)

...

So, I'm interested in people's thoughts on this. What sort of reviews are you interested in giving? What sort of reviews are you hoping other people give? What are you most interested in for the overall process?

comment by Zack_M_Davis · 2019-11-24T02:24:20.137Z · LW(p) · GW(p)

I'm optimistic about the review process incentivizing high-quality intellectual engagement by means of "upping the stakes." Normally, if someone writes a bad post, I'm likely to just downvote or ignore it if I have better things to do with my time that day than argue on the internet.

But if someone writes a bad post and it gets multiple nominations to be included in a paper book allegedly representing the best my stupid robot cult has to offer, then that forces me to write a rebuttal, even though I'd kind of rather not, because I was planning on spending all of my spare energy this month on memoir-writing to help me process trauma and stop being so emotionally attached to this stupid robot cult that's bad for me. If other people feel the same way (higher stakes spur more effort), we could have some fruitful discussions that we otherwise wouldn't.

Thanks so much for organizing this! (Not sarcasm, actual sincere and enthusiastic thanks despite negative-valence words in previous paragraph.)

comment by Raemon · 2019-11-24T03:35:25.221Z · LW(p) · GW(p)

*expression of empathy for energy bottlenecks that force unfortunate tradeoffs.*

One of my hopes for this process is that normally there's a tradeoff of "arguing on the internet often consumes a lot of time and energy that could be better spent on other things", but you do in fact need to argue on the internet (or something similar) in order to have healthy group epistemics.

I'm hoping that a) concentrating overton-window fights into a relatively condensed month, and b) narrowing them down to "concepts that multiple longterm community members actually want to make bids for community attention/endorsement of", can get us a better cost/benefit ratio, going forward.

comment by Zack_M_Davis · 2019-11-24T19:46:23.900Z · LW(p) · GW(p)

overton-window fights

So, sorry in advance if I'm reading way too much into a casual choice of words, but—this is an incredibly ominous metaphor, right? (I'm definitely not blaming you for anything, because I've also used it in just this context, and it took me a while to notice how incredibly ominous it is.)

Maybe my rationality realism [LW · GW] is showing, but I thought the premise and promise of the website is that there are laws of systematically correct reasoning [LW · GW] as objective as mathematics—different mathematicians from different cultures might have different interests (like analysis or algebra or combinatorics) or be accustomed to different notations, but ultimately, they're all on the same cooperative quest for Truth—even if that cooperative process may occasionally involve some amount of yelling and crying.

("And being universals," said the Lady 3rd, "they bear no distinguishing evidence of their origin.") [LW · GW]

The Overton window concept describes a process of social-pressure mind control, not rational deliberation: an idea is said to be "outside the Overton window" not on account of its being wrong, but on account of its being unacceptably unpopular. If a mathematician were to describe a debate with their colleagues about mathematics (as opposed to some dumb non-math thing like tenure or teaching duties) as an "Overton-window fight", I would be pretty worried about the culture of that mathematics department, wouldn't you?!

concepts that multiple longterm community members actually want to make bids for community attention/endorsement of

Again, sorry in advance if I'm reading way too much into a casual choice of words, but—paying attention to what "longterm community members" want is an instance of Goodhart's law [LW · GW], isn't it? (Some would say of the regressional [LW · GW] variety, but I think this case is actually the adversarial type [LW(p) · GW(p)].)

We want concepts that advance the art of human rationality. The hope is that longterm community members are performing the right kind of computation such that "concepts that multiple long-term community members want endorsed" ends up being the same thing as "concepts that advance the art of human rationality", much as one would hope that "what this calculator outputs when you type in 2 + 3" ends up being the same thing as 2 + 3 [LW · GW]. If the calculator outputs something other than 5, then the machine really shouldn't be called a "calculator"—in order to not confuse people, it needs to be either renamed or destroyed. Same thing with a "rationality community."

comment by Ruby · 2019-11-24T23:49:40.036Z · LW(p) · GW(p)

The Overton window concept describes a process of social-pressure mind control, not rational deliberation: an idea is said to be "outside the Overton window" not on account of its being wrong, but on account of its being unacceptably unpopular. If a mathematician were to describe a debate with their colleagues about mathematics (as opposed to some dumb non-math thing like tenure or teaching duties) as an "Overton-window fight", I would be pretty worried about the culture of that mathematics department, wouldn't you?!

I think it's ominous if Raemon used the word with that intended meaning, but I'm guessing he didn't (and most people around here don't?). When I think "Overton window", I just think "what is considered reasonable to discuss without it being regarded as weird or extreme or requiring extreme evidence to overcome a very low prior"  and think of the term being agnostic to how it got decided. In this sense, our community has an Overton window that definitely includes physics and history, presently really excludes Reiki and astrology, and perhaps has meditation/IFS on the border. I think overall the process by which we've ended up with this window has been much better than what most of broader society uses.

My understanding of Ray's comments about "concentrating Overton window fights" was that this was a period when we would, more than usual, communally debate (using the correct and normative laws of reasoning) ideas which were still contentious within the community, increasing consensus on whether they were good or not, based on their epistemic merits.

...

What the best way to use the term "Overton window" is remains a separate question, on which I don't have a strong opinion at present.

comment by Raemon · 2019-11-25T01:32:06.962Z · LW(p) · GW(p)

This is roughly how I intended it. But it's not a coincidence that the word has the history that it does, and it did seem worth reflecting on at least briefly.

comment by Raemon · 2019-11-24T21:29:14.162Z · LW(p) · GW(p)

(note: I think this conversation is important, but part of the point of the review is to have a large number of similarly important conversations. I will probably reply a couple more times. My current guess is that my budget for such conversations this month is going to be better spent on the object-level review process, and/or building code that's "meta-level" to support the object-level process)

My off-the-cuff thought is that I agree with you about the shape of how this is worrisome, but probably disagree about its magnitude.

(But, I notice as I say that that my brain is compressing magnitude into a region it can easily compare. I.e., it seems quite plausible the absolute magnitude of how "worrisome" this should be is 100, but my brain has 12 settings for importance and I've already compressed things down in a way optimized for comparing relevant plans and actions. I.e., if the fire alarm is always ringing, there's not much point in having a fire alarm.)

If the calculator outputs something other than 5, then the machine really shouldn't be called a "calculator"—in order to not confuse people, it needs to be either renamed or destroyed. Same thing with a "rationality community."

I think this depends on how the machine compares to other tools that calculate – if there are obviously better tools, you should probably use those. If those tools are strictly better, then the calculator should be abandoned. But if the calculator is currently the most accurate tool for calculating numbers, it probably makes more sense to continue using it (while looking for better tools). You can re-name it to "aspiring calculator", but in practice long names are clunky and hard to use on a day-to-day basis.

Sometimes you don't actually have a better option than implementing FizzBuzz in TensorFlow, or implementing rationality on mental architecture that's at least partially optimized for politics.

There is a certain sense in which this should have you sitting bolt upright in alarm [LW · GW], but, again, a constant-fire-alarm isn't very useful.

Paying attention to what "longterm community members" want is an instance of Goodhart's law [LW · GW], isn't it? (Some would say of the regressional [LW · GW] variety, but I think this case is actually the adversarial type [LW(p) · GW(p)].)

It's definitely an instance of Goodhart's law (which subtype(s) probably depends on the particular discussion). The question is "do we actually have better ideas that will more rapidly converge on the closest approximation of the most useful truths?"

And in all seriousness: it seems quite likely that there are better versions of the review process out there. The current iteration is one that got roughly 3 weeks of discussion among the LessWrong team and a couple people I reached out to for feedback. I think it's quite likely we can do better.

I think for this particular year it makes most sense to stick at least to the broad-strokes of the plan (switching plans completely every time you have a plausibly-better idea seems a recipe for not completing projects at all). But, that leaves some wiggle room for implementation details. And if you do have suggestions for broad-strokes-of-plans that are better than the status quo, I'm definitely interested in those for next year.

(This all brings me back to the original comment at the top of this thread: what process for the "Review Phase" of the review do you expect to yield the best results, in aggregate? I think it's more useful to try to answer that question, than to figure out exactly how freaked out to be about the architecture of the LessWrong community's metacognition)

comment by Raemon · 2019-11-24T21:43:34.350Z · LW(p) · GW(p)

(actually, the thing I'm worried about here is that I expect this subthread to be much more enticing than figuring out the best answers to "how should the Review (and Voting) Phases be structured", despite the latter being much more actionably useful. And this seems like a concrete instance of "human brains are architected around politics, finding it easier to fight than to build, with 'overton-window-fight' being an unfortunately accurate description of what's going on a lot of the time")

comment by Zack_M_Davis · 2019-11-24T21:59:47.769Z · LW(p) · GW(p)

I agree that focusing on the object-level review process is a much better use of your time than reacting to my perma-panicked cultural commentary. Happy to end this subthread here.

comment by Raemon · 2019-11-24T22:03:29.202Z · LW(p) · GW(p)

FWIW, I would be quite excited for you to devote thought to the "how to do a good review process?" question, if that's something you have in your motivation budget.

comment by Vaniver · 2019-11-24T06:40:34.023Z · LW(p) · GW(p)

I also note that I'm looking afresh at many of my backburner post ideas, since getting them out before the end of December would mean they'd be available for review in 2020 instead of 2021. 

comment by Raemon · 2019-11-24T07:08:40.819Z · LW(p) · GW(p)

Hah. That's surprisingly amusing.

One of the original seeds of the review idea came from a blogpost I once read arguing that the Oscars should be given out multiple years (preferably more like a decade) after a movie comes out, rather than the immediately following year. This would give the awards the benefit of hindsight, and "okay, but which movies do you actually still like a decade later." It'd also remove the weird incentive for "Oscarworthy" movies to come out in Nov/Dec.

I didn't think it made sense to do a full decade for the LW Review, because then you'd either go all the way back to the golden age (where, well, you have the sequences), or if you did a half-decade, you'd have the Dark Times, where there wasn't all that much interesting stuff going on. But, I still thought doing a full year would be enough to get some of the same effect.

But, if around December people are like "oh shit my blogpost that I think is actually going to be really good... I should write that now so I only have to wait one year!", that's a (somewhat) amusing way to accidentally introduce "Oscar Season Bias" again.

comment by Vaniver · 2019-11-24T08:16:23.800Z · LW(p) · GW(p)

Note that the gap of a year cuts out a lot of recency bias, and I think availability favors posts in January (since some people will think they're going to go through all of 2018 in chronological order, and then maybe run out of steam at some point). So if all you cared about was winning, I think you'd actually want it to come out in January instead.

comment by Raemon · 2019-11-24T08:37:16.447Z · LW(p) · GW(p)

Yeah, I think getting rid of the recency bias is still good for exactly the reasons I'd intended, just amusing if there still turned out to be an "Oscar Season" anyway.

comment by Vaniver · 2019-11-22T19:35:47.024Z · LW(p) · GW(p)

I'm curious about negative or neutral endorsements. That is, I'm going through and looking at posts and thinking "should this be in the review? Why or why not?", and sometimes the answer comes back "no" for somewhat interesting reasons.

The example that prompted this question is Write a Thousand Roads to Rome [LW · GW]. It's a clear statement of an important pedagogical point, but it's an exhortation to action that I don't think moved the community all that much (from my vantage point). If I want people to read it now, it's more because "hey, here's some advice we still haven't fully internalized / processed the resistance to" instead of "this became part of LW / me, and was useful." Maybe put another way, the thing I want to do more is write a "why isn't this the equilibrium?" post in reply instead of just repeating the push in that direction.

But while I feel good about leaving comments saying "this was a really great post for reasons A, B, and C" on posts from 2018, I don't feel good about saying "this is a decent post that doesn't meet my bar" on posts from 2018. (Which is different from arguing against someone else's endorsement for review later, when the author has opted in to the process of receiving feedback.) But also I'm not sure what else to do with the thoughts; keep them in a text doc in case someone else nominates it? Discard them and regenerate them later?

comment by DanielFilan · 2019-11-21T19:22:20.018Z · LW(p) · GW(p)

Personal/meta/process note:

I've particularly liked looking for posts to nominate, because it's revealed to me ideas that I now think should inform my thinking, but did not at the time. As such, it's somewhat sad that these posts are (as I understand it) not the sort of things I should nominate for the "Best of 2018", and I wish I had another way to signal-boost them, perhaps by nominating them for "Under-rated of 2018". (I guess I could just comment on them, but that doesn't seem like the sort of thing comments are for).

comment by Raemon · 2019-11-21T20:37:09.402Z · LW(p) · GW(p)

I'm uncertain about the best process here (this entire review is a bit of an experiment and I think it's fine to tweak rules on the fly). I do think there's a particular value for checking which things have actually been employed in some fashion, as opposed to just "seeming good."

I think it's probably fine to go ahead and nominate them, and in the nomination, note specifically if you haven't directly made use of them.

One possible outcome is that this process reminds other people who have used it, and those people then write up their actual experiences. Another possible outcome is that we decide it's fine to include things that feel time-tested-if-not-actually-used the way Vaniver described. Another option is simply that you bump it into people's public consciousness, and then it isn't included this year, but next year people have the opportunity to suggest older posts that had previously fallen through the cracks.

(If we do this again next year, my current guess is that it'd involve not just "Best of 2019" but sort of an ongoing appraisal of the LW-o-sphere's intellectual landscape, where "Best of 2019" is the primary new focus but at least some thought is dedicated to older stuff)

That all said...

(I guess I could just comment on them, but that doesn't seem like the sort of thing comments are for).

Why ever not? That seems like a totally valid use of a comment.

 

comment by DanielFilan · 2019-11-21T20:51:57.444Z · LW(p) · GW(p)

I do think there's a particular value for checking which things have actually been employed in some fashion, as opposed to just "seeming good."

I certainly agree, and think that it makes a lot of sense to reward posts that have been valuable to their readers, as well as spreading them so that they can provide that same value to those who haven't yet read them.

I think it's probably fine to go ahead and nominate them, and in the nomination, note specifically if you haven't directly made use of them.

Understood.

Why ever not? That seems like a totally valid use of a comment.

I think that comments should be used for advancing discussions and/or providing info that can't be provided other ways. To me, a comment saying "this is a good post that you should read" communicates an upvote plus the identity of the upvoter, and therefore seems primarily a social move.

comment by Raemon · 2019-11-21T20:58:18.665Z · LW(p) · GW(p)

To me, a comment saying "this is a good post that you should read" communicates an upvote plus the identity of the upvoter, and therefore seems primarily a social move.

That sounds about right, but I think there's a few aspects that make that social move valuable:

a) on regular posts in regular circumstances, since comments are a bit higher effort than votes, and comments are at least somewhat more rewarding than votes (at least for me, as an author), I think it's good for at least a couple people to respond "this was great!". Writing a flawless excellent post and then receiving upvotes-but-crickets-chirping is a sort of sad experience. I think if 2-3 people have already written such a comment it gets a bit repetitive but I think it's a fine norm.

b) there's a practical element for replying to a post which is that it bumps the post to the top of recent discussion and gives it a bit more life. I think this is bad-in-excess, but fine in moderation – if a post is still good 2 years later, it's good to give it periodic spikes of attention.

c) In particular, a comment two years after-the-fact that says "I just found this after two years and it still seems good" is conveying additional information beyond "I liked it" – it's saying something about how time-tested the content is.

comment by DanielFilan · 2019-11-21T21:01:22.710Z · LW(p) · GW(p)

comments are at least somewhat more rewarding than votes (at least for me, as an author)

I'd be interested in a poll on this, since I don't have this experience for comments that don't build on the content of the post.

comment by Raemon · 2019-11-21T21:35:05.062Z · LW(p) · GW(p)

Nod. This makes sense as a thing people might vary quite a bit on. (To be clear, I certainly get dramatically more value out of comments that actually engage). It'd be pretty reasonable for you to throw up a question-post about it or something.

comment by DanielFilan · 2019-11-21T21:57:59.596Z · LW(p) · GW(p)

Have thrown up the question post [LW · GW].

comment by Vaniver · 2019-11-21T19:59:12.840Z · LW(p) · GW(p)

I am most excited about this as a sort of "things that stood the test of time," whether by being sleeper hits or by being good then and good now.

comment by Kaj_Sotala · 2019-11-21T19:52:34.497Z · LW(p) · GW(p)

Curious to hear examples of this.

comment by DanielFilan · 2019-11-21T20:22:34.313Z · LW(p) · GW(p)

  • This post [LW · GW] on what academia is and isn't good at describes a true and important thing well (if somewhat verbosely), but didn't influence me partly because I already believed it and partly because I didn't pay it much attention or thought at the time. There are quite a few examples of these.
  • This post [LW · GW] on the complete class theorems described in a clear way some foundational arguments about the wisdom of using probability theory and decision theory, and how they could be extended. It hasn't made its way into my thought about other things, and I don't think about it that much, but I'm glad I have the concept, and the post is a good reference for it.

comment by Alicorn · 2019-11-21T06:49:33.484Z · LW(p) · GW(p)

The link to the 2018 posts sorted by karma is not working correctly for me; it redirects me to /allPosts for some reason.

comment by Raemon · 2019-11-21T21:40:15.168Z · LW(p) · GW(p)

I've updated some formatting in the links, see if it works now.

comment by Raemon · 2019-11-21T21:32:17.022Z · LW(p) · GW(p)

Still not sure what's causing the problem, but here are the direct links to the pages in question. (People who are having trouble – if you enter these directly, does it work?)

https://www.lesswrong.com/allPosts?after=2018-01-01&before=2019-01-01&limit=20&timeframe=monthly&includeShortform=false&reverse=true [? · GW]

https://www.lesswrong.com/allPosts?after=2018-01-01&before=2019-01-01&limit=100&timeframe=allTime [? · GW]

comment by Alicorn · 2019-11-24T03:32:14.325Z · LW(p) · GW(p)

Those both work.

comment by habryka (habryka4) · 2019-11-21T06:52:49.313Z · LW(p) · GW(p)

That's correct. It uses a set of URL parameters (all the weird stuff after the "/allPosts") to restrict the posts to the year 2018. We maybe should make the UI for that a bit clearer. 
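As a rough illustration of what those URL parameters do, here is a minimal Python sketch that rebuilds the second direct link posted elsewhere in this thread and reads its parameters back. The parameter names (`after`, `before`, `limit`, `timeframe`) are taken from that link; the helper names `build_all_posts_url` and `query_params` are made up for this example, and nothing here is the site's actual code.

```python
from urllib.parse import urlencode, urlparse, parse_qs

BASE_URL = "https://www.lesswrong.com/allPosts"


def build_all_posts_url(year, limit=100, timeframe="allTime"):
    """Hypothetical helper: build an /allPosts link restricted to one calendar year."""
    params = {
        "after": f"{year}-01-01",       # first day of the year
        "before": f"{year + 1}-01-01",  # first day of the following year
        "limit": limit,
        "timeframe": timeframe,
    }
    return f"{BASE_URL}?{urlencode(params)}"


def query_params(url):
    """Hypothetical helper: return the query parameters of a URL as a dict of lists."""
    return parse_qs(urlparse(url).query)


url_2018 = build_all_posts_url(2018)
print(url_2018)
print(query_params(url_2018))
print(query_params(BASE_URL))  # no parameters, so nothing restricts the listing
```

A link that has had everything after "/allPosts" stripped carries no date bounds at all, which would fit with seeing every post rather than only those from 2018.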

comment by Alicorn · 2019-11-21T11:48:54.933Z · LW(p) · GW(p)

No, I mean, it redirects me to https://www.lesswrong.com/allPosts [? · GW] with the weird stuff stripped out, and shows me all posts, not sorted by karma and including the one that was posted eight hours ago and so on.

comment by mingyuan · 2019-11-21T20:38:28.762Z · LW(p) · GW(p)

This is also happening to me

comment by Raemon · 2019-11-21T20:46:44.341Z · LW(p) · GW(p)

Hmm. Super weird. Can you guys all share browser information? (either here or in a PM/intercom would be fine)

Meanwhile, if you right-click on the link and choose "copy link" and paste it into your browser, does that work?

comment by Alicorn · 2019-11-24T03:33:17.819Z · LW(p) · GW(p)

Chrome, MacBook.

comment by Kaj_Sotala · 2019-11-22T06:21:41.691Z · LW(p) · GW(p)

Google Chrome for Android, 78.0.3904.108.

comment by habryka (habryka4) · 2019-11-22T07:30:37.959Z · LW(p) · GW(p)

(This should by the way be fixed now, let us know if you still experience this problem)

comment by lifelonglearner · 2019-11-21T21:23:43.215Z · LW(p) · GW(p)

I had this happen to me as well. Firefox 70 on Ubuntu 18.04

comment by mingyuan · 2019-11-21T21:14:26.117Z · LW(p) · GW(p)

It does not - it still strips it all out and redirects me. Chrome on a MacBook Pro.

comment by Kaj_Sotala · 2019-11-21T12:05:26.601Z · LW(p) · GW(p)

I got that when I first followed the links from this page, then re-opened them and then it took me to the right version. No idea what made the difference.

comment by Tenoke · 2019-11-27T10:23:23.334Z · LW(p) · GW(p)

I got an email about this, so I decided to check if the quality of content here has really increased enough to claim to have material for a new Sequence (I stopped coming here after the, in my opinion, botched execution of lw2).

I checked the posts, and I don't see anywhere near enough quality content to publish something called a Sequence, without cheapening the previous essays and what 'The Sequences' means in a LessWrong context.

comment by Raemon · 2019-11-27T20:58:37.634Z · LW(p) · GW(p)

(first, noting that if the site content isn't exciting, no worries. Thanks for at least checking it out and giving it another look – I appreciate it)

I'd add to Habryka's comment that my longterm plan here is something like:

  • This year, we review the best posts of 2018. This turns into a fairly simple sequence that clusters relevant posts around each other, and helps people get a sense of the overall major conversation threads that happened in 2018. This sequence is meant to be "highly curated", but not meant to be thought of in the same terms as "The Sequences™". Sequence is just a generic term meaning "a collection of posts."
  • In the coming years, there's an additional step where some older posts are considered for something more like canonization, where they are actually added to a Major Updates sequence that's more in the genre of "The Sequences™", i.e. that everyone participating on the site is supposed to have read. This process is something I'd want to put a lot of care into, and my expectation is something like there'd typically be 1-5 posts in any given year that I wanted to add to the site's common-knowledge-pool, and that I'd want multiple years to reflect on it.

comment by Vaniver · 2019-11-27T20:04:26.650Z · LW(p) · GW(p)

Not even Local Validity [LW · GW]?

Note also that you can view this on GreaterWrong, with 2018 posts [? · GW] and nominated posts [? · GW].

comment by Tenoke · 2019-11-28T14:43:46.760Z · LW(p) · GW(p)

There are definitely some decent posts, but calling a couple of good posts an official LessWrong Sequence still seems to cheapen what that used to mean.

Not to mention that I read this on facebook, so I barely associate it with here.

Note also that you can view this on GreaterWrong.

Thanks, GreaterWrong seems to still be an improvement over the redesign for me. I'm back to using it.

comment by habryka (habryka4) · 2019-11-27T17:53:08.072Z · LW(p) · GW(p)

Huh, there must be some confusion going on. The goal is not to add another sequence to Rationality: A-Z, the goal is just to compile a sequence of the type of which we already have many (like Luke's sequence on the neuroscience of happiness, or Kaj's multiagent sequence, or Anna's game theory sequence, etc.). 

comment by Vaniver · 2019-11-27T20:02:53.396Z · LW(p) · GW(p)

The goal is not to add another sequence to Rationality: A-Z, the goal is just to compile a sequence of the type of which we already have many

I note that in my mind R:AZ is a different thing from The Sequences; it's abridged, and in a different order, and there's a big difference between "posts arranged in an order" and "Eliezer unrolling and serializing the dependency tree for a concept."

comment by habryka (habryka4) · 2019-11-27T20:13:01.610Z · LW(p) · GW(p)

Yeah, I agree with that, but it seemed like the best way to disambiguate in the above context. Though note that "The Sequences" itself refers to at least three different orders and collections of posts, because the order of the posts was being actively edited on the wiki. So I don't think even that has a single coherent referent. 

comment by DanielFilan · 2019-11-21T04:43:28.605Z · LW(p) · GW(p)

Could the LW team clarify how long and in-depth the nomination text should be?

comment by Raemon · 2019-11-21T04:51:27.506Z · LW(p) · GW(p)

The perhaps somewhat-annoying answer is "at least some text is better than none, but longer and more in-depth is better." Part of the hope here is to build out a pretty comprehensive picture of how a concept went on to get used.

The nomination I just wrote for Decoupled vs Contextualization [LW(p) · GW(p)] is probably about what I'm expecting/hoping for as a median length. (Having more details about specific conversations where the concept was useful would make it a more useful nomination, though. If you have pages worth of thoughts, go for it)

(But again, just writing a sentence or two is preferable to not-that, if you're busy, and you can edit it or reply to it later if you have time)

comment by Raemon · 2019-11-27T21:04:23.183Z · LW(p) · GW(p)

I alluded to this in this comment [LW(p) · GW(p)], but wanted to put it a bit more clearly:

I think it makes sense to think of The 2018 Review as like "an academic journal", where you submit ideas, and if the ideas seem valuable they get included into a curated work – but not a work that everyone is expected to have read.

By contrast, Rationality A-Z is more like "a textbook", which is foundational to the field. My current best guess is that it'll make sense for next year's review process to include considering which things make sense to add to a sequence that's similar in scope to R:AZ and the Codex, in terms of "major works that most people are expected to be familiar with."

I didn't want to tackle that this year a) because I expected there to be a lot of kinks in the system to work out, and I didn't want to try 'adding things to canon' without having had a chance to do a "medium-stakes" project. b) I think one year just isn't enough time to see if something feels really valuable enough to make into obligate reading material.

comment by DanielFilan · 2019-12-28T17:26:55.268Z · LW(p) · GW(p)

For what it's worth, I bid for the review prizes to be based off of people voting for which reviews were useful. The alternatives, and why I think they're worse:

  • Karma mixes "I found this review useful" and "I already agree with this review but am glad somebody said it", which can reward things which everybody already knew (but I guess both components are important).
  • Moderators' picks have the problem that moderators suffer from the curse of knowledge, and may not be in touch with what's useful for the average voter.

comment by Raemon · 2019-12-28T18:46:26.146Z · LW(p) · GW(p)

I'm a bit worried that if you let users vote on reviews, you'll mostly get something identical to karma.

comment by DanielFilan · 2019-12-28T19:24:50.031Z · LW(p) · GW(p)

Hopefully it's different if you explicitly say "vote for helpful reviews, not just reviews that you agree with", or if you have one button for "I agree with this review" and a different button for "This review was helpful for my assessment of the post" (and it's possible to select both buttons).

comment by Raemon · 2019-12-28T18:17:35.244Z · LW(p) · GW(p)

Mmm, interesting point.

comment by orthonormal · 2019-11-22T20:53:18.497Z · LW(p) · GW(p)

What do you think about doing this for 2017 and years previous?

comment by Raemon · 2019-11-22T20:58:16.063Z · LW(p) · GW(p)

That was actually the original plan, but we decided that this process was complicated enough (at least as a first attempt), with the (relatively) narrow target of 2018.

My guess is that in future years, once this process has gelled into something everyone understands and the kinks are ironed out, it will include some kind of mechanism for including older works.

comment by lifelonglearner · 2019-11-21T21:24:11.083Z · LW(p) · GW(p)

I think I have enough karma, but I can't figure out where the nomination button is. Could someone share a screenshot?

comment by Raemon · 2019-11-21T21:30:07.551Z · LW(p) · GW(p)
comment by Raemon · 2019-11-21T21:33:40.719Z · LW(p) · GW(p)

Also available:
 

comment by lifelonglearner · 2019-11-22T01:33:18.280Z · LW(p) · GW(p)

Hmmm, am I doing something wrong?

My karma: [image: my karma]

What I see when I click the three dots on a page: [image: no button]

comment by Raemon · 2019-11-22T01:34:46.693Z · LW(p) · GW(p)

That post isn't from 2018

comment by lifelonglearner · 2019-11-22T04:30:32.495Z · LW(p) · GW(p)

Ack! My error! I see now.

comment by Decius · 2019-11-21T04:54:46.506Z · LW(p) · GW(p)

Is the intent in the review phase to display the number of nominations received (which will impact which posts get reviewed) or not (which fails to display information that I am likely to find useful in using the list of posts that have been nominated by enough people to form a reading list)?

comment by Raemon · 2019-11-21T05:01:04.419Z · LW(p) · GW(p)

Number-of-nominations will probably be added as a UI element within the next day or so, and the fact that it's not there right now is mostly because of time-budgeting problems.

comment by Taryn East (taryn-east) · 2019-12-11T20:38:56.142Z · LW(p) · GW(p)

It would be nice if you could link your "Best of 2018" sequence text to the actual results of this process...

I am assuming that this outcome was actually reached?

Searching in the searchbox for "best of" gets me best quotes and this article, but not the "best of 2018" sequence.

comment by Raemon · 2019-12-10T23:36:51.349Z · LW(p) · GW(p)

The process isn't finished yet – it'll hopefully complete sometime in January of next year.

comment by Ben Pace (Benito) · 2020-01-11T01:15:51.831Z · LW(p) · GW(p)

Hey, actually no, we're currently reviewing 2018's posts. We've waited a year in order to give everyone the power of hindsight to figure out what was actually good.

Btw, I think you might be the same user as user taryneast. That account is eligible to vote. I suggest logging in with that account, or contacting us via intercom (bottom right corner of the screen) if you'd like to reset your password.