Voting Results for the 2022 Review

post by Ben Pace (Benito) · 2024-02-02T20:34:59.768Z · LW · GW · 3 comments

The 5th Annual LessWrong Review has come to a close!

Review Facts

There were 5330 posts published in 2022.

Here's how many posts passed through the different review phases.

Phase	No. of posts	Eligibility
Nominations Phase	579	Any 2022 post could be given preliminary votes
Review Phase	363	Posts with 2+ votes could be reviewed
Voting Phase	168	Posts with 1+ reviews could be voted on

Here's how many voters and votes there were by karma bracket.

Karma Bucket	No. of Voters	No. of Votes Cast
Any	333	5007
1+	307	4944
10+	298	4902
100+	245	4538
1,000+	121	2801
10,000+	24	816

To give some context on this annual tradition, here are the absolute numbers compared to last year and to the first year of the LessWrong Review.

	2018	2021	2022
Voters	59	238	333
Nominations	75	452	579
Reviews	120	209	227
Votes	1272	2870	5007
Total LW Posts	1703	4506	5330

Review Prizes

There were lots of great reviews this year! Here's a link to all of them [? · GW]

Of the 227 reviews, we're giving prizes to 31.

This follows up on Habryka's earlier announcement, which gave out about half of these prizes two months ago. [LW(p) · GW(p)]

Note that two users were paid to produce reviews and so will not be receiving the prize money. They're still here because I wanted to indicate that they wrote some really great reviews.

The prizewinners are listed below by prize tier.

Excellent ($200) (7 reviews)

ambigram [LW · GW] for their review [LW(p) · GW(p)] of Meadow Theory
Buck [LW · GW] for his self-review [LW(p) · GW(p)] of Causal Scrubbing: a method for rigorously testing interpretability hypotheses
DirectedEvolution [LW · GW] for their paid review [LW(p) · GW(p)] of How satisfied should you expect to be with your partner?
LawrenceC [LW · GW] for their paid review [LW(p) · GW(p)] of Some Lessons Learned from Studying Indirect Object Identification in GPT-2 small
LawrenceC [LW · GW] for their paid review [LW(p) · GW(p)] of How "Discovering Latent Knowledge in Language Models Without Supervision" Fits Into a Broader Alignment Scheme
LoganStrohl [LW · GW] for their self-review [LW(p) · GW(p)] of the Intro to Naturalism sequence
porby [LW · GW] for their self-review [LW(p) · GW(p)] of Why I think strong general AI is coming soon

Great ($100) (6 reviews)

DirectedEvolution [LW · GW] for their paid review [LW(p) · GW(p)] of Slack matters more than any outcome
janus [LW · GW] for their self-review [LW(p) · GW(p)] of Simulators
Lee Sharkey [LW · GW] for their self-review [LW(p) · GW(p)] of Taking features out of superposition with sparse autoencoders
Neel Nanda [LW · GW] for their review [LW(p) · GW(p)] of "Some Lessons Learned from Studying Indirect Object Identification in GPT-2 small"
nostalgebraist [LW · GW] for their review [LW(p) · GW(p)] of Simulators
Writer [LW · GW] for their review [LW(p) · GW(p)] of "Inner and outer alignment decompose one hard problem into two extremely hard problems"

Good ($50) (18 reviews)

Alex_Altair [LW · GW] for their review [LW(p) · GW(p)] of the Intro to Naturalism sequence
Buck [LW · GW] for their review [LW(p) · GW(p)] of K-complexity is silly; use cross-entropy instead
Davidmanheim [LW · GW] for their review [LW(p) · GW(p)] of It’s Probably Not Lithium
[DEACTIVATED] Duncan Sabien [LW · GW] for their review [LW(p) · GW(p)] of Here's the exit.
[DEACTIVATED] Duncan Sabien [LW · GW] for their self-review [LW(p) · GW(p)] of Benign Boundary Violations
eukaryote [LW · GW] for their self-review [LW(p) · GW(p)] of Fiber arts, mysterious dodecahedrons, and waiting on “Eureka!”
Jan_Kulveit [LW · GW] for their review [LW(p) · GW(p)] of Human values & biases are inaccessible to the genome
Jan_Kulveit [LW · GW] for their review [LW(p) · GW(p)] of The shard theory of human values
johnswentworth [LW · GW] for their review [LW(p) · GW(p)] of Revisiting algorithmic progress
L Rudolf L [LW · GW] for their self-review [LW(p) · GW(p)] of Review: Amusing Ourselves to Death
Nathan Young [LW · GW] for their review [LW(p) · GW(p)] of Introducing Pastcasting: A tool for forecasting practice
Neel Nanda [LW · GW] for their self-review [LW(p) · GW(p)] of A Longlist of Theories of Impact for Interpretability
Screwtape [LW · GW] for their review [LW(p) · GW(p)] of How To: A Workshop (or anything)
Screwtape [LW · GW] for their review [LW(p) · GW(p)] of Sazen
TurnTrout [LW · GW] for their review [LW(p) · GW(p)] of Simulators
Vanessa Kosoy [LW · GW] for their post-length review [LW(p) · GW(p)] of Where I agree and disagree with Eliezer
Vika [LW · GW] for their self-review [LW(p) · GW(p)] of DeepMind alignment team opinions on AGI ruin arguments
Vika [LW · GW] for their self-review [LW(p) · GW(p)] of Refining the Sharp Left Turn threat model, part 1: claims and mechanisms

We'll reach out to prizewinners in the coming weeks to deliver their prizes.

We have been working on a new way of celebrating the best posts of the year

The top 50 posts of each year are being celebrated in a new way! Read this companion post [LW · GW] to find out all the details, but for now here's a preview of the sorts of changes we've made for the top-voted posts of the annual review.

And there's a new LeastWrong page [? · GW] with the top 50 posts from all 5 annual reviews so far, sorted into categories.

You can learn more about what we've built in the companion post [LW · GW].

Okay, now onto the voting results!

Voting Results

Voting is visualized here with dots of varying sizes, roughly indicating that a user thought a post was "good" (+1), "important" (+4), or "extremely important" (+9).

Green dots indicate positive votes. Red dots indicate negative votes.

If a user spent more than their budget of 500 points, all of their votes were scaled down slightly, so some of the circles are slightly smaller than others.
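To make the scaling step concrete, here is a minimal sketch in Python. It is not the actual LessWrong implementation. It assumes that a vote's cost is its absolute point value and that an over-budget ballot is shrunk proportionally until it fits; the post states neither detail, only the 500-point budget. The function name and the ballot format are hypothetical.

```python
BUDGET = 500  # per-voter point budget, as stated in the post


def scale_votes(votes, budget=BUDGET):
    """Return a copy of the ballot, shrunk to fit the budget if needed.

    Assumptions (not confirmed by the post): a vote costs its absolute
    point value, and over-budget ballots are scaled proportionally.
    """
    total_cost = sum(abs(v) for v in votes.values())
    if total_cost <= budget:
        # Within budget: votes are left untouched.
        return dict(votes)
    factor = budget / total_cost
    return {post: v * factor for post, v in votes.items()}


# Hypothetical ballot: 60 "extremely important" votes cost 60 * 9 = 540 points.
ballot = {f"post_{i}": 9 for i in range(60)}
scaled = scale_votes(ballot)
```

Under these assumptions, each +9 vote on this ballot shrinks to about 8.33, which would be why some circles are drawn slightly smaller than others of the same kind.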

These are the 161 posts that got a net positive score, out of 168 posts that were eligible for the vote.

0	AGI Ruin: A List of Lethalities Eliezer Yudkowsky
1	MIRI announces new "Death With Dignity" strategy Eliezer Yudkowsky
2	Where I agree and disagree with Eliezer paulfchristiano
3	Let’s think about slowing down AI KatjaGrace
4	Reward is not the optimization target TurnTrout
5	Six Dimensions of Operational Adequacy in AGI Projects Eliezer Yudkowsky
6	It Looks Like You're Trying To Take Over The World gwern
7	Staring into the abyss as a core life skill benkuhn
8	You Are Not Measuring What You Think You Are Measuring johnswentworth
9	Without specific countermeasures, the easiest path to transformative AI likely leads to AI takeover Ajeya Cotra
10	Sazen [DEACTIVATED] Duncan Sabien
11	Luck based medicine: my resentful story of becoming a medical miracle Elizabeth
12	Inner and outer alignment decompose one hard problem into two extremely hard problems TurnTrout
13	On how various plans miss the hard bits of the alignment challenge So8res
14	Simulators janus
15	Epistemic Legibility Elizabeth
16	Tyranny of the Epistemic Majority Scott Garrabrant
17	Counterarguments to the basic AI x-risk case KatjaGrace
18	What Are You Tracking In Your Head? johnswentworth
19	Safetywashing Adam Scholl
20	Threat-Resistant Bargaining Megapost: Introducing the ROSE Value Diffractor
21	Nonprofit Boards are Weird HoldenKarnofsky
22	Optimality is the tiger, and agents are its teeth Veedrac
23	chinchilla's wild implications nostalgebraist
24	Losing the root for the tree Adam Zerner
25	Worlds Where Iterative Design Fails johnswentworth
26	Decision theory does not imply that we get to have nice things So8res
27	Comment reply: my low-quality thoughts on why CFAR didn't get farther with a "real/efficacious art of rationality" AnnaSalamon
28	What an actually pessimistic containment strategy looks like lc
29	Introduction to abstract entropy Alex_Altair
30	A Mechanistic Interpretability Analysis of Grokking Neel Nanda
31	The Redaction Machine Ben
32	Butterfly Ideas Elizabeth
33	Causal Scrubbing: a method for rigorously testing interpretability hypotheses [Redwood Research] LawrenceC
34	Language models seem to be much better than humans at next-token prediction Buck
35	Toni Kurz and the Insanity of Climbing Mountains GeneSmith
36	Useful Vices for Wicked Problems HoldenKarnofsky
37	What should you change in response to an "emergency"? And AI risk AnnaSalamon
38	Models Don't "Get Reward" Sam Ringer
39	How To Go From Interpretability To Alignment: Just Retarget The Search johnswentworth
40	Security Mindset: Lessons from 20+ years of Software Security Failures Relevant to AGI Alignment elspood
41	Why Agent Foundations? An Overly Abstract Explanation johnswentworth
42	A central AI alignment problem: capabilities generalization, and the sharp left turn So8res
43	Humans provide an untapped wealth of evidence about alignment TurnTrout
44	Learning By Writing HoldenKarnofsky
45	Limerence Messes Up Your Rationality Real Bad, Yo Raemon
46	The Onion Test for Personal and Institutional Honesty chanamessinger
47	Counter-theses on Sleep Natália Coelho Mendonça
48	The shard theory of human values Quintin Pope
49	How "Discovering Latent Knowledge in Language Models Without Supervision" Fits Into a Broader Alignment Scheme Collin
50	ProjectLawful.com: Eliezer's latest story, past 1M words Eliezer Yudkowsky
51	Intro to Naturalism: Orientation LoganStrohl
52	Why I think strong general AI is coming soon porby
53	How might we align transformative AI if it’s developed very soon? HoldenKarnofsky
54	It’s Probably Not Lithium Natália Coelho Mendonça
55	(My understanding of) What Everyone in Technical Alignment is Doing and Why Thomas Larsen
56	Plans Are Predictions, Not Optimization Targets johnswentworth
57	Takeoff speeds have a huge effect on what it means to work on AI x-risk Buck
58	The Feeling of Idea Scarcity johnswentworth
59	Six (and a half) intuitions for KL divergence CallumMcDougall
60	Trigger-Action Planning CFAR!Duncan
61	Have You Tried Hiring People? rank-biserial
62	The Wicked Problem Experience HoldenKarnofsky
63	What does it take to defend the world against out-of-control AGIs? Steven Byrnes
64	On Bounded Distrust Zvi
65	Setting the Zero Point [DEACTIVATED] Duncan Sabien
66	[Interim research report] Taking features out of superposition with sparse autoencoders Lee Sharkey
67	Limits to Legibility Jan_Kulveit
68	Harms and possibilities of schooling TsviBT
69	Look For Principles Which Will Carry Over To The Next Paradigm johnswentworth
70	Steam abramdemski
71	High Reliability Orgs, and AI Companies Raemon
72	Toy Models of Superposition evhub
73	Editing Advice for LessWrong Users JustisMills
74	Deep Learning Systems Are Not Less Interpretable Than Logic/Probability/Etc johnswentworth
75	why assume AGIs will optimize for fixed goals? nostalgebraist
76	Lies Told To Children Eliezer Yudkowsky
77	Revisiting algorithmic progress Tamay
78	Things that can kill you quickly: What everyone should know about first aid jasoncrawford
79	Postmortem on DIY Recombinant Covid Vaccine caffemacchiavelli
80	Reflections on six months of fatherhood jasoncrawford
81	Some Lessons Learned from Studying Indirect Object Identification in GPT-2 small KevinRoWang
82	The Plan - 2022 Update johnswentworth
83	12 interesting things I learned studying the discovery of nature's laws Ben Pace
84	Impossibility results for unbounded utilities paulfchristiano
85	Searching for outliers benkuhn
86	Greyed Out Options ozymandias
87	“Pivotal Act” Intentions: Negative Consequences and Fallacious Arguments Andrew_Critch
88	Do bamboos set themselves on fire? Malmesbury
89	Murphyjitsu: an Inner Simulator algorithm CFAR!Duncan
90	Deliberate Grieving Raemon
91	We Choose To Align AI johnswentworth
92	The alignment problem from a deep learning perspective Richard_Ngo
93	Slack matters more than any outcome Valentine
94	Consider your appetite for disagreements Adam Zerner
95	everything is okay Tamsin Leake
96	Mysteries of mode collapse janus
97	Slow motion videos as AI risk intuition pumps Andrew_Critch
98	ITT-passing and civility are good; "charity" is bad; steelmanning is niche Rob Bensinger
99	Meadow Theory [DEACTIVATED] Duncan Sabien
100	The next decades might be wild Marius Hobbhahn
101	Marriage, the Giving What We Can Pledge, and the damage caused by vague public commitments Jeffrey Ladish
102	Lessons learned from talking to >100 academics about AI safety Marius Hobbhahn
103	Activated Charcoal for Hangover Prevention: Way more than you wanted to know Maxwell Peterson
104	More Is Different for AI jsteinhardt
105	How satisfied should you expect to be with your partner? Vaniver
106	How my team at Lightcone sometimes gets stuff done jacobjacob
107	The metaphor you want is "color blindness," not "blind spot." [DEACTIVATED] Duncan Sabien
108	Logical induction for software engineers Alex Flint
109	Call For Distillers johnswentworth
110	Fiber arts, mysterious dodecahedrons, and waiting on “Eureka!” eukaryote
111	A Longlist of Theories of Impact for Interpretability Neel Nanda
112	On A List of Lethalities Zvi
113	LOVE in a simbox is all you need jacob_cannell
114	A transparency and interpretability tech tree evhub
115	DeepMind alignment team opinions on AGI ruin arguments Vika
116	Contra shard theory, in the context of the diamond maximizer problem So8res
117	On infinite ethics Joe Carlsmith
118	Wisdom Cannot Be Unzipped Sable
119	Different perspectives on concept extrapolation Stuart_Armstrong
120	Utilitarianism Meets Egalitarianism Scott Garrabrant
121	The ignorance of normative realism bot Joe Carlsmith
122	Shah and Yudkowsky on alignment failures Rohin Shah
123	Nuclear Energy - Good but not the silver bullet we were hoping for Marius Hobbhahn
124	Patient Observation LoganStrohl
125	Monks of Magnitude [DEACTIVATED] Duncan Sabien
126	AI coordination needs clear wins evhub
127	Actually, All Nuclear Famine Papers are Bunk Lao Mein
128	New Frontiers in Mojibake Adam Scherlis
129	My take on Jacob Cannell’s take on AGI safety Steven Byrnes
130	Introducing Pastcasting: A tool for forecasting practice Sage Future
131	K-complexity is silly; use cross-entropy instead So8res
132	Beware boasting about non-existent forecasting track records Jotto999
133	Clarifying AI X-risk zac_kenton
134	Narrative Syncing AnnaSalamon
135	publishing alignment research and exfohazards Tamsin Leake
136	Deontology and virtue ethics as "effective theories" of consequentialist ethics Jan_Kulveit
137	Range and Forecasting Accuracy niplav
138	Trends in GPU price-performance Marius Hobbhahn
139	How To Observe Abstract Objects LoganStrohl
140	Criticism of EA Criticism Contest Zvi
141	Takeaways from our robust injury classifier project [Redwood Research] dmz
142	Bad at Arithmetic, Promising at Math cohenmacaulay
143	Don't use 'infohazard' for collectively destructive info Eliezer Yudkowsky
144	Conditions for mathematical equivalence of Stochastic Gradient Descent and Natural Selection Oliver Sourbut
145	Human values & biases are inaccessible to the genome TurnTrout
146	I learn better when I frame learning as Vengeance for losses incurred through ignorance, and you might too chaosmage
147	Jailbreaking ChatGPT on Release Day Zvi
148	Open technical problem: A Quinean proof of Löb's theorem, for an easier cartoon guide Andrew_Critch
149	Review: Amusing Ourselves to Death L Rudolf L
150	QNR prospects are important for AI alignment research Eric Drexler
151	Disagreement with bio anchors that lead to shorter timelines Marius Hobbhahn
152	Why all the fuss about recursive self-improvement? So8res
153	LessWrong Has Agree/Disagree Voting On All New Comment Threads Ben Pace
154	Opening Session Tips & Advice CFAR!Duncan
155	Searching for Search NicholasKees
156	Refining the Sharp Left Turn threat model, part 1: claims and mechanisms Vika
157	Takeaways from a survey on AI alignment resources DanielFilan
158	Trying to disambiguate different questions about whether RLHF is “good” Buck
159	Benign Boundary Violations [DEACTIVATED] Duncan Sabien
160	How To: A Workshop (or anything) [DEACTIVATED] Duncan Sabien

3 comments

comment by Alex_Altair · 2024-02-28T17:48:10.468Z · LW(p) · GW(p)

Just noticing that every post has at least one negative vote, which feels interesting for some reason.

↑ comment by habryka (habryka4) · 2024-02-28T17:53:34.997Z · LW(p) · GW(p)

Technically the optimal way to spend your points to influence the vote outcome is to center them (i.e. have the mean be zero). In practice this means giving a -1 to lots of posts. It doesn't provide much of an advantage, but I vaguely remember some people saying they did it, which IMO would explain there being some very small number of negative votes on everything.

comment by Neil (neil-warren) · 2024-02-29T11:00:00.412Z · LW(p) · GW(p)

The new designs are cool, I'd just be worried about venturing too far into insight porn. You don't want people reading the posts just because they like how they look (although reading them superficially is probably better than not reading them at all). Clicking on the posts and seeing a giant image that bleeds color into the otherwise sober text format is distracting.

I guess if I don't like it there's always GreaterWrong.
