Voting Results for the 2023 Review

post by Raemon · 2025-02-06T08:00:37.461Z

Contents

  Reviews
  Weighted Vote Totals
  The Results
  Updates to the Best of LessWrong: Coming Soon

The votes are in for the 2023 Review!

6,264 posts were written in 2023.

662 of them were nominated.

209 of them got at least one review, and a positive review-vote total.

50 of them shall be displayed in the Best of LessWrong, Year 2023.

Reviews

Exactly 100 people wrote reviews, and I found many of them particularly valuable. I want to give some particular shout-outs to Ryan Greenblatt, John Wentworth, and Steve Byrnes.

I found Ryan Greenblatt's reviews to be sort of relentlessly reasonable. He reviewed many posts thoughtfully, clearly stating flaws, value props, and concrete potential improvements. I particularly liked his review of Anthropic's Responsible Scaling Policy & Long-Term Benefit Trust [LW(p) · GW(p)].

I personally had the most direct "worldview update" from John Wentworth's The Case Against AI Control Research. I don't fully agree with John's take, but I found his frame of distinguishing how bad scheming is at different AI stages, and what else needs to be going right at each stage, quite useful.

I was also grateful for Steve Byrnes' extensive, opinionated review of the “Sharp Left Turn” discourse [LW · GW].

Here's a screenshot of the top of the Review Leaderboard. [? · GW] Thanks to everyone who participated.


And, as a reminder, reviews now appear on Spotlight Items at the top of the home page, and on the /bestoflesswrong [? · GW] page.

Weighted Vote Totals

In past years, we calculated the votes based on two groups of users: established users with 1000+ karma (who got 3x the vote weight), and users with less than 1000 karma.

This time, we're replacing that arbitrary cliff with a more granular rule: your review vote gets multiplied by your Strong Vote power. I mentioned we'd be doing something like this in the announcement post [LW · GW], noting:

I haven't been happy with the arbitrary cliff here. It gives more power than I really wanted to allocate to people with 1000 karma, and not enough weight to people who have been around much longer and have demonstrated good judgment. But, karma is still a pretty messy indicator, so I don't want to give too much power to high karma users either.

Ultimately, it seemed that "Strong Vote power [LW · GW]" basically did what we wanted. Very high karma users get more voting power, but each additional point takes roughly twice as much karma as the previous point, which limits how extreme the difference can get. Users pretty quickly ramp up to around 6x voting power. Most long-term users will have somewhere around 8-9x. A few users get as high as 14x.
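
For rough intuition, here's a minimal sketch in Python of what a weighting rule shaped like this could look like. The `review_vote_weight` function, its thresholds, and its cap are hypothetical illustrations of the "each extra point takes roughly twice as much karma" idea, not the actual LessWrong formula:

```python
import math

def review_vote_weight(karma: int) -> int:
    """Illustrative Strong-Vote-style weighting (hypothetical thresholds).

    Each additional point of weight requires roughly twice as much karma
    as the previous one, so weight grows like log2(karma), and a cap keeps
    the spread between new and very-high-karma users bounded.
    """
    if karma < 10:
        return 1
    return min(1 + int(math.log2(karma / 10)), 14)

# Under these made-up thresholds, a 1,000-karma user gets 7x, and only
# users with very large karma totals approach the 14x cap.
for karma in (100, 1_000, 10_000, 100_000, 500_000):
    print(f"{karma:>7} karma -> {review_vote_weight(karma)}x")
```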

That said, to avoid de-anonymizing votes, on this post I'm displaying the "raw" strength of each vote, before the multiplier is applied.

The Results

Okay. You probably kinda scrolled past all that to get to what you're all here for: what were the best posts of LessWrong 2023, according to you, the people?

408 people voted. 161 people cast the six votes we asked for to earn the Good Citizen Stamp. Here's what they thought:

0 AI Control: Improving Safety Despite Intentional Subversion
1 Focus on the places where you feel shocked everyone's dropping the ball
2 Significantly Enhancing Adult Intelligence With Gene Editing May Be Possible
3 Social Dark Matter
4 Statement on AI Extinction - Signed by AGI Labs, Top Academics, and Many Other Notable Figures
5 Pausing AI Developments Isn't Enough. We Need to Shut it All Down
6 Please don't throw your mind away
7 How much do you believe your results?
8 AI Timelines
9 Basics of Rationalist Discourse
10 Things I Learned by Spending Five Thousand Hours In Non-EA Charities
11 What a compute-centric framework says about AI takeoff speeds
12 How to have Polygenically Screened Children
13 Natural Abstractions: Key claims, Theorems, and Critiques
14 Alignment Implications of LLM Successes: a Debate in One Act
15 Model Organisms of Misalignment: The Case for a New Pillar of Alignment Research
16 Natural Latents: The Math
17 Steering GPT-2-XL by adding an activation vector
18 SolidGoldMagikarp (plus, prompt generation)
19 The ants and the grasshopper
20 Feedbackloop-first Rationality
21 Deep Deceptiveness
22 EA Vegan Advocacy is not truthseeking, and it’s everyone’s problem
23 Davidad's Bold Plan for Alignment: An In-Depth Explanation
24 Fucking Goddamn Basics of Rationalist Discourse
25 Against Almost Every Theory of Impact of Interpretability
26 Guide to rationalist interior decorating
27 New report: "Scheming AIs: Will AIs fake alignment during training in order to get power?"
28 Predictable updating about AI risk
29 [Fiction] A Disneyland Without Children
30 The Talk: a brief explanation of sexual dimorphism
31 "Carefully Bootstrapped Alignment" is organizationally hard
32 Tuning your Cognitive Strategies
33 Enemies vs Malefactors
34 The Parable of the King and the Random Process
35 GPTs are Predictors, not Imitators
36 Labs should be explicit about why they are building AGI
37 Cultivating a state of mind where new ideas are born
38 Discussion with Nate Soares on a key alignment difficulty
39 Loudly Give Up, Don't Quietly Fade
40 We don’t trade with ants
41 Neural networks generalize because of this one weird trick
42 My views on “doom”
43 Change my mind: Veganism entails trade-offs, and health is one of the axes
44 Lessons On How To Get Things Right On The First Try
45 Shallow review of live agendas in alignment & safety
46 The Learning-Theoretic Agenda: Status 2023
47 Improving the Welfare of AIs: A Nearcasted Proposal
48 Book Review: Going Infinite
49 Speaking to Congressional staffers about AI risk
50 Why it's so hard to talk about Consciousness
51 Acausal normalcy
52 Towards Developmental Interpretability
53 Dear Self; we need to talk about ambition
54 Thoughts on “AI is easy to control” by Pope & Belrose
55 Thoughts on sharing information about language model capabilities
56 FixDT
57 My Model Of EA Burnout
58 Accidentally Load Bearing
59 Comp Sci in 2027 (Short story by Eliezer Yudkowsky)
60 The Plan - 2023 Version
61 When is Goodhart catastrophic?
62 Evaluating the historical value misspecification argument
63 Contrast Pairs Drive the Empirical Performance of Contrast Consistent Search (CCS)
64 Bing Chat is blatantly, aggressively misaligned
65 Cyborgism
66 Shell games
67 Ten Levels of AI Alignment Difficulty
68 Introducing Fatebook: the fastest way to make and track predictions
69 Modal Fixpoint Cooperation without Löb's Theorem
70 How to (hopefully ethically) make money off of AGI
71 AI: Practical Advice for the Worried
72 [Valence series] 1. Introduction
73 Towards Monosemanticity: Decomposing Language Models With Dictionary Learning
74 Responses to apparent rationalist confusions about game / decision theory
75 Why and How to Graduate Early [U.S.]
76 Updates and Reflections on Optimal Exercise after Nearly a Decade
77 Getting Started With Naturalism
78 Why Not Subagents?
79 Meta-level adversarial evaluation of oversight techniques might allow robust measurement of their adequacy
80 A case for AI alignment being difficult
81 Consciousness as a conflationary alliance term for intrinsically valued internal experiences
82 The 101 Space You Will Always Have With You
83 Coup probes: Catching catastrophes with probes trained off-policy
84 Preventing Language Models from hiding their reasoning
85 Teleosemantics!
86 My Objections to "We’re All Gonna Die with Eliezer Yudkowsky"
87 Why Not Just Outsource Alignment Research To An AI?
88 Childhoods of exceptional people
89 How to Catch an AI Liar: Lie Detection in Black-Box LLMs by Asking Unrelated Questions
90 The Soul Key
91 Never Drop A Ball
92 Cohabitive Games so Far
93 The salt in pasta water fallacy
94 Untrusted smart models and trusted dumb models
95 Some Rules for an Algebra of Bayes Nets
96 How to Bounded Distrust
97 Yudkowsky vs Hanson on FOOM: Whose Predictions Were Better?
98 Before smart AI, there will be many mediocre or specialized AIs
99 A Golden Age of Building? Excerpts and lessons from Empire State, Pentagon, Skunk Works and SpaceX
100 Apocalypse insurance, and the hardline libertarian take on AI risk
101 A freshman year during the AI midgame: my approach to the next year
102 Aumann-agreement is common
103 grey goo is unlikely
104 Iron deficiencies are very bad and you should treat them
105 Dark Forest Theories
106 Connectomics seems great from an AI x-risk perspective
107 Views on when AGI comes and on strategy to reduce existential risk
108 The 'Neglected Approaches' Approach: AE Studio's Alignment Agenda
109 Thinking By The Clock
110 Killing Socrates
111 You don't get to have cool flaws
112 Noting an error in Inadequate Equilibria
113 Assume Bad Faith
114 Measuring and Improving the Faithfulness of Model-Generated Reasoning
115 When can we trust model evaluations?
116 Benchmarks for Detecting Measurement Tampering [Redwood Research]
117 Learning-theoretic agenda reading list
118 Exercise: Solve "Thinking Physics"
119 Why Simulator AIs want to be Active Inference AIs
120 What I would do if I wasn’t at ARC Evals
121 Evaluations (of new AI Safety researchers) can be noisy
122 The King and the Golem
123 Love, Reverence, and Life
124 Auditing failures vs concentrated failures
125 Elements of Rationalist Discourse
126 Evolution provides no evidence for the sharp left turn
127 Latent variables for prediction markets: motivation, technical guide, and design considerations
128 UFO Betting: Put Up or Shut Up
129 Some background for reasoning about dual-use alignment research
130 Scalable Oversight and Weak-to-Strong Generalization: Compatible approaches to the same problem
131 The Base Rate Times, news through prediction markets
132 Mob and Bailey
133 Pretraining Language Models with Human Preferences
134 Touch reality as soon as possible (when doing machine learning research)
135 POC || GTFO culture as partial antidote to alignment wordcelism
136 A to Z of things
137 OpenAI, DeepMind, Anthropic, etc. should shut down.
138 Model, Care, Execution
139 Symbol/Referent Confusions in Language Model Alignment Experiments
140 Careless talk on US-China AI competition? (and criticism of CAIS coverage)
141 Alexander and Yudkowsky on AGI goals
142 Deception Chess: Game #1
143 Ethodynamics of Omelas
144 Here's Why I'm Hesitant To Respond In More Depth
145 Shutting Down the Lightcone Offices
146 shoes with springs
147 Would You Work Harder In The Least Convenient Possible World?
148 Bayesian Networks Aren't Necessarily Causal
149 One Day Sooner
150 "Publish or Perish" (a quick note on why you should try to make your work legible to existing academic communities)
151 Anthropic's Responsible Scaling Policy & Long-Term Benefit Trust
152 Think carefully before calling RL policies "agents"
153 Going Crazy and Getting Better Again
154 The benevolence of the butcher
155 Nietzsche's Morality in Plain English
156 Cryonics and Regret
157 RSPs are pauses done right
158 Giant (In)scrutable Matrices: (Maybe) the Best of All Possible Worlds
159 Why I’m not into the Free Energy Principle
160 The ‘ petertodd’ phenomenon
161 Which personality traits are real? Stress-testing the lexical hypothesis
162 Green goo is plausible
163 Rational Unilateralists Aren't So Cursed
164 Competitive, Cooperative, and Cohabitive
165 Have Attention Spans Been Declining?
166 Hell is Game Theory Folk Theorems
167 Reducing sycophancy and improving honesty via activation steering
168 Are there cognitive realms?
169 "Rationalist Discourse" Is Like "Physicist Motors"
170 Being at peace with Doom
171 Recreating the caring drive
172 The God of Humanity, and the God of the Robot Utilitarians
173 Agentized LLMs will change the alignment landscape
174 Ability to solve long-horizon tasks correlates with wanting things in the behaviorist sense
175 What's up with "Responsible Scaling Policies"?
176 The Real Fanfic Is The Friends We Made Along The Way
177 New User's Guide to LessWrong
178 We don't understand what happened with culture enough
179 Seth Explains Consciousness
180 [Valence series] 2. Valence & Normativity
181 Fifty Flips
182 So you want to save the world? An account in paladinhood
183 A Playbook for AI Risk Reduction (focused on misaligned AI)
184 Moral Reality Check (a short story)
185 Takeaways from calibration training
186 Efficiency and resource use scaling parity
187 Spaciousness In Partner Dance: A Naturalism Demo
188 [Valence series] 3. Valence & Beliefs
189 Fighting without hope
190 On Tapping Out
191 Beware of Fake Alternatives
192 Truth and Advantage: Response to a draft of "AI safety seems hard to measure"
193 AI #1: Sydney and Bing
194 The shape of AGI: Cartoons and back of envelope
195 Underwater Torture Chambers: The Horror Of Fish Farming
196 Finding Neurons in a Haystack: Case Studies with Sparse Probing
197 How should TurnTrout handle his DeepMind equity situation?
198 When do "brains beat brawn" in Chess? An experiment
199 There are no coherence theorems
200 The Power of High Speed Stupidity
201 Book Review: Consciousness Explained (as the Great Catalyst)
202 Large language models learn to represent the world
203 Ruining an expected-log-money maximizer
204 A problem with the most recently published version of CEV
205 In Defense of Parselmouths
206 Five Worlds of AI (by Scott Aaronson and Boaz Barak)
207 When Omnipotence is Not Enough
208 Why You Should Never Update Your Beliefs
209 Responsible Scaling Policies Are Risk Management Done Wrong
210 A Way To Be Okay
211 "Justice, Cherryl."
212 Large Language Models can Strategically Deceive their Users when Put Under Pressure.
213 [Link] A community alert about Ziz

Updates to the Best of LessWrong: Coming Soon

It'll take a little while to get the results polished with nice art and organization on the /bestoflesswrong page. Stay tuned for a final announcement. (This year it might still take a week or two. Hopefully next year we'll have most of the final polishing fully automated.)
