2022 (and All Time) Posts by Pingback Count

post by Raemon · 2023-12-16T21:17:00.572Z · LW · GW · 14 comments

For the past couple of years I've wished LessWrong had a "sort posts by number of pingbacks, or, ideally, by total karma of pingbacks" option. I particularly wished for this during the Annual Review, where "which posts got cited the most?" seemed like a useful thing to track for potential hidden gems.

We still haven't built a full-fledged feature for this, but I just ran a query against the database, and made it into a spreadsheet, which you can view here:

LessWrong 2022 Posts by Pingbacks

Here are the top 100 posts, sorted by Total Pingback Karma:

| Title/Link | Post Karma | Pingback Count | Total Pingback Karma | Avg Pingback Karma |
|---|---|---|---|---|
| AGI Ruin: A List of Lethalities | 870 | 158 | 12,484 | 79 |
| MIRI announces new "Death With Dignity" strategy | 334 | 73 | 8,134 | 111 |
| A central AI alignment problem: capabilities generalization, and the sharp left turn | 273 | 96 | 7,704 | 80 |
| Without specific countermeasures, the easiest path to transformative AI likely leads to AI takeover | 367 | 83 | 5,123 | 62 |
| Reward is not the optimization target | 341 | 62 | 4,493 | 72 |
| A Mechanistic Interpretability Analysis of Grokking | 367 | 48 | 3,450 | 72 |
| How To Go From Interpretability To Alignment: Just Retarget The Search | 167 | 45 | 3,374 | 75 |
| On how various plans miss the hard bits of the alignment challenge | 292 | 40 | 3,288 | 82 |
| [Intro to brain-like-AGI safety] 3. Two subsystems: Learning & Steering | 79 | 36 | 3,023 | 84 |
| How likely is deceptive alignment? | 101 | 47 | 2,907 | 62 |
| The shard theory of human values | 238 | 42 | 2,843 | 68 |
| Mysteries of mode collapse | 279 | 32 | 2,842 | 89 |
| [Intro to brain-like-AGI safety] 2. “Learning from scratch” in the brain | 57 | 30 | 2,731 | 91 |
| Why Agent Foundations? An Overly Abstract Explanation | 285 | 42 | 2,730 | 65 |
| A Longlist of Theories of Impact for Interpretability | 124 | 26 | 2,589 | 100 |
| How might we align transformative AI if it’s developed very soon? | 136 | 32 | 2,351 | 73 |
| A transparency and interpretability tech tree | 148 | 31 | 2,343 | 76 |
| Discovering Language Model Behaviors with Model-Written Evaluations | 100 | 19 | 2,336 | 123 |
| A note about differential technological development | 185 | 20 | 2,270 | 114 |
| Causal Scrubbing: a method for rigorously testing interpretability hypotheses [Redwood Research] | 195 | 35 | 2,267 | 65 |
| Supervise Process, not Outcomes | 132 | 25 | 2,262 | 90 |
| Shard Theory: An Overview | 157 | 28 | 2,019 | 72 |
| Epistemological Vigilance for Alignment | 61 | 21 | 2,008 | 96 |
| A shot at the diamond-alignment problem | 92 | 23 | 1,848 | 80 |
| Where I agree and disagree with Eliezer | 862 | 27 | 1,836 | 68 |
| Brain Efficiency: Much More than You Wanted to Know | 201 | 27 | 1,807 | 67 |
| Refine: An Incubator for Conceptual Alignment Research Bets | 143 | 21 | 1,793 | 85 |
| Externalized reasoning oversight: a research direction for language model alignment | 117 | 28 | 1,788 | 64 |
| Humans provide an untapped wealth of evidence about alignment | 186 | 19 | 1,647 | 87 |
| Six Dimensions of Operational Adequacy in AGI Projects | 298 | 20 | 1,607 | 80 |
| How "Discovering Latent Knowledge in Language Models Without Supervision" Fits Into a Broader Alignment Scheme | 240 | 16 | 1,575 | 98 |
| Godzilla Strategies | 137 | 17 | 1,573 | 93 |
| (My understanding of) What Everyone in Technical Alignment is Doing and Why | 411 | 23 | 1,530 | 67 |
| Two-year update on my personal AI timelines | 287 | 18 | 1,530 | 85 |
| [Intro to brain-like-AGI safety] 15. Conclusion: Open problems, how to help, AMA | 90 | 16 | 1,482 | 93 |
| [Intro to brain-like-AGI safety] 6. Big picture of motivation, decision-making, and RL | 66 | 25 | 1,460 | 58 |
| Human values & biases are inaccessible to the genome | 90 | 14 | 1,450 | 104 |
| You Are Not Measuring What You Think You Are Measuring | 350 | 21 | 1,449 | 69 |
| Open Problems in AI X-Risk [PAIS #5] | 59 | 14 | 1,446 | 103 |
| [Intro to brain-like-AGI safety] 1. What's the problem & Why work on it now? | 146 | 25 | 1,407 | 56 |
| Conditioning Generative Models | 24 | 11 | 1,362 | 124 |
| Conjecture: Internal Infohazard Policy | 132 | 14 | 1,340 | 96 |
| A challenge for AGI organizations, and a challenge for readers | 299 | 18 | 1,336 | 74 |
| Superintelligent AI is necessary for an amazing future, but far from sufficient | 132 | 11 | 1,335 | 121 |
| Optimality is the tiger, and agents are its teeth | 288 | 14 | 1,319 | 94 |
| Let’s think about slowing down AI | 522 | 17 | 1,273 | 75 |
| Niceness is unnatural | 121 | 12 | 1,263 | 105 |
| Announcing the Alignment of Complex Systems Research Group | 91 | 11 | 1,247 | 113 |
| [Intro to brain-like-AGI safety] 13. Symbol grounding & human social instincts | 67 | 23 | 1,243 | 54 |
| ELK prize results | 135 | 17 | 1,235 | 73 |
| Abstractions as Redundant Information | 64 | 18 | 1,216 | 68 |
| [Link] A minimal viable product for alignment | 53 | 12 | 1,184 | 99 |
| Acceptability Verification: A Research Agenda | 50 | 11 | 1,182 | 107 |
| What an actually pessimistic containment strategy looks like | 647 | 16 | 1,168 | 73 |
| Let's See You Write That Corrigibility Tag | 120 | 10 | 1,161 | 116 |
| chinchilla's wild implications | 403 | 18 | 1,151 | 64 |
| Worlds Where Iterative Design Fails | 185 | 17 | 1,122 | 66 |
| why assume AGIs will optimize for fixed goals? | 138 | 14 | 1,103 | 79 |
| Gradient hacking: definitions and examples | 38 | 11 | 1,079 | 98 |
| Contra shard theory, in the context of the diamond maximizer problem | 101 | 6 | 1,073 | 179 |
| We Are Conjecture, A New Alignment Research Startup | 197 | 8 | 1,050 | 131 |
| Circumventing interpretability: How to defeat mind-readers | 109 | 11 | 1,047 | 95 |
| Evolution is a bad analogy for AGI: inner alignment | 73 | 7 | 1,043 | 149 |
| Refining the Sharp Left Turn threat model, part 1: claims and mechanisms | 82 | 8 | 1,042 | 130 |
| MATS Models | 86 | 8 | 1,035 | 129 |
| Common misconceptions about OpenAI | 239 | 11 | 1,028 | 93 |
| Prizes for ELK proposals | 143 | 20 | 1,022 | 51 |
| Current themes in mechanistic interpretability research | 88 | 9 | 1,014 | 113 |
| Discovering Agents | 71 | 13 | 994 | 76 |
| [Intro to brain-like-AGI safety] 12. Two paths forward: “Controlled AGI” and “Social-instinct AGI” | 42 | 15 | 992 | 66 |
| What's General-Purpose Search, And Why Might We Expect To See It In Trained ML Systems? | 118 | 24 | 988 | 41 |
| Inner and outer alignment decompose one hard problem into two extremely hard problems | 115 | 17 | 959 | 56 |
| Threat Model Literature Review | 73 | 13 | 953 | 73 |
| Language models seem to be much better than humans at next-token prediction | 172 | 11 | 952 | 87 |
| Will Capabilities Generalise More? | 122 | 7 | 952 | 136 |
| Pivotal outcomes and pivotal processes | 91 | 8 | 938 | 117 |
| Conditioning Generative Models for Alignment | 56 | 9 | 934 | 104 |
| Training goals for large language models | 28 | 9 | 930 | 103 |
| It’s Probably Not Lithium | 441 | 5 | 929 | 186 |
| Latent Adversarial Training | 40 | 11 | 914 | 83 |
| “Pivotal Act” Intentions: Negative Consequences and Fallacious Arguments | 129 | 11 | 913 | 83 |
| Conditioning Generative Models with Restrictions | 18 | 5 | 913 | 183 |
| The alignment problem from a deep learning perspective | 97 | 8 | 910 | 114 |
| Instead of technical research, more people should focus on buying time | 100 | 15 | 904 | 60 |
| By Default, GPTs Think In Plain Sight | 84 | 9 | 903 | 100 |
| [Intro to brain-like-AGI safety] 4. The “short-term predictor” | | | | |
| Don't leave your fingerprints on the future | 109 | 11 | 890 | 81 |
| Strategy For Conditioning Generative Models | 31 | 5 | 883 | 177 |
| Call For Distillers | 204 | 19 | 878 | 46 |
| Thoughts on AGI organizations and capabilities work | 102 | 5 | 871 | 174 |
| Optimization at a Distance | 87 | 9 | 868 | 96 |
| [Intro to brain-like-AGI safety] 5. The “long-term predictor”, and TD learning | 52 | 17 | 859 | 51 |
| What does it take to defend the world against out-of-control AGIs? | 180 | 11 | 853 | 78 |
| Monitoring for deceptive alignment | 135 | 11 | 851 | 77 |
| Late 2021 MIRI Conversations: AMA / Discussion | 119 | 8 | 849 | 106 |
| How to Diversify Conceptual Alignment: the Model Behind Refine | 87 | 27 | 845 | 31 |
| wrapper-minds are the enemy | 103 | 8 | 833 | 104 |
| But is it really in Rome? An investigation of the ROME model editing technique | 102 | 8 | 833 | 104 |
| An Open Agency Architecture for Safe Transformative AI | 74 | 12 | 831 | 69 |
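
For readers who want to reproduce columns like these from their own data, here is a minimal sketch in Python. It is not the actual query that was run against the LessWrong database (that query is not shown in this post). It assumes a hypothetical `posts` mapping from post ID to karma and a hypothetical `links` list of (linking post, linked post) pairs, and it assumes the average is simply total pingback karma divided by pingback count, which is roughly consistent with the numbers above.

```python
from collections import defaultdict


def pingback_stats(posts, links):
    """Compute per-post pingback columns.

    posts: dict mapping post_id -> karma (hypothetical input shape).
    links: iterable of (source_id, target_id) pairs, meaning
           "source links to target" (hypothetical input shape).
    """
    # Collect the distinct posts that link to each target, so a post
    # that links to the same target twice is only counted once.
    sources = defaultdict(set)
    for source, target in links:
        if source != target and source in posts and target in posts:
            sources[target].add(source)

    rows = []
    for target, linking_posts in sources.items():
        total = sum(posts[s] for s in linking_posts)
        rows.append({
            "post": target,
            "post_karma": posts[target],
            "pingback_count": len(linking_posts),
            "total_pingback_karma": total,
            "avg_pingback_karma": round(total / len(linking_posts)),
        })

    # Sort the same way as the table: highest total pingback karma first.
    rows.sort(key=lambda r: r["total_pingback_karma"], reverse=True)
    return rows
```

Counting distinct linking posts (rather than raw links) is a design choice of this sketch; the real spreadsheet may count differently.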


Comments sorted by top scores.

comment by jessicata (jessica.liu.taylor) · 2023-12-17T17:48:44.752Z · LW(p) · GW(p)

I have to look for a while before finding any non-AI posts. Seems LW is mainly an AI / alignment discussion forum at this point.

Replies from: ryan_greenblatt, habryka4
comment by ryan_greenblatt · 2023-12-17T19:39:51.721Z · LW(p) · GW(p)

It seems more informative to just look at top (inflation adjusted) karma for 2022 (similar to what habryka noted in the sibling). AI posts in bold.

  • AGI Ruin: A List of LethalitiesΩ
  • Where I agree and disagree with EliezerΩ
  • SimulatorsΩ
  • What an actually pessimistic containment strategy looks like
  • Let’s think about slowing down AIΩ
  • Luck based medicine: my resentful story of becoming a medical miracle
  • Counter-theses on Sleep
  • Losing the root for the tree
  • The Redaction Machine
  • It Looks Like You're Trying To Take Over The WorldΩ
  • (My understanding of) What Everyone in Technical Alignment is Doing and WhyΩ
  • Counterarguments to the basic AI x-risk caseΩ
  • It’s Probably Not Lithium
  • Reflections on six months of fatherhood
  • chinchilla's wild implicationsΩ
  • You Are Not Measuring What You Think You Are Measuring [AI related]
  • Lies Told To Children
  • What DALL-E 2 can and cannot do
  • Staring into the abyss as a core life skill
  • DeepMind alignment team opinions on AGI ruin argumentsΩ
  • Accounting For College Costs
  • A Mechanistic Interpretability Analysis of GrokkingΩ
  • Models Don't "Get Reward"Ω
  • Why I think strong general AI is coming soon
  • Without specific countermeasures, the easiest path to transformative AI likely leads to AI takeoverΩ
  • Why Agent Foundations? An Overly Abstract ExplanationΩ
  • MIRI announces new "Death With Dignity" strategy
  • Beware boasting about non-existent forecasting track records [AI related]
  • A challenge for AGI organizations, and a challenge for readersΩ

I count 18/29 about AI. A few AI posts are technically more general. A few non-AI posts seem to indirectly be about AI.

comment by habryka (habryka4) · 2023-12-17T18:23:55.060Z · LW(p) · GW(p)

I think the AI posts are definitely substantially more interlinked than the non-AI posts, so I think this specific metric oversamples AI posts.

comment by Raemon · 2023-12-16T21:56:01.285Z · LW(p) · GW(p)

I just updated the spreadsheet to include All Time posts. List of Lethalities is still the winner by Total Pingback Karma, although not by Pingback Count (and this seems at least partially explained by karma inflation).

comment by Steven Byrnes (steve2152) · 2023-12-16T22:04:34.704Z · LW(p) · GW(p)

The list would look pretty different if self-cites were excluded. E.g. my posts would probably all be gone 😂

Replies from: Raemon, Raemon
comment by Raemon · 2023-12-16T22:13:42.938Z · LW(p) · GW(p)

Yeah if I have time today I'll make an "exclude self-cites" column, although fwiw I think the "total pingback karma" is fairly legit even if including self-cites. If your followup work got a lot of karma, I think that's a useful signal about your original post even if you quote yourself liberally.

comment by Raemon · 2023-12-17T01:13:50.016Z · LW(p) · GW(p)

(I've updated it to also show non-author-pingback count)
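
A non-author pingback count could be computed along these lines. This is only a sketch under assumptions: the `authors` mapping and `links` pairs are hypothetical inputs, each post is treated as having a single author (co-authored posts are ignored), and the spreadsheet's actual logic is not shown in this thread.

```python
def non_author_pingbacks(authors, links):
    """Count distinct linking posts written by someone other than the
    linked post's author.

    authors: dict mapping post_id -> author name (hypothetical shape).
    links: iterable of (source_id, target_id) pairs (hypothetical shape).
    """
    seen = set()
    counts = {}
    for source, target in links:
        # Skip duplicate links and posts that link to themselves.
        if (source, target) in seen or source == target:
            continue
        seen.add((source, target))
        # Only count the pingback when the two posts have different authors.
        if authors[source] != authors[target]:
            counts[target] = counts.get(target, 0) + 1
    return counts
```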

comment by Alex_Altair · 2023-12-16T21:46:38.970Z · LW(p) · GW(p)

You guys could compute a kind of Page Rank for LW posts.
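
A minimal sketch of that idea, using plain power iteration over post-to-post links. The `links` input is a hypothetical list of (linking post, linked post) pairs, and the damping factor of 0.85 is the conventional default rather than anything specified here.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Power-iteration PageRank over (source, target) link pairs."""
    nodes = sorted({node for edge in links for node in edge})
    outgoing = {node: set() for node in nodes}
    for source, target in links:
        if source != target:  # ignore posts linking to themselves
            outgoing[source].add(target)

    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Posts with no outgoing links spread their rank evenly over all posts.
        dangling = sum(rank[node] for node in nodes if not outgoing[node])
        new_rank = {
            node: (1 - damping) / n + damping * dangling / n for node in nodes
        }
        # Every other post splits its rank equally among the posts it links to.
        for source in nodes:
            for target in outgoing[source]:
                new_rank[target] += damping * rank[source] / len(outgoing[source])
        rank = new_rank
    return rank
```

Unlike a raw pingback count, this weights a citation more heavily when the citing post is itself widely cited.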

comment by Said Achmiz (SaidAchmiz) · 2023-12-17T06:35:15.897Z · LW(p) · GW(p)

What is “pingback karma”…?

Replies from: Mo Nastri
comment by Mo Putera (Mo Nastri) · 2023-12-17T14:02:18.926Z · LW(p) · GW(p)

Karma of posts linking to the post in question, I think.

comment by Yoav Ravid · 2023-12-16T21:29:40.826Z · LW(p) · GW(p)

Does pingback count mean the number of LW comments or posts that linked to it?

Replies from: Raemon
comment by Raemon · 2023-12-16T21:30:34.899Z · LW(p) · GW(p)

The current version is just posts. It gets a little more complicated sorting out the comments.

Replies from: Yoav Ravid
comment by Yoav Ravid · 2023-12-16T21:33:48.284Z · LW(p) · GW(p)

Oh wow, so list of lethalities was linked to in 158 posts. That's a lot!

comment by Vlad Sitalo (harcisis) · 2023-12-17T19:25:16.708Z · LW(p) · GW(p)

Would love for the spreadsheet to include tags to further simplify filtering.