LessWrong 2.0 Reader

View: New · Old · Top

Restrict date range: Today · This week · This month · Last three months · This year · All time

← previous page (newer posts) · next page (older posts) →

You can remove GPT2’s LayerNorm by fine-tuning for an hour
StefanHex (Stefan42) · 2024-08-08T18:33:38.803Z · comments (11)
o1 is a bad idea
abramdemski · 2024-11-11T21:20:24.892Z · comments (38)
[link] Masterpiece
Richard_Ngo (ricraz) · 2024-02-13T23:10:35.376Z · comments (21)
And All the Shoggoths Merely Players
Zack_M_Davis · 2024-02-10T19:56:59.513Z · comments (57)
[link] Connecting the Dots: LLMs can Infer & Verbalize Latent Structure from Training Data
Johannes Treutlein (Johannes_Treutlein) · 2024-06-21T15:54:41.430Z · comments (13)
DeepMind's "​​Frontier Safety Framework" is weak and unambitious
Zach Stein-Perlman · 2024-05-18T03:00:13.541Z · comments (14)
Most People Don't Realize We Have No Idea How Our AIs Work
Thane Ruthenis · 2023-12-21T20:02:00.360Z · comments (42)
[link] Succession
Richard_Ngo (ricraz) · 2023-12-20T19:25:03.185Z · comments (48)
[link] Making every researcher seek grants is a broken model
jasoncrawford · 2024-01-26T16:06:26.688Z · comments (41)
Neutrality
sarahconstantin · 2024-11-13T23:10:05.469Z · comments (27)
[link] o1: A Technical Primer
Jesse Hoogland (jhoogland) · 2024-12-09T19:09:12.413Z · comments (17)
EIS XIII: Reflections on Anthropic’s SAE Research Circa May 2024
scasper · 2024-05-21T20:15:36.502Z · comments (16)
Ironing Out the Squiggles
Zack_M_Davis · 2024-04-29T16:13:00.371Z · comments (36)
Deep Honesty
Aletheophile (aletheo) · 2024-05-07T20:31:48.734Z · comments (25)
[link] Gradient Routing: Masking Gradients to Localize Computation in Neural Networks
cloud · 2024-12-06T22:19:26.717Z · comments (12)
Language Models Model Us
eggsyntax · 2024-05-17T21:00:34.821Z · comments (55)
What’s up with LLMs representing XORs of arbitrary features?
Sam Marks (samuel-marks) · 2024-01-03T19:44:33.162Z · comments (61)
Current safety training techniques do not fully transfer to the agent setting
Simon Lermen (dalasnoin) · 2024-11-03T19:24:51.537Z · comments (8)
Formal verification, heuristic explanations and surprise accounting
Jacob_Hilton · 2024-06-25T15:40:03.535Z · comments (11)
[question] things that confuse me about the current AI market.
DMMF · 2024-08-28T13:46:56.908Z · answers+comments (28)
o3
Zach Stein-Perlman · 2024-12-20T18:30:29.448Z · comments (155)
Dyslucksia
Shoshannah Tekofsky (DarkSym) · 2024-05-09T19:21:33.874Z · comments (45)
The Incredible Fentanyl-Detecting Machine
sarahconstantin · 2024-06-28T22:10:01.223Z · comments (26)
Tips for Empirical Alignment Research
Ethan Perez (ethan-perez) · 2024-02-29T06:04:54.481Z · comments (4)
Apologizing is a Core Rationalist Skill
johnswentworth · 2024-01-02T17:47:35.950Z · comments (42)
OpenAI: Exodus
Zvi · 2024-05-20T13:10:03.543Z · comments (26)
My takes on SB-1047
leogao · 2024-09-09T18:38:37.799Z · comments (8)
[link] Arithmetic is an underrated world-modeling technology
dynomight · 2024-10-17T14:00:22.475Z · comments (33)
The Plan - 2023 Version
johnswentworth · 2023-12-29T23:34:19.651Z · comments (40)
"It's a 10% chance which I did 10 times, so it should be 100%"
egor.timatkov · 2024-11-18T01:14:27.738Z · comments (57)
[link] Daniel Dennett has died (1942-2024)
kave · 2024-04-19T16:17:04.742Z · comments (5)
Deep atheism and AI risk
Joe Carlsmith (joekc) · 2024-01-04T18:58:47.745Z · comments (22)
2023 Survey Results
Screwtape · 2024-02-16T22:24:28.132Z · comments (26)
Subskills of "Listening to Wisdom"
Raemon · 2024-12-09T03:01:18.706Z · comments (19)
[link] If you weren't such an idiot...
kave · 2024-03-02T00:01:37.314Z · comments (74)
Priors and Prejudice
MathiasKB (MathiasKirkBonde) · 2024-04-22T15:00:41.782Z · comments (31)
[link] Using axis lines for good or evil
dynomight · 2024-03-06T14:47:10.989Z · comments (39)
A Rocket–Interpretability Analogy
plex (ete) · 2024-10-21T13:55:18.184Z · comments (31)
[link] Vernor Vinge, who coined the term "Technological Singularity", dies at 79
Kaj_Sotala · 2024-03-21T22:14:14.699Z · comments (25)
On Devin
Zvi · 2024-03-18T13:20:04.779Z · comments (34)
Discussion: Challenges with Unsupervised LLM Knowledge Discovery
Seb Farquhar · 2023-12-18T11:58:39.379Z · comments (21)
Liability regimes for AI
Ege Erdil (ege-erdil) · 2024-08-19T01:25:01.006Z · comments (34)
Leading The Parade
johnswentworth · 2024-01-31T22:39:56.499Z · comments (31)
Some (problematic) aesthetics of what constitutes good work in academia
Steven Byrnes (steve2152) · 2024-03-11T17:47:28.835Z · comments (12)
OpenAI o1
Zach Stein-Perlman · 2024-09-12T17:30:31.958Z · comments (41)
OpenAI #10: Reflections
Zvi · 2025-01-07T17:00:07.348Z · comments (6)
[link] Decomposing Agency — capabilities without desires
owencb · 2024-07-11T09:38:48.509Z · comments (32)
Repeal the Jones Act of 1920
Zvi · 2024-11-27T15:00:06.801Z · comments (23)
LLMs for Alignment Research: a safety priority?
abramdemski · 2024-04-04T20:03:22.484Z · comments (24)
[link] Stanislav Petrov Quarterly Performance Review
Ricki Heicklen (bayesshammai) · 2024-09-26T21:20:11.646Z · comments (3)
← previous page (newer posts) · next page (older posts) →