LessWrong 2.0 Reader

[link] My hour of memoryless lucidity
Eric Neyman (UnexpectedValues) · 2024-05-04T01:40:56.717Z · comments (23)
[link] Ilya Sutskever and Jan Leike resign from OpenAI
Zach Stein-Perlman · 2024-05-15T00:45:02.436Z · comments (76)
[link] Introducing AI Lab Watch
Zach Stein-Perlman · 2024-04-30T17:00:12.652Z · comments (25)
Mechanistically Eliciting Latent Behaviors in Language Models
Andrew Mack (andrew-mack) · 2024-04-30T18:51:13.493Z · comments (37)
Ironing Out the Squiggles
Zack_M_Davis · 2024-04-29T16:13:00.371Z · comments (34)
Dyslucksia
Shoshannah Tekofsky (DarkSym) · 2024-05-09T19:21:33.874Z · comments (42)
Deep Honesty
Aletheophile (aletheo) · 2024-05-07T20:31:48.734Z · comments (26)
Do you believe in hundred dollar bills lying on the ground? Consider humming
Elizabeth (pktechgirl) · 2024-05-16T00:00:05.257Z · comments (10)
[question] Which skincare products are evidence-based?
Vanessa Kosoy (vanessa-kosoy) · 2024-05-02T15:22:12.597Z · answers+comments (43)
[link] introduction to cancer vaccines
bhauth · 2024-05-05T01:06:16.972Z · comments (19)
Why I'm doing PauseAI
Joseph Miller (Josephm) · 2024-04-30T16:21:54.156Z · comments (16)
DeepMind's "Frontier Safety Framework" is weak and unambitious
Zach Stein-Perlman · 2024-05-18T03:00:13.541Z · comments (8)
Explaining a Math Magic Trick
Robert_AIZI · 2024-05-05T19:41:52.048Z · comments (10)
Key takeaways from our EA and alignment research surveys
Cameron Berg (cameron-berg) · 2024-05-03T18:10:41.416Z · comments (10)
[link] Uncovering Deceptive Tendencies in Language Models: A Simulated Company AI Assistant
Olli Järviniemi (jarviniemi) · 2024-05-06T07:07:05.019Z · comments (4)
We might be missing some key feature of AI takeoff; it'll probably seem like "we could've seen this coming"
Lukas_Gloor · 2024-05-09T15:43:11.490Z · comments (35)
[link] "AI Safety for Fleshy Humans" an AI Safety explainer by Nicky Case
habryka (habryka4) · 2024-05-03T18:10:12.478Z · comments (10)
[link] MIRI's May 2024 Newsletter
Harlan · 2024-05-15T00:13:30.153Z · comments (1)
Teaching CS During Take-Off
andrew carle (andrew-carle) · 2024-05-14T22:45:39.447Z · comments (10)
ACX Covid Origins Post convinced readers
ErnestScribbler · 2024-05-01T13:06:20.818Z · comments (7)
Language Models Model Us
eggsyntax · 2024-05-17T21:00:34.821Z · comments (13)
MATS Winter 2023-24 Retrospective
Rocket (utilistrutil) · 2024-05-11T00:09:17.059Z · comments (28)
Q&A on Proposed SB 1047
Zvi · 2024-05-02T15:10:02.916Z · comments (6)
AXRP Episode 31 - Singular Learning Theory with Daniel Murfet
DanielFilan · 2024-05-07T03:50:05.001Z · comments (4)
[link] Advice for Activists from the History of Environmentalism
Jeffrey Heninger (jeffrey-heninger) · 2024-05-16T18:40:02.064Z · comments (5)
[link] Environmentalism in the United States Is Unusually Partisan
Jeffrey Heninger (jeffrey-heninger) · 2024-05-13T21:23:10.755Z · comments (11)
[link] My thesis (Algorithmic Bayesian Epistemology) explained in more depth
Eric Neyman (UnexpectedValues) · 2024-05-09T19:43:16.543Z · comments (4)
Towards Multimodal Interpretability: Learning Sparse Interpretable Features in Vision Transformers
hugofry · 2024-04-29T20:57:35.127Z · comments (7)
Questions for labs
Zach Stein-Perlman · 2024-04-30T22:15:55.362Z · comments (10)
LessWrong Community Weekend 2024, open for applications
UnplannedCauliflower · 2024-05-01T10:18:21.992Z · comments (2)
Introducing AI-Powered Audiobooks of Rational Fiction Classics
Askwho · 2024-05-04T17:32:49.719Z · comments (13)
AISafety.com – Resources for AI Safety
Søren Elverlin (soren-elverlin-1) · 2024-05-17T15:57:11.712Z · comments (2)
AISC9 has ended and there will be an AISC10
Linda Linsefors · 2024-04-29T10:53:18.812Z · comments (4)
How to be an amateur polyglot
arisAlexis (arisalexis) · 2024-05-08T15:08:11.404Z · comments (16)
[link] DeepMind: Frontier Safety Framework
Zach Stein-Perlman · 2024-05-17T17:30:02.504Z · comments (0)
Transcoders enable fine-grained interpretable circuit analysis for language models
Jacob Dunefsky (jacob-dunefsky) · 2024-04-30T17:58:09.982Z · comments (14)
[link] How do open AI models affect incentive to race?
jessicata (jessica.liu.taylor) · 2024-05-07T00:33:20.658Z · comments (13)
Apply to ESPR & PAIR, Rationality and AI Camps for Ages 16-21
Anna Gajdova (anna-gajdova) · 2024-05-03T12:36:37.610Z · comments (0)
[question] Shane Legg's necessary properties for every AGI Safety plan
jacquesthibs (jacques-thibodeau) · 2024-05-01T17:15:41.233Z · answers+comments (12)
Now THIS is forecasting: understanding Epoch’s Direct Approach
Elliot_Mckernon (elliot) · 2024-05-04T12:06:48.144Z · comments (4)
[link] Questions are usually too cheap
Nathan Young · 2024-05-11T13:00:54.302Z · comments (19)
[link] OpenAI releases GPT-4o, natively interfacing with text, voice and vision
Martín Soto (martinsq) · 2024-05-13T18:50:52.337Z · comments (23)
some thoughts on LessOnline
Raemon · 2024-05-08T23:17:41.372Z · comments (5)
Towards a formalization of the agent structure problem
Alex_Altair · 2024-04-29T20:28:15.190Z · comments (4)
Can we build a better Public Doublecrux?
Raemon · 2024-05-11T19:21:53.326Z · comments (7)
Why Care About Natural Latents?
johnswentworth · 2024-05-09T23:14:30.626Z · comments (3)
[link] Towards Guaranteed Safe AI: A Framework for Ensuring Robust and Reliable AI Systems
Gunnar_Zarncke · 2024-05-16T13:09:39.265Z · comments (4)
Observations on Teaching for Four Weeks
ClareChiaraVincent · 2024-05-06T16:55:59.315Z · comments (14)
Catastrophic Goodhart in RL with KL penalty
Thomas Kwa (thomas-kwa) · 2024-05-15T00:58:20.763Z · comments (7)
[link] Designing for a single purpose
Itay Dreyfus (itay-dreyfus) · 2024-05-07T14:11:22.242Z · comments (12)