LessWrong 2.0 Reader

“Sharp Left Turn” discourse: An opinionated review
Steven Byrnes (steve2152) · 2025-01-28T18:47:04.395Z · comments (4)
Ten people on the inside
Buck · 2025-01-28T16:41:22.990Z · comments (16)
Planning for Extreme AI Risks
joshc (joshua-clymer) · 2025-01-29T18:33:14.844Z · comments (2)
The Game Board has been Flipped: Now is a good time to rethink what you’re doing
Alex Lintz (alex-lintz) · 2025-01-28T23:36:18.106Z · comments (18)
[link] Paper: Open Problems in Mechanistic Interpretability
Lee Sharkey (Lee_Sharkey) · 2025-01-29T10:25:54.727Z · comments (0)
Fake thinking and real thinking
Joe Carlsmith (joekc) · 2025-01-28T20:05:06.735Z · comments (2)
DeepSeek Panic at the App Store
Zvi · 2025-01-28T19:30:07.555Z · comments (13)
[link] Dario Amodei: On DeepSeek and Export Controls
Zach Stein-Perlman · 2025-01-29T17:15:18.986Z · comments (2)
Operator
Zvi · 2025-01-28T20:00:08.374Z · comments (1)
[link] Reinforcement Learning by AI Punishment
Abhishaike Mahajan (abhishaike-mahajan) · 2025-01-28T00:57:51.715Z · comments (0)
DeepSeek: Lemon, It’s Wednesday
Zvi · 2025-01-29T15:00:07.914Z · comments (0)
[link] Anthropic CEO calls for RSI
Andrea_Miotti (AndreaM) · 2025-01-29T16:54:24.943Z · comments (8)
Efficiency spectra and “bucket of circuits” cartoons
Dmitry Vaintrob (dmitry-vaintrob) · 2025-01-29T15:06:50.768Z · comments (0)
SAE regularization produces more interpretable models
Peter Lai (peter-lai) · 2025-01-28T20:02:56.662Z · comments (2)
The memorization-generalization spectrum and learning coefficients
Dmitry Vaintrob (dmitry-vaintrob) · 2025-01-28T16:53:24.628Z · comments (0)
[link] Fertility Will Never Recover
Eneasz · 2025-01-30T01:16:41.332Z · comments (4)
Nvidia doesn’t just sell shovels
winstonBosan · 2025-01-28T04:56:38.720Z · comments (4)
[question] Those of you with lots of meditation experience: How did it influence your understanding of philosophy of mind and topics such as qualia?
SpectrumDT · 2025-01-28T14:29:47.034Z · answers+comments (13)
Learn to Develop Your Advantage
ReverendBayes (vedernikov-andrei) · 2025-01-29T22:06:00.641Z · comments (0)
Detecting out of distribution text with surprisal and entropy
Sandy Fraser (alex-fraser) · 2025-01-28T18:46:46.977Z · comments (3)
Revealing alignment faking with a single prompt
Florian_Dietz · 2025-01-29T21:01:15.000Z · comments (4)
My Mental Model of AI Optimist Opinions
tailcalled · 2025-01-29T18:44:36.485Z · comments (1)
[question] Does the ChatGPT (web)app sometimes show actual o1 CoTs now?
Sohaib Imran (sohaib-imran) · 2025-01-29T17:27:08.067Z · answers+comments (6)
[link] Whereby: The Zoom alternative you probably haven't heard of
Itay Dreyfus (itay-dreyfus) · 2025-01-29T13:01:08.564Z · comments (0)
[question] Why not train reasoning models with RLHF?
CBiddulph (caleb-biddulph) · 2025-01-30T07:58:35.742Z · answers+comments (0)
Positive jailbreaks in LLMs
dereshev · 2025-01-29T08:41:44.680Z · comments (0)
[link] Constitutions for ASI?
ukc10014 · 2025-01-28T16:32:39.307Z · comments (0)
Allegory of the Tsunami
Evan Hu (evan-hu) · 2025-01-29T19:09:33.761Z · comments (0)
Using an LLM for creative writing feels wrong to me
Declan Molony (declan-molony) · 2025-01-28T06:42:24.799Z · comments (13)
Detailed Ideal World Benchmark
Knight Lee (Max Lee) · 2025-01-30T02:31:39.852Z · comments (0)
Untrusted monitoring insights from watching ChatGPT play coordination games
jwfiredragon · 2025-01-29T04:53:33.125Z · comments (0)
How *exactly* can AI take your job in the next few years?
Ansh Juneja (ansh-juneja) · 2025-01-30T02:33:13.475Z · comments (0)
[question] Whose track record of AI predictions would you like to see evaluated?
Jonny Spicer (jonnyspicer) · 2025-01-29T12:05:30.311Z · answers+comments (0)
Should Art Carry the Weight of Shaping our Values?
Krishna Maneesha Dendukuri (krishna_maneesha-d) · 2025-01-28T18:43:32.517Z · comments (0)
Reconceptualizing the Nothingness and Existence
Htarlov (htarlov) · 2025-01-28T20:29:44.390Z · comments (0)
Absorbing Your Friends' Powers
Alice Blair (Diatom) · 2025-01-30T02:32:27.091Z · comments (0)
[link] Predation as Payment for Criticism
Benquo · 2025-01-30T01:06:27.591Z · comments (2)
Gettier cases, Rigid Designators, and Referential Opacity
Antigone (luke-st-clair) · 2025-01-28T18:46:10.180Z · comments (0)
The Road to Evil Is Paved with Good Objectives: Framework to Classify and Fix Misalignments.
Shivam · 2025-01-30T02:44:47.907Z · comments (0)
All pigeons are ugly!
Eris (anton-zheltoukhov) · 2025-01-28T15:18:25.507Z · comments (2)
Rational Utopia
ank · 2025-01-29T14:16:09.862Z · comments (2)