LessWrong 2.0 Reader

A Poem Is All You Need: Jailbreaking ChatGPT, Meta & More
Sharat Jacob Jacob (sharat-jacob-jacob) · 2024-10-29T12:41:30.337Z · comments (0)
Learn to Develop Your Advantage
ReverendBayes (vedernikov-andrei) · 2025-01-29T22:06:00.641Z · comments (0)
LLM Psychometrics and Prompt-Induced Psychopathy
Korbinian K. (korbinian-koch) · 2024-10-18T18:11:24.256Z · comments (2)
[link] Looking for humanness in the world wide social
Itay Dreyfus (itay-dreyfus) · 2025-01-15T14:50:54.966Z · comments (0)
Conversational Signposts—An Antidote to Dull Social Interactions
Declan Molony (declan-molony) · 2024-10-22T05:37:56.175Z · comments (6)
Substituting Talkbox for Breath Controller
jefftk (jkaufman) · 2024-10-27T19:10:03.768Z · comments (0)
[link] LLMs Do Not Think Step-by-step In Implicit Reasoning
Bogdan Ionut Cirstea (bogdan-ionut-cirstea) · 2024-11-28T09:16:57.463Z · comments (0)
Reward Bases: A simple mechanism for adaptive acquisition of multiple reward types
Bogdan Ionut Cirstea (bogdan-ionut-cirstea) · 2024-11-23T12:45:01.067Z · comments (0)
Rethinking Laplace's Rule of Succession
Cleo Nardo (strawberry calm) · 2024-11-22T18:46:25.156Z · comments (5)
What does success look like?
Raymond D · 2025-01-23T17:48:35.618Z · comments (0)
[question] Where should one post to get into the training data?
keltan · 2025-01-15T00:41:19.405Z · answers+comments (4)
Updating the NAO Simulator
jefftk (jkaufman) · 2024-10-30T13:50:06.908Z · comments (0)
[link] The Computational Complexity of Circuit Discovery for Inner Interpretability
Bogdan Ionut Cirstea (bogdan-ionut-cirstea) · 2024-10-17T13:18:46.378Z · comments (2)
[link] Progress links and short notes, 2024-12-27: Clinical trial abundance, grid-scale fusion, permitting vs. compliance, crossword mania, and more
jasoncrawford · 2024-12-27T23:34:43.807Z · comments (0)
[link] How to Do a PhD (in AI Safety)
Lewis Hammond (lewis-hammond-1) · 2025-01-05T16:57:35.409Z · comments (0)
7. Iterate the Game: Racing Where?
Allison Duettmann (allison-duettmann) · 2025-01-02T19:06:22.165Z · comments (0)
LDT (and everything else) can be irrational
Christopher King (christopher-king) · 2024-11-06T04:05:36.932Z · comments (7)
[link] Picking favourites is hard
dkl9 · 2024-12-04T20:46:47.470Z · comments (3)
Spooky Recommendation System Scaling
phdead · 2024-10-31T22:00:51.728Z · comments (0)
My Mental Model of AI Optimist Opinions
tailcalled · 2025-01-29T18:44:36.485Z · comments (2)
Apply now to SPAR!
agucova · 2024-12-19T22:29:58.963Z · comments (0)
[link] Uncontrollable: A Surprisingly Good Introduction to AI Risk
PeterMcCluskey · 2025-01-24T04:30:37.499Z · comments (0)
Last Line of Defense: Minimum Viable Shelters for Mirror Bacteria
Ulrik Horn (ulrik-horn) · 2024-12-21T08:28:14.860Z · comments (25)
Rethink Wellbeing’s Year 2 Update: Foster Sustainable High Performance for Ambitious Altruists
Inga G. (inga-g) · 2024-12-08T14:32:39.902Z · comments (1)
Alignment ideas
qbolec · 2025-01-18T12:43:49.384Z · comments (1)
Untrusted monitoring insights from watching ChatGPT play coordination games
jwfiredragon · 2025-01-29T04:53:33.125Z · comments (0)
[link] OpenAI’s cybersecurity is probably regulated by NIS Regulations
Adam Jones (domdomegg) · 2024-10-25T11:06:38.392Z · comments (2)
[link] My Mental Model of AI Creativity – Creativity Kiki
Adam Newgas (BorisTheBrave) · 2024-12-09T22:24:23.096Z · comments (0)
How I'd like alignment to get done (as of 2024-10-18)
TristanTrim · 2024-10-18T23:39:03.107Z · comments (4)
[link] Forecast With GiveWell
ChristianWilliams · 2024-12-11T17:52:32.293Z · comments (0)
[question] How counterfactual are logical counterfactuals?
Donald Hobson (donald-hobson) · 2024-12-15T21:16:40.515Z · answers+comments (10)
Launching Applications for the Global AI Safety Fellowship 2025!
Aditya_SK (team-ai-safety) · 2024-11-30T14:02:16.537Z · comments (5)
Contra Dances Getting Shorter and Earlier
jefftk (jkaufman) · 2025-01-23T23:30:03.595Z · comments (0)
[Cross-post] Welcome to the Essay Meta
davekasten · 2025-01-16T23:36:49.152Z · comments (2)
Panology
JenniferRM · 2024-12-23T21:40:14.540Z · comments (8)
Fundamental Uncertainty: Chapter 9 - How do we live with uncertainty?
Gordon Seidoh Worley (gworley) · 2024-11-07T18:15:45.049Z · comments (2)
[link] Anthropic - The case for targeted regulation
anaguma · 2024-11-05T07:07:48.174Z · comments (0)
[link] The Philosophical Glossary of AI
David Gross (David_Gross) · 2025-01-14T17:36:37.241Z · comments (0)
The Three Warnings of the Zentradi
Trevor Hill-Hand (Jadael) · 2024-11-21T20:28:45.567Z · comments (1)
Doing a self-randomized study of the impacts of glycine on sleep (Science is hard)
thedissonance.net · 2025-01-17T18:49:30.989Z · comments (5)
Do you need a better map of your myriad of maps to the territory?
CstineSublime · 2024-12-24T02:00:30.426Z · comments (2)
Orange and Strawberry Truffles
jefftk (jkaufman) · 2025-01-05T01:50:01.587Z · comments (1)
Low Temperature Solomonoff Induction
dil-leik-og (samuel-buteau) · 2024-12-06T18:55:08.948Z · comments (4)
Why We Wouldn't Build Aligned AI Even If We Could
Snowyiu · 2024-11-16T20:19:59.324Z · comments (7)
Fundamental Uncertainty: Epilogue
Gordon Seidoh Worley (gworley) · 2024-11-16T00:57:48.823Z · comments (0)
[link] Proposing the Conditional AI Safety Treaty (linkpost TIME)
otto.barten (otto-barten) · 2024-11-15T13:59:01.050Z · comments (8)
Proactive 'If-Then' Safety Cases
Nathan Helm-Burger (nathan-helm-burger) · 2024-11-18T21:16:37.237Z · comments (0)
Americans are fat and sick—and it’s their fault…right?
Declan Molony (declan-molony) · 2024-11-19T06:41:36.648Z · comments (6)
The Quantum Mars Teleporter: An Empirical Test Of Personal Identity Theories
avturchin · 2025-01-22T11:48:46.071Z · comments (18)
[question] Has Someone Checked The Cold-Water-In-Left-Ear Thing?
Maloew (maloew-valenar) · 2024-12-28T20:15:35.951Z · answers+comments (0)