LessWrong 2.0 Reader

Absorbing Your Friends' Powers
Alice Blair (Diatom) · 2025-01-30T02:32:27.091Z · comments (1)
Metacompilation
Donald Hobson (donald-hobson) · 2025-02-24T22:58:00.085Z · comments (0)
Sleeping Beauty: an Accuracy-based Approach
glauberdebona · 2025-02-10T15:40:29.619Z · comments (2)
[link] The Dilemma’s Dilemma
James Stephen Brown (james-brown) · 2025-02-19T23:50:47.485Z · comments (8)
Post-hoc reasoning in chain of thought
Kyle Cox (klye) · 2025-02-05T18:58:29.802Z · comments (0)
Exploring how OthelloGPT computes its world model
JMaar (jim-maar) · 2025-02-02T21:29:09.433Z · comments (0)
Make Superintelligence Loving
Davey Morse (davey-morse) · 2025-02-21T06:07:17.235Z · comments (9)
One-dimensional vs multi-dimensional features in interpretability
charlieoneill (kingchucky211) · 2025-02-01T09:10:01.112Z · comments (0)
[question] Does human (mis)alignment pose a significant and imminent existential threat?
jr · 2025-02-23T10:03:40.269Z · answers+comments (3)
[question] p(s-risks to contemporary humans)?
mhampton · 2025-02-08T21:19:53.821Z · answers+comments (5)
Proposal for a Form of Conditional Supplemental Income (CSI) in a Post-Work World
sweenesm · 2025-01-31T01:00:55.064Z · comments (2)
[link] On AI Scaling
harsimony · 2025-02-05T20:24:56.977Z · comments (3)
[question] Should I Divest from AI?
OKlogic · 2025-02-10T03:29:33.582Z · answers+comments (4)
AIS Berlin, events, opportunities and the flipped gameboard - Fieldbuilders Newsletter, February 2025
gergogaspar (gergo-gaspar) · 2025-02-17T14:16:31.834Z · comments (0)
Build a Metaculus Forecasting Bot in 30 Minutes: A Practical Guide
ChristianWilliams · 2025-02-22T03:52:14.753Z · comments (0)
[question] Alignment Paradox and a Request for Harsh Criticism
Bridgett Kay (bridgett-kay) · 2025-02-05T18:17:22.701Z · answers+comments (7)
Beyond ELO: Rethinking Chess Skill as a Multidimensional Random Variable
Oliver Oswald (oliver-oswald) · 2025-02-10T19:19:36.233Z · comments (7)
[link] Hello World
Charlie Sanders (charlie-sanders) · 2025-01-30T15:33:57.427Z · comments (0)
[link] AI Safety at the Frontier: Paper Highlights, January '25
gasteigerjo · 2025-02-11T16:14:16.972Z · comments (0)
Fun, endless art debates v. morally charged art debates that are intrinsically endless
danielechlin · 2025-02-21T04:44:22.712Z · comments (2)
Utilitarian AI Alignment: Building a Moral Assistant with the Constitutional AI Method
Clément L · 2025-02-04T04:15:36.917Z · comments (0)
Intelligence Is Jagged
Adam Train (aetrain) · 2025-02-19T07:08:46.444Z · comments (1)
[question] Does the ChatGPT (web)app sometimes show actual o1 CoTs now?
Sohaib Imran (sohaib-imran) · 2025-01-29T17:27:08.067Z · answers+comments (6)
[link] Neural Scaling Laws Rooted in the Data Distribution
aribrill (Particleman) · 2025-02-20T21:22:10.306Z · comments (0)
Positive jailbreaks in LLMs
dereshev · 2025-01-29T08:41:44.680Z · comments (0)
If you wanted to actually reduce the trade deficit, how would you do it?
Logan Zoellner (logan-zoellner) · 2025-01-26T18:04:54.702Z · comments (5)
Closed-ended questions aren't as hard as you think
electroswing · 2025-02-19T03:53:11.855Z · comments (0)
What new x- or s-risk fieldbuilding organisations would you like to see? An EOI form. (FBB #3)
gergogaspar (gergo-gaspar) · 2025-02-17T12:39:09.196Z · comments (0)
[link] Narratives as catalysts of catastrophic trajectories
EQ · 2025-01-26T19:01:21.558Z · comments (0)
Bimodal AI Beliefs
Adam Train (aetrain) · 2025-02-14T06:45:53.933Z · comments (1)
There are a lot of upcoming retreats/conferences between March and July (2025)
gergogaspar (gergo-gaspar) · 2025-02-18T09:30:30.258Z · comments (0)
Do No Harm? Navigating and Nudging AI Moral Choices
Sinem (sinem-erisken) · 2025-02-06T19:18:31.065Z · comments (0)
Towards a Science of Evals for Sycophancy
andrejfsantos · 2025-02-01T21:17:15.406Z · comments (0)
Blackpool Applied Rationality Unconference 2025
Henry Prowbell · 2025-02-01T14:09:44.673Z · comments (0)
Retroactive If-Then Commitments
MichaelDickens · 2025-02-01T22:22:43.031Z · comments (0)
Empirical Insights into Feature Geometry in Sparse Autoencoders
Jason Boxi Zhang (jason-boxi-zhang) · 2025-01-24T19:02:19.167Z · comments (0)
Jevon's paradox and economic intuitions
Abhimanyu Pallavi Sudhir (abhimanyu-pallavi-sudhir) · 2025-01-27T23:04:23.854Z · comments (0)
Superintelligence Alignment Proposal
Davey Morse (davey-morse) · 2025-02-03T18:47:22.287Z · comments (3)
An Introduction to Evidential Decision Theory
Babić · 2025-02-02T21:27:35.684Z · comments (2)
[link] Tetherware #1: The case for humanlike AI with free will
Jáchym Fibír · 2025-01-30T10:58:11.717Z · comments (10)
Are current LLMs safe for psychotherapy?
PaperBike · 2025-02-12T19:16:34.452Z · comments (4)
[link] Medical Windfall Prizes
PeterMcCluskey · 2025-02-06T23:33:27.263Z · comments (1)
[question] Popular materials about environmental goals/agent foundations? People wanting to discuss such topics?
Q Home · 2025-01-22T03:30:38.066Z · answers+comments (0)
Safe Distillation With a Powerful Untrusted AI
Alek Westover (alek-westover) · 2025-02-20T03:14:04.893Z · comments (1)
[link] Request for Information for a new US AI Action Plan (OSTP RFI)
agucova · 2025-02-07T20:40:36.034Z · comments (0)
[link] Sparse Autoencoder Features for Classifications and Transferability
Shan23Chen (shan-chen) · 2025-02-18T22:14:12.994Z · comments (0)
[link] Pre-ASI: The case for an enlightened mind, capital, and AI literacy in maximizing the good life
Noahh (noah-jackson) · 2025-02-21T00:03:47.922Z · comments (5)
[link] Linguistic Imperialism in AI: Enforcing Human-Readable Chain-of-Thought
Lukas Petersson (lukas-petersson-1) · 2025-02-21T15:45:00.146Z · comments (0)
Understanding Agent Preferences
martinkunev · 2025-02-24T17:46:04.022Z · comments (0)
The Dead Cradle Theory: Why Earth May Not Survive Humanity's Expansion into Space
Nicholas Andresen (nicholas-andresen) · 2025-01-22T17:43:48.950Z · comments (0)