LessWrong 2.0 Reader


← previous page (newer posts) · next page (older posts) →

[link] The Stag Hunt—cultivating cooperation to reap rewards
James Stephen Brown (james-brown) · 2025-02-25T23:45:07.472Z · comments (0)
[question] Sparks of Original Thought?
Annapurna (jorge-velez) · 2025-03-06T00:53:44.421Z · answers+comments (4)
Retroactive If-Then Commitments
MichaelDickens · 2025-02-01T22:22:43.031Z · comments (0)
[link] On AI Scaling
harsimony · 2025-02-05T20:24:56.977Z · comments (3)
Closed-ended questions aren't as hard as you think
electroswing · 2025-02-19T03:53:11.855Z · comments (0)
[question] Alignment Paradox and a Request for Harsh Criticism
Bridgett Kay (bridgett-kay) · 2025-02-05T18:17:22.701Z · answers+comments (7)
Bimodal AI Beliefs
Adam Train (aetrain) · 2025-02-14T06:45:53.933Z · comments (1)
[link] Recursive alignment with the principle of alignment
hive · 2025-02-27T02:34:37.940Z · comments (0)
Intelligence Is Jagged
Adam Train (aetrain) · 2025-02-19T07:08:46.444Z · comments (1)
Build a Metaculus Forecasting Bot in 30 Minutes: A Practical Guide
ChristianWilliams · 2025-02-22T03:52:14.753Z · comments (0)
One-dimensional vs multi-dimensional features in interpretability
charlieoneill (kingchucky211) · 2025-02-01T09:10:01.112Z · comments (0)
[link] Can a finite physical device be Turing equivalent?
Noosphere89 (sharmake-farah) · 2025-03-06T15:02:16.921Z · comments (10)
[question] shouldn't we try to get media attention?
KvmanThinking (avery-liu) · 2025-03-04T01:39:06.596Z · answers+comments (0)
Not-yet-falsifiable beliefs?
Benjamin Hendricks (benjamin-hendricks) · 2025-03-02T14:11:07.121Z · comments (4)
Beyond ELO: Rethinking Chess Skill as a Multidimensional Random Variable
Oliver Oswald (oliver-oswald) · 2025-02-10T19:19:36.233Z · comments (7)
Do No Harm? Navigating and Nudging AI Moral Choices
Sinem (sinem-erisken) · 2025-02-06T19:18:31.065Z · comments (0)
[question] Should I Divest from AI?
OKlogic · 2025-02-10T03:29:33.582Z · answers+comments (4)
[question] p(s-risks to contemporary humans)?
mhampton · 2025-02-08T21:19:53.821Z · answers+comments (5)
[question] Does human (mis)alignment pose a significant and imminent existential threat?
jr · 2025-02-23T10:03:40.269Z · answers+comments (3)
What new x- or s-risk fieldbuilding organisations would you like to see? An EOI form. (FBB #3)
gergogaspar (gergo-gaspar) · 2025-02-17T12:39:09.196Z · comments (0)
[link] AISN #49: Superintelligence Strategy
Corin Katzke (corin-katzke) · 2025-03-06T17:46:50.965Z · comments (1)
AIS Berlin, events, opportunities and the flipped gameboard - Fieldbuilders Newsletter, February 2025
gergogaspar (gergo-gaspar) · 2025-02-17T14:16:31.834Z · comments (0)
Fun, endless art debates v. morally charged art debates that are intrinsically endless
danielechlin · 2025-02-21T04:44:22.712Z · comments (2)
Blackpool Applied Rationality Unconference 2025
Henry Prowbell · 2025-02-01T14:09:44.673Z · comments (0)
[question] Name for Standard AI Caveat?
yrimon (yehuda-rimon) · 2025-02-26T07:07:16.523Z · answers+comments (5)
[link] AI Safety at the Frontier: Paper Highlights, January '25
gasteigerjo · 2025-02-11T16:14:16.972Z · comments (0)
Towards a Science of Evals for Sycophancy
andrejfsantos · 2025-02-01T21:17:15.406Z · comments (0)
There are a lot of upcoming retreats/conferences between March and July (2025)
gergogaspar (gergo-gaspar) · 2025-02-18T09:30:30.258Z · comments (0)
Have you actually tried raising the birth rate?
Yair Halberstadt (yair-halberstadt) · 2025-03-10T18:06:40.987Z · comments (5)
[link] Neural Scaling Laws Rooted in the Data Distribution
aribrill (Particleman) · 2025-02-20T21:22:10.306Z · comments (0)
Utilitarian AI Alignment: Building a Moral Assistant with the Constitutional AI Method
Clément L · 2025-02-04T04:15:36.917Z · comments (1)
[link] Social Dilemmas — public goods, free riders, and exploitation
James Stephen Brown (james-brown) · 2025-03-05T23:31:17.512Z · comments (0)
Arguing for the Truth? An Inference-Only Study into AI Debate
denisemester · 2025-02-11T03:04:58.852Z · comments (0)
The chessboard world
phdead · 2025-03-10T01:26:16.304Z · comments (0)
[link] Medical Windfall Prizes
PeterMcCluskey · 2025-02-06T23:33:27.263Z · comments (1)
Positional kernels of attention heads
Alex Gibson · 2025-03-10T23:17:25.068Z · comments (0)
Are current LLMs safe for psychotherapy?
PaperBike · 2025-02-12T19:16:34.452Z · comments (4)
Stress exists only where the Mind makes it
Noahh (noah-jackson) · 2025-03-10T19:44:42.887Z · comments (2)
Superintelligence Alignment Proposal
Davey Morse (davey-morse) · 2025-02-03T18:47:22.287Z · comments (3)
[question] How much do frontier LLMs code and browse while in training?
Joe Rogero · 2025-03-10T19:34:23.950Z · answers+comments (0)
Understanding Agent Preferences
martinkunev · 2025-02-24T17:46:04.022Z · comments (0)
Cross-Layer Feature Alignment and Steering in Large Language Model
dlaptev · 2025-02-08T20:18:20.331Z · comments (0)
Kairos is hiring a Head of Operations/Founding Generalist
agucova · 2025-03-12T20:58:49.661Z · comments (0)
Existentialists and Trolleys
David Gross (David_Gross) · 2025-02-28T14:01:49.509Z · comments (3)
[link] Linguistic Imperialism in AI: Enforcing Human-Readable Chain-of-Thought
Lukas Petersson (lukas-petersson-1) · 2025-02-21T15:45:00.146Z · comments (0)
[link] Sparse Autoencoder Features for Classifications and Transferability
Shan23Chen (shan-chen) · 2025-02-18T22:14:12.994Z · comments (0)
[link] (Anti)Aging 101
George3d6 · 2025-03-12T03:59:21.859Z · comments (2)
Claude 3.5 Sonnet (New)'s AGI scenario
Nathan Young · 2025-02-17T18:47:04.669Z · comments (2)
[link] How Language Models Understand Nullability
Anish Tondwalkar (anish-tondwalkar) · 2025-03-11T15:57:28.686Z · comments (0)
An Introduction to Evidential Decision Theory
Babić · 2025-02-02T21:27:35.684Z · comments (2)