LessWrong 2.0 Reader

[link] Why Did Elon Musk Just Offer to Buy Control of OpenAI for $100 Billion?
garrison · 2025-02-11T00:20:41.421Z · comments (8)
Murder plots are infohazards
Chris Monteiro (chris-topher) · 2025-02-13T19:15:09.749Z · comments (23)
[link] Research directions Open Phil wants to fund in technical AI safety
jake_mendel · 2025-02-08T01:40:00.968Z · comments (21)
The Paris AI Anti-Safety Summit
Zvi · 2025-02-12T14:00:07.383Z · comments (19)
Two hemispheres - I do not think it means what you think it means
Viliam · 2025-02-09T15:33:53.391Z · comments (16)
The News is Never Neglected
lsusr · 2025-02-11T14:59:48.323Z · comments (15)
[link] A computational no-coincidence principle
Eric Neyman (UnexpectedValues) · 2025-02-14T21:39:39.277Z · comments (4)
[link] A short course on AGI safety from the GDM Alignment team
Vika · 2025-02-14T15:43:50.903Z · comments (0)
Levels of Friction
Zvi · 2025-02-10T13:10:07.224Z · comments (0)
My model of what is going on with LLMs
Cole Wyeth (Amyr) · 2025-02-13T03:43:29.447Z · comments (35)
Ambiguous out-of-distribution generalization on an algorithmic task
Wilson Wu (wilson-wu) · 2025-02-13T18:24:36.160Z · comments (0)
The Mask Comes Off: A Trio of Tales
Zvi · 2025-02-14T15:30:15.372Z · comments (1)
[link] How do we solve the alignment problem?
Joe Carlsmith (joekc) · 2025-02-13T18:27:27.712Z · comments (8)
[link] Gary Marcus now saying AI can't do things it can already do
Benjamin_Todd · 2025-02-09T12:24:11.954Z · comments (7)
[link] Utility Engineering: Analyzing and Controlling Emergent Value Systems in AIs
Matrice Jacobine · 2025-02-12T09:15:07.793Z · comments (30)
On Deliberative Alignment
Zvi · 2025-02-11T13:00:07.683Z · comments (1)
"Think it Faster" worksheet
Raemon · 2025-02-08T22:02:27.697Z · comments (8)
≤10-year Timelines Remain Unlikely Despite DeepSeek and o3
Rafael Harth (sil-ver) · 2025-02-13T19:21:35.392Z · comments (38)
Skepticism towards claims about the views of powerful institutions
tlevin (trevor) · 2025-02-13T07:40:52.257Z · comments (2)
Not all capabilities will be created equal: focus on strategically superhuman agents
benwr · 2025-02-13T01:24:46.084Z · comments (3)
Virtue signaling, and the "humans-are-wonderful" bias, as a trust exercise
lc · 2025-02-13T06:59:17.525Z · comments (14)
Self-dialogue: Do behaviorist rewards make scheming AGIs?
Steven Byrnes (steve2152) · 2025-02-13T18:39:37.770Z · comments (0)
Extended analogy between humans, corporations, and AIs.
Daniel Kokotajlo (daniel-kokotajlo) · 2025-02-13T00:03:13.956Z · comments (1)
Proof idea: SLT to AIT
Lucius Bushnaq (Lblack) · 2025-02-10T23:14:24.538Z · comments (6)
Knocking Down My AI Optimist Strawman
tailcalled · 2025-02-08T10:52:33.183Z · comments (0)
[link] Hunting for AI Hackers: LLM Agent Honeypot
Reworr R (reworr-reworr) · 2025-02-12T20:29:32.269Z · comments (0)
Nonpartisan AI safety
Yair Halberstadt (yair-halberstadt) · 2025-02-10T14:55:50.913Z · comments (4)
Notes on Occam via Solomonoff vs. hierarchical Bayes
JesseClifton · 2025-02-10T17:55:14.689Z · comments (7)
Why you maybe should lift weights, and How to.
samusasuke · 2025-02-12T05:15:32.011Z · comments (29)
[link] Altman blog on post-AGI world
Julian Bradshaw · 2025-02-09T21:52:30.631Z · comments (10)
Towards building blocks of ontologies
Daniel C (harper-owen) · 2025-02-08T16:03:29.854Z · comments (0)
World Citizen Assembly about AI - Announcement
Camille Berger (Camille Berger) · 2025-02-11T10:51:56.948Z · comments (1)
Two flaws in the Machiavelli Benchmark
TheManxLoiner · 2025-02-12T19:34:35.241Z · comments (0)
Logical Correlation
niplav · 2025-02-10T23:29:10.518Z · comments (6)
What is a circuit? [in interpretability]
Yudhister Kumar (randomwalks) · 2025-02-14T04:40:42.978Z · comments (1)
Distilling the Internal Model Principle
JoseFaustino · 2025-02-08T14:59:29.730Z · comments (0)
Seven sources of goals in LLM agents
Seth Herd · 2025-02-08T21:54:20.186Z · comments (2)
[link] Notes on the Presidential Election of 1836
Arjun Panickssery (arjun-panickssery) · 2025-02-13T23:40:23.224Z · comments (0)
MATS Spring 2024 Extension Retrospective
HenningB (HenningBlue) · 2025-02-12T22:43:58.193Z · comments (0)
[link] What is it to solve the alignment problem?
Joe Carlsmith (joekc) · 2025-02-13T18:42:07.215Z · comments (4)
Wiki on Suspects in Lind, Zajko, and Maland Killings
Rebecca_Records · 2025-02-08T04:16:08.589Z · comments (4)
System 2 Alignment
Seth Herd · 2025-02-13T19:17:56.868Z · comments (0)
[link] Can Knowledge Hurt You? The Dangers of Infohazards (and Exfohazards)
aggliu · 2025-02-08T15:51:43.143Z · comments (0)
Celtic Knots on a hex lattice
Ben (ben-lang) · 2025-02-14T14:29:08.223Z · comments (5)
Less Laptop Velcro
jefftk (jkaufman) · 2025-02-09T03:30:03.403Z · comments (0)
[Job ad] LISA CEO
Ryan Kidd (ryankidd44) · 2025-02-09T00:18:35.254Z · comments (4)
[question] Should Open Philanthropy Make an Offer to Buy OpenAI?
mrtreasure · 2025-02-14T23:18:01.929Z · answers+comments (0)
Moral Hazard in Democratic Voting
lsusr · 2025-02-12T23:17:39.355Z · comments (8)
AI #103: Show Me the Money
Zvi · 2025-02-13T15:20:07.057Z · comments (8)
Detecting AI Agent Failure Modes in Simulations
Michael Soareverix (michael-soareverix) · 2025-02-11T11:10:26.030Z · comments (0)