LessWrong 2.0 Reader

View: New · Old · Top

Restrict date range: Today · This week · This month · Last three months · This year · All time

← previous page (newer posts) · next page (older posts) →

Mechanistically Eliciting Latent Behaviors in Language Models
Andrew Mack (andrew-mack) · 2024-04-30T18:51:13.493Z · comments (36)
2023 Survey Results
Screwtape · 2024-02-16T22:24:28.132Z · comments (26)
[link] Vernor Vinge, who coined the term "Technological Singularity", dies at 79
Kaj_Sotala · 2024-03-21T22:14:14.699Z · comments (24)
On Devin
Zvi · 2024-03-18T13:20:04.779Z · comments (30)
[link] Masterpiece
Richard_Ngo (ricraz) · 2024-02-13T23:10:35.376Z · comments (20)
Raising children on the eve of AI
juliawise · 2024-02-15T21:28:07.737Z · comments (15)
On Not Pulling The Ladder Up Behind You
Screwtape · 2024-04-26T21:58:29.455Z · comments (14)
Some (problematic) aesthetics of what constitutes good work in academia
Steven Byrnes (steve2152) · 2024-03-11T17:47:28.835Z · comments (12)
The Plan - 2023 Version
johnswentworth · 2023-12-29T23:34:19.651Z · comments (39)
Ironing Out the Squiggles
Zack_M_Davis · 2024-04-29T16:13:00.371Z · comments (34)
Tips for Empirical Alignment Research
Ethan Perez (ethan-perez) · 2024-02-29T06:04:54.481Z · comments (4)
[link] Daniel Dennett has died (1942-2024)
kave · 2024-04-19T16:17:04.742Z · comments (5)
Leading The Parade
johnswentworth · 2024-01-31T22:39:56.499Z · comments (30)
[link] Using axis lines for good or evil
dynomight · 2024-03-06T14:47:10.989Z · comments (39)
The Worst Form Of Government (Except For Everything Else We've Tried)
johnswentworth · 2024-03-17T18:11:38.374Z · comments (46)
And All the Shoggoths Merely Players
Zack_M_Davis · 2024-02-10T19:56:59.513Z · comments (57)
LLMs for Alignment Research: a safety priority?
abramdemski · 2024-04-04T20:03:22.484Z · comments (24)
What good is G-factor if you're dumped in the woods? A field report from a camp counselor.
Hastings (hastings-greer) · 2024-01-12T13:17:23.829Z · comments (22)
Deep atheism and AI risk
Joe Carlsmith (joekc) · 2024-01-04T18:58:47.745Z · comments (22)
Read the Roon
Zvi · 2024-03-05T13:50:04.967Z · comments (6)
Processor clock speeds are not how fast AIs think
Ege Erdil (ege-erdil) · 2024-01-29T14:39:38.050Z · comments (55)
The case for training frontier AIs on Sumerian-only corpus
Alexandre Variengien (alexandre-variengien) · 2024-01-15T16:40:22.011Z · comments (14)
Notice When People Are Directionally Correct
Chris_Leong · 2024-01-14T14:12:37.090Z · comments (7)
An even deeper atheism
Joe Carlsmith (joekc) · 2024-01-11T17:28:31.843Z · comments (47)
Updatelessness doesn't solve most problems
Martín Soto (martinsq) · 2024-02-08T17:30:11.266Z · comments (43)
Community Notes by X
NicholasKees (nick_kees) · 2024-03-18T17:13:33.195Z · comments (15)
My experience using financial commitments to overcome akrasia
William Howard (william-howard) · 2024-04-15T22:57:32.574Z · comments (31)
A Shutdown Problem Proposal
johnswentworth · 2024-01-21T18:12:48.664Z · comments (61)
Things I've Grieved
Raemon · 2024-02-18T19:32:47.169Z · comments (6)
[link] Steering Llama-2 with contrastive activation additions
Nina Rimsky (NinaR) · 2024-01-02T00:47:04.621Z · comments (29)
[link] If you weren't such an idiot...
kave · 2024-03-02T00:01:37.314Z · comments (60)
RTFB: On the New Proposed CAIP AI Bill
Zvi · 2024-04-10T18:30:08.410Z · comments (14)
Why I take short timelines seriously
NicholasKees (nick_kees) · 2024-01-28T22:27:21.098Z · comments (29)
[link] Simple probes can catch sleeper agents
Monte M (montemac) · 2024-04-23T21:10:47.784Z · comments (15)
My simple AGI investment & insurance strategy
lc · 2024-03-31T02:51:53.479Z · comments (16)
[link] Anthropic release Claude 3, claims >GPT-4 Performance
LawrenceC (LawChan) · 2024-03-04T18:23:54.065Z · comments (40)
AI Alignment Metastrategy
Vanessa Kosoy (vanessa-kosoy) · 2023-12-31T12:06:11.433Z · comments (12)
Four visions of Transformative AI success
Steven Byrnes (steve2152) · 2024-01-17T20:45:46.976Z · comments (22)
Rationality Research Report: Towards 10x OODA Looping?
Raemon · 2024-02-24T21:06:38.703Z · comments (21)
The Parable Of The Fallen Pendulum - Part 1
johnswentworth · 2024-03-01T00:25:00.111Z · comments (32)
[link] Practically A Book Review: Appendix to "Nonlinear's Evidence: Debunking False and Misleading Claims" (ThingOfThings)
tailcalled · 2024-01-03T17:07:13.990Z · comments (25)
[link] Gender Exploration
sapphire (deluks917) · 2024-01-14T18:57:32.893Z · comments (23)
Social status part 1/2: negotiations over object-level preferences
Steven Byrnes (steve2152) · 2024-03-05T16:29:07.143Z · comments (15)
Being nicer than Clippy
Joe Carlsmith (joekc) · 2024-01-16T19:44:23.893Z · comments (23)
Attitudes about Applied Rationality
Camille Berger (Camille Berger) · 2024-02-03T14:42:22.770Z · comments (18)
The Pareto Best and the Curse of Doom
Screwtape · 2024-02-21T23:10:01.359Z · comments (22)
' petertodd'’s last stand: The final days of open GPT-3 research
mwatkins · 2024-01-22T18:47:00.710Z · comments (15)
A Selection of Randomly Selected SAE Features
CallumMcDougall (TheMcDouglas) · 2024-04-01T09:09:49.235Z · comments (2)
The case for more ambitious language model evals
Jozdien · 2024-01-30T00:01:13.876Z · comments (25)
New LessWrong review winner UI ("The LeastWrong" section and full-art post pages)
kave · 2024-02-28T02:42:05.801Z · comments (63)
← previous page (newer posts) · next page (older posts) →