LessWrong 2.0 Reader

View: New · Old · Top

Restrict date range: Today · This week · This month · Last three months · This year · All time

← previous page (newer posts) · next page (older posts) →

At 87, Pearl is still able to change his mind
rotatingpaguro · 2023-10-18T04:46:29.339Z · comments (15)
LoRA Fine-tuning Efficiently Undoes Safety Training from Llama 2-Chat 70B
Simon Lermen (dalasnoin) · 2023-10-12T19:58:02.119Z · comments (29)
[link] Vernor Vinge, who coined the term "Technological Singularity", dies at 79
Kaj_Sotala · 2024-03-21T22:14:14.699Z · comments (24)
On Devin
Zvi · 2024-03-18T13:20:04.779Z · comments (34)
Priors and Prejudice
MathiasKB (MathiasKirkBonde) · 2024-04-22T15:00:41.782Z · comments (31)
Discussion: Challenges with Unsupervised LLM Knowledge Discovery
Seb Farquhar · 2023-12-18T11:58:39.379Z · comments (21)
Liability regimes for AI
Ege Erdil (ege-erdil) · 2024-08-19T01:25:01.006Z · comments (34)
[link] Explore More: A Bag of Tricks to Keep Your Life on the Rails
Shoshannah Tekofsky (DarkSym) · 2024-09-28T21:38:52.256Z · comments (7)
Some (problematic) aesthetics of what constitutes good work in academia
Steven Byrnes (steve2152) · 2024-03-11T17:47:28.835Z · comments (12)
Does davidad's uploading moonshot work?
jacobjacob · 2023-11-03T02:21:51.720Z · comments (35)
The Plan - 2023 Version
johnswentworth · 2023-12-29T23:34:19.651Z · comments (39)
[link] Overcoming Bias Anthology
Arjun Panickssery (arjun-panickssery) · 2024-10-20T02:01:23.463Z · comments (13)
Leading The Parade
johnswentworth · 2024-01-31T22:39:56.499Z · comments (31)
[link] If you weren't such an idiot...
kave · 2024-03-02T00:01:37.314Z · comments (74)
Deep atheism and AI risk
Joe Carlsmith (joekc) · 2024-01-04T18:58:47.745Z · comments (22)
[link] Moral Reality Check (a short story)
jessicata (jessica.liu.taylor) · 2023-11-26T05:03:18.254Z · comments (44)
OpenAI o1
Zach Stein-Perlman · 2024-09-12T17:30:31.958Z · comments (41)
The Information: OpenAI shows 'Strawberry' to feds, races to launch it
Martín Soto (martinsq) · 2024-08-27T23:10:18.155Z · comments (15)
[link] Nursing doubts
dynomight · 2024-08-30T02:25:36.826Z · comments (20)
[link] That Alien Message - The Animation
Writer · 2024-09-07T14:53:30.604Z · comments (9)
LLMs for Alignment Research: a safety priority?
abramdemski · 2024-04-04T20:03:22.484Z · comments (24)
The Median Researcher Problem
johnswentworth · 2024-11-02T20:16:11.341Z · comments (32)
My motivation and theory of change for working in AI healthtech
Andrew_Critch · 2024-10-12T00:36:30.925Z · comments (35)
The Summoned Heroine's Prediction Markets Keep Providing Financial Services To The Demon King!
abstractapplic · 2024-10-26T12:34:51.059Z · comments (16)
[link] Stanislav Petrov Quarterly Performance Review
Ricki Heicklen (bayesshammai) · 2024-09-26T21:20:11.646Z · comments (3)
Value Claims (In Particular) Are Usually Bullshit
johnswentworth · 2024-05-30T06:26:21.151Z · comments (18)
[link] Arithmetic is an underrated world-modeling technology
dynomight · 2024-10-17T14:00:22.475Z · comments (32)
[link] The Checklist: What Succeeding at AI Safety Will Involve
Sam Bowman (sbowman) · 2024-09-03T18:18:34.230Z · comments (49)
AI Views Snapshots
Rob Bensinger (RobbBB) · 2023-12-13T00:45:50.016Z · comments (61)
Loudly Give Up, Don't Quietly Fade
Screwtape · 2023-11-13T23:30:25.308Z · comments (11)
Survey: How Do Elite Chinese Students Feel About the Risks of AI?
Nick Corvino (nick-corvino) · 2024-09-02T18:11:11.867Z · comments (13)
Momentum of Light in Glass
Ben (ben-lang) · 2024-10-09T20:19:42.088Z · comments (44)
[link] Decomposing Agency — capabilities without desires
owencb · 2024-07-11T09:38:48.509Z · comments (32)
My experience using financial commitments to overcome akrasia
William Howard (william-howard) · 2024-04-15T22:57:32.574Z · comments (31)
Graphical tensor notation for interpretability
Jordan Taylor (Nadroj) · 2023-10-04T08:04:33.341Z · comments (11)
[link] Fields that I reference when thinking about AI takeover prevention
Buck · 2024-08-13T23:08:54.950Z · comments (15)
0. CAST: Corrigibility as Singular Target
Max Harms (max-harms) · 2024-06-07T22:29:12.934Z · comments (12)
Comparing Anthropic's Dictionary Learning to Ours
Robert_AIZI · 2023-10-07T23:30:32.402Z · comments (8)
What good is G-factor if you're dumped in the woods? A field report from a camp counselor.
Hastings (hastings-greer) · 2024-01-12T13:17:23.829Z · comments (22)
Read the Roon
Zvi · 2024-03-05T13:50:04.967Z · comments (6)
EA orgs' legal structure inhibits risk taking and information sharing on the margin
Elizabeth (pktechgirl) · 2023-11-05T19:13:56.135Z · comments (17)
[Completed] The 2024 Petrov Day Scenario
Ben Pace (Benito) · 2024-09-26T08:08:32.495Z · comments (114)
The Worst Form Of Government (Except For Everything Else We've Tried)
johnswentworth · 2024-03-17T18:11:38.374Z · comments (46)
The Dark Arts
lsusr · 2023-12-19T04:41:13.356Z · comments (49)
When is a mind me?
Rob Bensinger (RobbBB) · 2024-04-17T05:56:38.482Z · comments (125)
How to (hopefully ethically) make money off of AGI
habryka (habryka4) · 2023-11-06T23:35:16.476Z · comments (81)
An Extremely Opinionated Annotated List of My Favourite Mechanistic Interpretability Papers v2
Neel Nanda (neel-nanda-1) · 2024-07-07T17:39:35.064Z · comments (15)
The 99% principle for personal problems
Kaj_Sotala · 2023-10-02T08:20:07.379Z · comments (20)
Integrity in AI Governance and Advocacy
habryka (habryka4) · 2023-11-03T19:52:33.180Z · comments (57)
How it All Went Down: The Puzzle Hunt that took us way, way Less Online
A* (agendra) · 2024-06-02T08:01:40.109Z · comments (5)
← previous page (newer posts) · next page (older posts) →