LessWrong 2.0 Reader

Chris Olah’s views on AGI safety
evhub · 2019-11-01T20:13:35.210Z · comments (38)
Worlds Where Iterative Design Fails
johnswentworth · 2022-08-30T20:48:29.025Z · comments (30)
The Treacherous Path to Rationality
Jacob Falkovich (Jacobian) · 2020-10-09T15:34:17.490Z · comments (116)
How To Go From Interpretability To Alignment: Just Retarget The Search
johnswentworth · 2022-08-10T16:08:11.402Z · comments (34)
Giant (In)scrutable Matrices: (Maybe) the Best of All Possible Worlds
1a3orn · 2023-04-04T17:39:39.720Z · comments (38)
[link] What TMS is like
Sable · 2024-10-31T00:44:22.612Z · comments (23)
Mechanistically Eliciting Latent Behaviors in Language Models
Andrew Mack (andrew-mack) · 2024-04-30T18:51:13.493Z · comments (42)
Hiring engineers and researchers to help align GPT-3
paulfchristiano · 2020-10-01T18:54:23.551Z · comments (13)
Common knowledge about Leverage Research 1.0
BayAreaHuman · 2021-09-24T06:56:14.729Z · comments (212)
Evolution provides no evidence for the sharp left turn
Quintin Pope (quintin-pope) · 2023-04-11T18:43:07.776Z · comments (65)
My current LK99 questions
Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2023-08-01T22:48:00.733Z · comments (38)
UDT shows that decision theory is more puzzling than ever
Wei Dai (Wei_Dai) · 2023-09-13T12:26:09.739Z · comments (55)
Call For Distillers
johnswentworth · 2022-04-04T18:25:34.942Z · comments (43)
Yudkowsky and Christiano discuss "Takeoff Speeds"
Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2021-11-22T19:35:27.657Z · comments (176)
Funny Anecdote of Eliezer From His Sister
Noah Birnbaum (daniel-birnbaum) · 2024-04-22T22:05:31.886Z · comments (6)
Lightcone Infrastructure/LessWrong is looking for funding
habryka (habryka4) · 2023-06-14T04:45:53.425Z · comments (39)
Causal Scrubbing: a method for rigorously testing interpretability hypotheses [Redwood Research]
LawrenceC (LawChan) · 2022-12-03T00:58:36.973Z · comments (35)
Labs should be explicit about why they are building AGI
peterbarnett · 2023-10-17T21:09:20.711Z · comments (18)
Some AI research areas and their relevance to existential safety
Andrew_Critch · 2020-11-19T03:18:22.741Z · comments (37)
OpenAI: Fallout
Zvi · 2024-05-28T13:20:04.325Z · comments (25)
Frontier Models are Capable of In-context Scheming
Marius Hobbhahn (marius-hobbhahn) · 2024-12-05T22:11:17.320Z · comments (23)
[link] Jaan Tallinn's 2023 Philanthropy Overview
jaan · 2024-05-20T12:11:39.416Z · comments (5)
Toward A Mathematical Framework for Computation in Superposition
Dmitry Vaintrob (dmitry-vaintrob) · 2024-01-18T21:06:57.040Z · comments (18)
Benign Boundary Violations
Duncan Sabien (Deactivated) (Duncan_Sabien) · 2022-05-26T06:48:35.585Z · comments (84)
The Sun is big, but superintelligences will not spare Earth a little sunlight
Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2024-09-23T03:39:16.243Z · comments (141)
If interpretability research goes well, it may get dangerous
So8res · 2023-04-03T21:48:18.752Z · comments (11)
Pay Risk Evaluators in Cash, Not Equity
Adam Scholl (adam_scholl) · 2024-09-07T02:37:59.659Z · comments (19)
Long Covid Is Not Necessarily Your Biggest Problem
Elizabeth (pktechgirl) · 2021-09-01T07:20:05.374Z · comments (40)
Communications in Hard Mode (My new job at MIRI)
tanagrabeast · 2024-12-13T20:13:44.825Z · comments (25)
Consciousness as a conflationary alliance term for intrinsically valued internal experiences
Andrew_Critch · 2023-07-10T08:09:48.881Z · comments (53)
Making a conservative case for alignment
Cameron Berg (cameron-berg) · 2024-11-15T18:55:40.864Z · comments (68)
The Hopium Wars: the AGI Entente Delusion
Max Tegmark (MaxTegmark) · 2024-10-13T17:00:29.033Z · comments (55)
What does it take to defend the world against out-of-control AGIs?
Steven Byrnes (steve2152) · 2022-10-25T14:47:41.970Z · comments (47)
I Converted Book I of The Sequences Into A Zoomer-Readable Format
dkirmani · 2022-11-10T02:59:04.236Z · comments (32)
Simulacra Levels and their Interactions
Zvi · 2020-06-15T13:10:00.717Z · comments (51)
We're Not Ready: thoughts on "pausing" and responsible scaling policies
HoldenKarnofsky · 2023-10-27T15:19:33.757Z · comments (33)
What's So Bad About Ad-Hoc Mathematical Definitions?
johnswentworth · 2021-03-15T21:51:53.242Z · comments (58)
A concrete bet offer to those with short AGI timelines
Matthew Barnett (matthew-barnett) · 2022-04-09T21:41:45.106Z · comments (120)
The Singular Value Decompositions of Transformer Weight Matrices are Highly Interpretable
beren · 2022-11-28T12:54:52.399Z · comments (33)
Maybe Anthropic's Long-Term Benefit Trust is powerless
Zach Stein-Perlman · 2024-05-27T13:00:47.991Z · comments (21)
Optimistic Assumptions, Longterm Planning, and "Cope"
Raemon · 2024-07-17T22:14:24.090Z · comments (46)
[link] Why haven't we celebrated any major achievements lately?
jasoncrawford · 2020-08-17T20:34:20.084Z · comments (69)
Almost everyone should be less afraid of lawsuits
alyssavance · 2021-11-27T02:06:52.176Z · comments (18)
[link] Sam Altman’s Chip Ambitions Undercut OpenAI’s Safety Strategy
garrison · 2024-02-10T19:52:55.191Z · comments (52)
Shoulder Advisors 101
Duncan Sabien (Deactivated) (Duncan_Sabien) · 2021-10-09T05:30:57.372Z · comments (124)
Do a cost-benefit analysis of your technology usage
TurnTrout · 2022-03-27T23:09:26.753Z · comments (53)
GPT-4 Plugs In
Zvi · 2023-03-27T12:10:00.926Z · comments (47)
Brain Efficiency: Much More than You Wanted to Know
jacob_cannell · 2022-01-06T03:38:00.320Z · comments (102)
Attempted Gears Analysis of AGI Intervention Discussion With Eliezer
Zvi · 2021-11-15T03:50:01.141Z · comments (49)
Seeing the Smoke
Jacob Falkovich (Jacobian) · 2020-02-28T18:26:58.839Z · comments (29)