LessWrong 2.0 Reader

Safetywashing
Adam Scholl (adam_scholl) · 2022-07-01T11:56:33.495Z · comments (20)
Sexual Abuse attitudes might be infohazardous
Pseudonymous Otter · 2022-07-19T18:06:43.956Z · comments (71)
AI alignment is distinct from its near-term applications
paulfchristiano · 2022-12-13T07:10:04.407Z · comments (21)
Comment reply: my low-quality thoughts on why CFAR didn't get farther with a "real/efficacious art of rationality"
AnnaSalamon · 2022-06-09T02:12:35.151Z · comments (62)
New Scaling Laws for Large Language Models
1a3orn · 2022-04-01T20:41:17.665Z · comments (22)
How "Discovering Latent Knowledge in Language Models Without Supervision" Fits Into a Broader Alignment Scheme
Collin (collin-burns) · 2022-12-15T18:22:40.109Z · comments (39)
Jailbreaking ChatGPT on Release Day
Zvi · 2022-12-02T13:10:00.860Z · comments (77)
Common misconceptions about OpenAI
Jacob_Hilton · 2022-08-25T14:02:26.257Z · comments (142)
A Quick Guide to Confronting Doom
Ruby · 2022-04-13T19:30:48.580Z · comments (33)
The Plan - 2022 Update
johnswentworth · 2022-12-01T20:43:50.516Z · comments (37)
Slow motion videos as AI risk intuition pumps
Andrew_Critch · 2022-06-14T19:31:13.616Z · comments (41)
Contra Hofstadter on GPT-3 Nonsense
rictic · 2022-06-15T21:53:30.646Z · comments (24)
The shard theory of human values
Quintin Pope (quintin-pope) · 2022-09-04T04:28:11.752Z · comments (66)
Announcing Balsa Research
Zvi · 2022-09-25T22:50:00.626Z · comments (64)
An Observation of Vavilov Day
Elizabeth (pktechgirl) · 2022-01-03T21:10:02.107Z · comments (42)
Editing Advice for LessWrong Users
JustisMills · 2022-04-11T16:32:17.530Z · comments (14)
Introduction to abstract entropy
Alex_Altair · 2022-10-20T21:03:02.486Z · comments (78)
(briefly) RaDVaC and SMTM, two things we should be doing
Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2022-01-12T06:20:35.555Z · comments (79)
AGI Safety FAQ / all-dumb-questions-allowed thread
Aryeh Englander (alenglander) · 2022-06-07T05:47:13.350Z · comments (526)
Replacing Karma with Good Heart Tokens (Worth $1!)
Ben Pace (Benito) · 2022-04-01T09:31:34.332Z · comments (173)
What do ML researchers think about AI in 2022?
KatjaGrace · 2022-08-04T15:40:05.024Z · comments (33)
How I buy things when Lightcone wants them fast
jacobjacob · 2022-09-26T05:02:09.003Z · comments (21)
Lessons learned from talking to >100 academics about AI safety
Marius Hobbhahn (marius-hobbhahn) · 2022-10-10T13:16:38.036Z · comments (17)
Moses and the Class Struggle
lsusr · 2022-04-01T11:55:04.911Z · comments (26)
ProjectLawful.com: Eliezer's latest story, past 1M words
Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2022-05-11T06:18:02.738Z · comments (112)
Call For Distillers
johnswentworth · 2022-04-04T18:25:34.942Z · comments (43)
I Converted Book I of The Sequences Into A Zoomer-Readable Format
dkirmani · 2022-11-10T02:59:04.236Z · comments (31)
Unifying Bargaining Notions (1/2)
Diffractor · 2022-07-25T00:28:27.572Z · comments (41)
What it's like to dissect a cadaver
Alok Singh (OldManNick) · 2022-11-10T06:40:05.776Z · comments (23)
Visible Homelessness in SF: A Quick Breakdown of Causes
alyssavance · 2022-05-25T01:40:43.768Z · comments (32)
Benign Boundary Violations
[DEACTIVATED] Duncan Sabien (Duncan_Sabien) · 2022-05-26T06:48:35.585Z · comments (84)
A concrete bet offer to those with short AGI timelines
Matthew Barnett (matthew-barnett) · 2022-04-09T21:41:45.106Z · comments (116)
We Are Conjecture, A New Alignment Research Startup
Connor Leahy (NPCollapse) · 2022-04-08T11:40:13.727Z · comments (25)
Humans provide an untapped wealth of evidence about alignment
TurnTrout · 2022-07-14T02:31:48.575Z · comments (94)
Brain Efficiency: Much More than You Wanted to Know
jacob_cannell · 2022-01-06T03:38:00.320Z · comments (102)
The Singular Value Decompositions of Transformer Weight Matrices are Highly Interpretable
beren · 2022-11-28T12:54:52.399Z · comments (33)
Causal Scrubbing: a method for rigorously testing interpretability hypotheses [Redwood Research]
LawrenceC (LawChan) · 2022-12-03T00:58:36.973Z · comments (35)
What does it take to defend the world against out-of-control AGIs?
Steven Byrnes (steve2152) · 2022-10-25T14:47:41.970Z · comments (47)
[link] Connor Leahy on Dying with Dignity, EleutherAI and Conjecture
Michaël Trazzi (mtrazzi) · 2022-07-22T18:44:19.749Z · comments (29)
On saving one's world
Rob Bensinger (RobbBB) · 2022-05-17T19:53:58.192Z · comments (4)
A note about differential technological development
So8res · 2022-07-15T04:46:53.166Z · comments (32)
How my team at Lightcone sometimes gets stuff done
jacobjacob · 2022-09-19T05:47:06.787Z · comments (43)
Do a cost-benefit analysis of your technology usage
TurnTrout · 2022-03-27T23:09:26.753Z · comments (53)
Worlds Where Iterative Design Fails
johnswentworth · 2022-08-30T20:48:29.025Z · comments (30)
Tyranny of the Epistemic Majority
Scott Garrabrant · 2022-11-22T17:19:34.144Z · comments (13)
Conjecture: a retrospective after 8 months of work
Connor Leahy (NPCollapse) · 2022-11-23T17:10:23.510Z · comments (9)
Have You Tried Hiring People?
rank-biserial · 2022-03-02T02:06:39.656Z · comments (117)
dalle2 comments
nostalgebraist · 2022-04-26T05:30:07.748Z · comments (14)
Language models seem to be much better than humans at next-token prediction
Buck · 2022-08-11T17:45:41.294Z · comments (59)
Look For Principles Which Will Carry Over To The Next Paradigm
johnswentworth · 2022-01-14T20:22:58.606Z · comments (7)