LessWrong 2.0 Reader



Mysteries of mode collapse
janus · 2022-11-08T10:37:57.760Z · comments (56)
I Converted Book I of The Sequences Into A Zoomer-Readable Format
dkirmani · 2022-11-10T02:59:04.236Z · comments (31)
What it's like to dissect a cadaver
Alok Singh (OldManNick) · 2022-11-10T06:40:05.776Z · comments (23)
The Singular Value Decompositions of Transformer Weight Matrices are Highly Interpretable
beren · 2022-11-28T12:54:52.399Z · comments (33)
Tyranny of the Epistemic Majority
Scott Garrabrant · 2022-11-22T17:19:34.144Z · comments (13)
Conjecture: a retrospective after 8 months of work
Connor Leahy (NPCollapse) · 2022-11-23T17:10:23.510Z · comments (9)
Planes are still decades away from displacing most bird jobs
guzey · 2022-11-25T16:49:32.344Z · comments (13)
Geometric Rationality is Not VNM Rational
Scott Garrabrant · 2022-11-27T19:36:00.939Z · comments (26)
The Geometric Expectation
Scott Garrabrant · 2022-11-23T18:05:12.206Z · comments (19)
The Alignment Community Is Culturally Broken
sudo · 2022-11-13T18:53:55.054Z · comments (68)
Sadly, FTX
Zvi · 2022-11-17T14:30:03.068Z · comments (18)
AI will change the world, but won’t take it over by playing “3-dimensional chess”.
boazbarak · 2022-11-22T18:57:29.604Z · comments (98)
Mechanistic anomaly detection and ELK
paulfchristiano · 2022-11-25T18:50:04.447Z · comments (21)
On the Diplomacy AI
Zvi · 2022-11-28T13:20:00.884Z · comments (29)
Clarifying AI X-risk
zac_kenton (zkenton) · 2022-11-01T11:03:01.144Z · comments (24)
Geometric Exploration, Arithmetic Exploitation
Scott Garrabrant · 2022-11-24T15:36:30.334Z · comments (4)
Utilitarianism Meets Egalitarianism
Scott Garrabrant · 2022-11-21T19:00:12.168Z · comments (16)
Speculation on Current Opportunities for Unusually High Impact in Global Health
johnswentworth · 2022-11-11T20:47:03.367Z · comments (31)
What I Learned Running Refine
adamShimi · 2022-11-24T14:49:59.366Z · comments (5)
Applying superintelligence without collusion
Eric Drexler · 2022-11-08T18:08:31.733Z · comments (63)
How could we know that an AGI system will have good consequences?
So8res · 2022-11-07T22:42:27.395Z · comments (25)
Caution when interpreting Deepmind's In-context RL paper
Sam Marks (samuel-marks) · 2022-11-01T02:42:06.766Z · comments (6)
LW Beta Feature: Side-Comments
jimrandomh · 2022-11-24T01:55:31.578Z · comments (47)
LessWrong readers are invited to apply to the Lurkshop
Jonas V (Jonas Vollmer) · 2022-11-22T09:19:05.412Z · comments (41)
Instead of technical research, more people should focus on buying time
Akash (akash-wasil) · 2022-11-05T20:43:45.215Z · comments (45)
[link] ARC paper: Formalizing the presumption of independence
Erik Jenner (ejenner) · 2022-11-20T01:22:55.110Z · comments (2)
Instrumental convergence is what makes general intelligence possible
tailcalled · 2022-11-11T16:38:14.390Z · comments (11)
[link] Trying to Make a Treacherous Mesa-Optimizer
MadHatter · 2022-11-09T18:07:03.157Z · comments (14)
[link] Meta AI announces Cicero: Human-Level Diplomacy play (with dialogue)
Jacy Reese Anthis (Jacy Reese) · 2022-11-22T16:50:20.054Z · comments (64)
Conjecture Second Hiring Round
Connor Leahy (NPCollapse) · 2022-11-23T17:11:42.524Z · comments (0)
Searching for Search
NicholasKees (nick_kees) · 2022-11-28T15:31:49.974Z · comments (8)
Current themes in mechanistic interpretability research
Lee Sharkey (Lee_Sharkey) · 2022-11-16T14:14:02.030Z · comments (2)
By Default, GPTs Think In Plain Sight
Fabien Roger (Fabien) · 2022-11-19T19:15:29.591Z · comments (33)
Announcing the Progress Forum
jasoncrawford · 2022-11-17T19:26:29.584Z · comments (9)
When AI solves a game, focus on the game's mechanics, not its theme.
Cleo Nardo (strawberry calm) · 2022-11-23T19:16:07.333Z · comments (7)
[link] Results from the interpretability hackathon
Esben Kran (esben-kran) · 2022-11-17T14:51:44.568Z · comments (0)
Exams-Only Universities
Mati_Roy (MathieuRoy) · 2022-11-06T22:05:39.373Z · comments (40)
Always know where your abstractions break
lsusr · 2022-11-27T06:32:09.643Z · comments (6)
Disagreement with bio anchors that lead to shorter timelines
Marius Hobbhahn (marius-hobbhahn) · 2022-11-16T14:40:16.734Z · comments (17)
[link] Engineering Monosemanticity in Toy Models
Adam Jermyn (adam-jermyn) · 2022-11-18T01:43:38.623Z · comments (7)
Follow up to medical miracle
Elizabeth (pktechgirl) · 2022-11-04T18:00:01.858Z · comments (5)
Threat Model Literature Review
zac_kenton (zkenton) · 2022-11-01T11:03:22.610Z · comments (4)
[link] Will we run out of ML data? Evidence from projecting dataset size trends
Pablo Villalobos (pvs) · 2022-11-14T16:42:27.135Z · comments (12)
[link] Elastic Productivity Tools
Simon Berens (sberens) · 2022-11-19T21:59:39.913Z · comments (8)
[link] What is epigenetics?
Metacelsus · 2022-11-06T01:24:05.350Z · comments (4)
Respecting your Local Preferences
Scott Garrabrant · 2022-11-26T19:04:14.252Z · comments (1)
Takeaways from a survey on AI alignment resources
DanielFilan · 2022-11-05T23:40:01.917Z · comments (10)
Announcing AI Alignment Awards: $100k research contests about goal misgeneralization & corrigibility
Akash (akash-wasil) · 2022-11-22T22:19:09.419Z · comments (20)
Update to Mysteries of mode collapse: text-davinci-002 not RLHF
janus · 2022-11-19T23:51:27.510Z · comments (8)
K-types vs T-types — what priors do you have?
Cleo Nardo (strawberry calm) · 2022-11-03T11:29:00.809Z · comments (25)