LessWrong 2.0 Reader


← previous page (newer posts) · next page (older posts) →

[link] Wisdom Cannot Be Unzipped
Sable · 2022-10-22T00:28:25.476Z · comments (17)
SERI MATS Program - Winter 2022 Cohort
Ryan Kidd (ryankidd44) · 2022-10-08T19:09:53.231Z · comments (12)
Maximal Lottery-Lotteries
Scott Garrabrant · 2022-10-17T20:39:20.143Z · comments (15)
[link] An Extremely Opinionated Annotated List of My Favourite Mechanistic Interpretability Papers
Neel Nanda (neel-nanda-1) · 2022-10-18T21:08:33.033Z · comments (5)
Signals of war in August 2021
yieldthought · 2022-10-26T08:11:12.847Z · comments (16)
New book on s-risks
Tobias_Baumann · 2022-10-28T09:36:57.642Z · comments (1)
QAPR 4: Inductive biases
Quintin Pope (quintin-pope) · 2022-10-10T22:08:52.069Z · comments (2)
The Balto/Togo theory of scientific development
Elizabeth (pktechgirl) · 2022-10-09T18:30:07.452Z · comments (5)
Empowerment is (almost) All We Need
jacob_cannell · 2022-10-23T21:48:55.439Z · comments (44)
Possible miracles
Akash (akash-wasil) · 2022-10-09T18:17:01.470Z · comments (33)
The harms you don't see
ViktoriaMalyasova · 2022-10-16T23:45:31.113Z · comments (54)
[link] my current outlook on AI risk mitigation
Tamsin Leake (carado-1) · 2022-10-03T20:06:48.995Z · comments (6)
[link] A Barebones Guide to Mechanistic Interpretability Prerequisites
Neel Nanda (neel-nanda-1) · 2022-10-24T20:45:27.938Z · comments (12)
The optimal timing of spending on AGI safety work; why we should probably be spending more now
Tristan Cook · 2022-10-24T17:42:05.865Z · comments (0)
Beyond Kolmogorov and Shannon
Alexander Gietelink Oldenziel (alexander-gietelink-oldenziel) · 2022-10-25T15:13:56.484Z · comments (17)
Clarifying Your Principles
Raemon · 2022-10-01T21:20:33.474Z · comments (10)
[link] Calibrate - New Chrome Extension for hiding numbers so you can guess
chanamessinger (cmessinger) · 2022-10-07T11:21:51.711Z · comments (16)
How Risky Is Trick-or-Treating?
jefftk (jkaufman) · 2022-10-27T14:10:03.440Z · comments (18)
Calibration of a thousand predictions
KatjaGrace · 2022-10-12T08:50:16.768Z · comments (7)
Notes on "Can you control the past"
So8res · 2022-10-20T03:41:43.566Z · comments (41)
aisafety.community - A living document of AI safety communities
zeshen · 2022-10-28T17:50:12.535Z · comments (23)
Covid 10/20/22: Wait, We Did WHAT?
Zvi · 2022-10-20T21:50:00.878Z · comments (16)
Anonymous advice: If you want to reduce AI risk, should you take roles that advance AI capabilities?
Benjamin Hilton (80000hours) · 2022-10-11T14:16:02.550Z · comments (9)
Looping
Jarred Filmer (4thWayWastrel) · 2022-10-05T01:47:59.257Z · comments (6)
[link] More examples of goal misgeneralization
Rohin Shah (rohinmshah) · 2022-10-07T14:38:00.288Z · comments (8)
Weekly Non-Covid News #1 (10/13/22)
Zvi · 2022-10-13T15:40:01.523Z · comments (16)
Towards a comprehensive study of potential psychological causes of the ordinary range of variation of affective gender identity in males
tailcalled · 2022-10-12T21:10:46.440Z · comments (4)
[link] A Walkthrough of A Mathematical Framework for Transformer Circuits
Neel Nanda (neel-nanda-1) · 2022-10-25T20:24:54.638Z · comments (7)
[link] Paper: Large Language Models Can Self-improve [Linkpost]
Evan R. Murphy · 2022-10-02T01:29:00.181Z · comments (14)
Smoke without fire is scary
Adam Jermyn (adam-jermyn) · 2022-10-04T21:08:33.108Z · comments (22)
[link] They gave LLMs access to physics simulators
ryan_b · 2022-10-17T21:21:56.904Z · comments (18)
Help out Redwood Research’s interpretability team by finding heuristics implemented by GPT-2 small
Haoxing Du (haoxing-du) · 2022-10-12T21:25:00.459Z · comments (11)
Why I think nuclear war triggered by Russian tactical nukes in Ukraine is unlikely
Dave Orr (dave-orr) · 2022-10-11T18:30:08.110Z · comments (7)
Good ontologies induce commutative diagrams
Erik Jenner (ejenner) · 2022-10-09T00:06:19.911Z · comments (5)
Humans aren't fitness maximizers
So8res · 2022-10-04T01:31:47.566Z · comments (46)
We can do better than argmax
Jan_Kulveit · 2022-10-10T10:32:02.788Z · comments (4)
Is GPT-N bounded by human capabilities? No.
Cleo Nardo (strawberry calm) · 2022-10-17T23:26:43.981Z · comments (8)
Prettified AI Safety Game Cards
abramdemski · 2022-10-11T19:35:18.991Z · comments (6)
[question] What sorts of preparations ought I do in case of further escalation in Ukraine?
tailcalled · 2022-10-01T16:44:58.046Z · answers+comments (7)
Are c-sections underrated?
braces · 2022-10-01T20:32:19.127Z · comments (15)
A common failure for foxes
Rob Bensinger (RobbBB) · 2022-10-14T22:50:59.614Z · comments (7)
[link] How to Take Over the Universe (in Three Easy Steps)
Writer · 2022-10-18T15:04:27.420Z · comments (17)
[link] Paper+Summary: OMNIGROK: GROKKING BEYOND ALGORITHMIC DATA
Marius Hobbhahn (marius-hobbhahn) · 2022-10-04T07:22:14.975Z · comments (11)
Apollo
Jarred Filmer (4thWayWastrel) · 2022-10-10T21:30:44.496Z · comments (0)
[link] A review of the Bio-Anchors report
jylin04 · 2022-10-03T10:27:58.259Z · comments (4)
Four usages of "loss" in AI
TurnTrout · 2022-10-02T00:52:35.959Z · comments (18)
A conversation about Katja's counterarguments to AI risk
Matthew Barnett (matthew-barnett) · 2022-10-18T18:40:36.543Z · comments (9)
Trigger-based rapid checklists
VipulNaik · 2022-10-26T04:05:13.036Z · comments (0)
Recall and Regurgitation in GPT2
Megan Kinniment (megan-kinniment) · 2022-10-03T19:35:23.361Z · comments (1)
Furry Rationalists & Effective Anthropomorphism both exist
agentydragon · 2022-10-25T03:37:57.213Z · comments (3)