LessWrong 2.0 Reader

View: New · Old · Top

Restrict date range: Today · This week · This month · Last three months · This year · All time

← previous page (newer posts) · next page (older posts) →

Have Attention Spans Been Declining?
niplav · 2023-09-08T14:11:55.224Z · comments (21)
[link] Michael Nielsen's "Notes on Existential Risk from Artificial Superintelligence"
Joel Becker (joel-becker) · 2023-09-19T13:31:02.298Z · comments (12)
If influence functions are not approximating leave-one-out, how are they supposed to help?
Fabien Roger (Fabien) · 2023-09-22T14:23:45.847Z · comments (4)
List of how people have become more hard-working
Chi Nguyen · 2023-09-29T11:30:38.802Z · comments (7)
AI #29: Take a Deep Breath
Zvi · 2023-09-14T12:00:03.818Z · comments (21)
[link] Can I take ducks home from the park?
dynomight · 2023-09-14T21:03:09.534Z · comments (8)
a rant on politician-engineer coalitional conflict
bhauth · 2023-09-04T17:15:25.765Z · comments (12)
Interpretability Externalities Case Study - Hungry Hungry Hippos
Magdalena Wache · 2023-09-20T14:42:44.371Z · comments (22)
Petrov Day Retrospective, 2023 (re: the most important virtue of Petrov Day & unilaterally promoting it)
Ruby · 2023-09-28T02:48:58.994Z · comments (73)
Eugenics Performed By A Blind, Idiot God
omnizoid · 2023-09-17T20:37:13.650Z · comments (10)
[link] GPT-4 for personal productivity: online distraction blocker
Sergii (sergey-kharagorgiev) · 2023-09-26T17:41:31.031Z · comments (11)
Instrumental Convergence Bounty
Logan Zoellner (logan-zoellner) · 2023-09-14T14:02:32.989Z · comments (24)
[link] Linkpost for Jan Leike on Self-Exfiltration
Daniel Kokotajlo (daniel-kokotajlo) · 2023-09-13T21:23:09.239Z · comments (1)
[link] Understanding strategic deception and deceptive alignment
Marius Hobbhahn (marius-hobbhahn) · 2023-09-25T16:27:47.357Z · comments (16)
[link] Image Hijacks: Adversarial Images can Control Generative Models at Runtime
Scott Emmons · 2023-09-20T15:23:48.898Z · comments (9)
Bids To Defer On Value Judgements
johnswentworth · 2023-09-29T17:07:25.834Z · comments (6)
Who Has the Best Food?
Zvi · 2023-09-05T13:40:07.593Z · comments (61)
AI #28: Watching and Waiting
Zvi · 2023-09-07T17:20:10.559Z · comments (14)
Some reasons why I frequently prefer communicating via text
Adam Zerner (adamzerner) · 2023-09-18T21:50:48.620Z · comments (18)
Protest against Meta's irreversible proliferation (Sept 29, San Francisco)
Holly_Elmore · 2023-09-19T23:40:30.202Z · comments (33)
Is AI Safety dropping the ball on privacy?
markov (markovial) · 2023-09-13T13:07:24.358Z · comments (17)
Basic Mathematics of Predictive Coding
Adam Shai (adam-shai) · 2023-09-29T14:38:28.517Z · comments (6)
Fund Transit With Development
jefftk (jkaufman) · 2023-09-22T11:10:05.645Z · comments (22)
Three ways interpretability could be impactful
Arthur Conmy (arthur-conmy) · 2023-09-18T01:02:30.529Z · comments (8)
Competitive, Cooperative, and Cohabitive
Screwtape · 2023-09-28T23:25:52.723Z · comments (12)
Telopheme, telophore, and telotect
TsviBT · 2023-09-17T16:24:03.365Z · comments (7)
The goal of physics
Jim Pivarski (jim-pivarski) · 2023-09-02T23:08:02.125Z · comments (4)
[link] Immortality or death by AGI
ImmortalityOrDeathByAGI · 2023-09-21T23:59:59.545Z · comments (30)
[question] Where might I direct promising-to-me researchers to apply for alignment jobs/grants?
abramdemski · 2023-09-18T16:20:03.452Z · answers+comments (10)
[link] The point of a game is not to win, and you shouldn't even pretend that it is
mako yass (MakoYass) · 2023-09-28T15:54:27.990Z · comments (27)
Commonsense Good, Creative Good
jefftk (jkaufman) · 2023-09-27T19:50:07.486Z · comments (11)
[link] Amazon to invest up to $4 billion in Anthropic
Davis_Kingsley · 2023-09-25T14:55:35.983Z · comments (8)
Feedback-loops, Deliberate Practice, and Transfer Learning
jacobjacob · 2023-09-07T01:57:33.066Z · comments (5)
[link] Jacob on the Precipice
Richard_Ngo (ricraz) · 2023-09-26T21:16:39.590Z · comments (8)
Recreating the caring drive
Catnee (Dmitry Savishchev) · 2023-09-07T10:41:16.453Z · comments (14)
Sparse Coding, for Mechanistic Interpretability and Activation Engineering
David Udell · 2023-09-23T19:16:31.772Z · comments (7)
Focus on the Hardest Part First
Johannes C. Mayer (johannes-c-mayer) · 2023-09-11T07:53:33.188Z · comments (13)
[question] Strongest real-world examples supporting AI risk claims?
rosehadshar · 2023-09-05T15:12:11.307Z · answers+comments (7)
Technical AI Safety Research Landscape [Slides]
Magdalena Wache · 2023-09-18T13:56:04.418Z · comments (0)
What is the optimal frontier for due diligence?
RobertM (T3t) · 2023-09-08T18:20:03.300Z · comments (1)
Luck based medicine: inositol for anxiety and brain fog
Elizabeth (pktechgirl) · 2023-09-22T20:10:07.117Z · comments (5)
[link] ARC Evals: Responsible Scaling Policies
Zach Stein-Perlman · 2023-09-28T04:30:37.140Z · comments (9)
Deconfusing Regret
Alex Hollow · 2023-09-15T11:52:03.294Z · comments (32)
Reflexive decision theory is an unsolved problem
Richard_Kennaway · 2023-09-17T14:15:09.222Z · comments (27)
Debate series: should we push for a pause on the development of AI?
Xodarap · 2023-09-08T16:29:51.367Z · comments (1)
Startup Roundup #1: Happy Demo Day
Zvi · 2023-09-12T13:20:03.883Z · comments (5)
Actually, "personal attacks after object-level arguments" is a pretty good rule of epistemic conduct
Max H (Maxc) · 2023-09-17T20:25:01.237Z · comments (15)
I designed an AI safety course (for a philosophy department)
Eleni Angelou (ea-1) · 2023-09-23T22:03:00.036Z · comments (15)
[link] Alignment Workshop talks
Richard_Ngo (ricraz) · 2023-09-28T18:26:30.250Z · comments (1)
[link] Neel Nanda on the Mechanistic Interpretability Researcher Mindset
Michaël Trazzi (mtrazzi) · 2023-09-21T19:47:02.745Z · comments (1)