LessWrong 2.0 Reader

SolidGoldMagikarp (plus, prompt generation)
Jessica Rumbelow (jessica-cooper) · 2023-02-05T22:02:35.854Z · comments (204)
Focus on the places where you feel shocked everyone's dropping the ball
So8res · 2023-02-02T00:27:55.687Z · comments (61)
Bing Chat is blatantly, aggressively misaligned
evhub · 2023-02-15T05:29:45.262Z · comments (170)
Noting an error in Inadequate Equilibria
Matthew Barnett (matthew-barnett) · 2023-02-08T01:33:33.715Z · comments (56)
Please don't throw your mind away
TsviBT · 2023-02-15T21:41:05.988Z · comments (44)
Cyborgism
NicholasKees (nick_kees) · 2023-02-10T14:47:48.172Z · comments (46)
[link] Childhoods of exceptional people
Henrik Karlsson (henrik-karlsson) · 2023-02-06T17:27:09.596Z · comments (62)
Fucking Goddamn Basics of Rationalist Discourse
LoganStrohl (BrienneYudkowsky) · 2023-02-04T01:47:32.578Z · comments (97)
[link] I hired 5 people to sit behind me and make me productive for a month
Simon Berens (sberens) · 2023-02-05T01:19:39.182Z · comments (81)
You Don't Exist, Duncan
[DEACTIVATED] Duncan Sabien (Duncan_Sabien) · 2023-02-02T08:37:01.049Z · comments (107)
[link] AGI in sight: our look at the game board
Andrea_Miotti (AndreaM) · 2023-02-18T22:17:44.364Z · comments (135)
Elements of Rationalist Discourse
Rob Bensinger (RobbBB) · 2023-02-12T07:58:42.479Z · comments (47)
Cognitive Emulation: A Naive AI Safety Proposal
Connor Leahy (NPCollapse) · 2023-02-25T19:35:02.409Z · comments (45)
AI alignment researchers don't (seem to) stack
So8res · 2023-02-21T00:48:25.186Z · comments (40)
EigenKarma: trust at scale
Henrik Karlsson (henrik-karlsson) · 2023-02-08T18:52:24.490Z · comments (50)
Why Are Bacteria So Simple?
aysja · 2023-02-06T03:00:31.837Z · comments (33)
AI #1: Sydney and Bing
Zvi · 2023-02-21T14:00:00.480Z · comments (44)
My understanding of Anthropic strategy
Swimmer963 (Miranda Dixon-Luinenburg) (Swimmer963) · 2023-02-15T01:56:40.961Z · comments (31)
[link] Parametrically retargetable decision-makers tend to seek power
TurnTrout · 2023-02-18T18:41:38.740Z · comments (9)
[link] [Link] A community alert about Ziz
DanielFilan · 2023-02-24T00:06:00.027Z · comments (126)
Big Mac Subsidy?
jefftk (jkaufman) · 2023-02-23T04:00:03.996Z · comments (24)
Stop posting prompt injections on Twitter and calling it "misalignment"
lc · 2023-02-19T02:21:44.061Z · comments (9)
[link] We Found An Neuron in GPT-2
Joseph Miller (Josephm) · 2023-02-11T18:27:29.410Z · comments (22)
Full Transcript: Eliezer Yudkowsky on the Bankless podcast
remember · 2023-02-23T12:34:19.523Z · comments (89)
[link] Anomalous tokens reveal the original identities of Instruct models
janus · 2023-02-09T01:30:56.609Z · comments (16)
Modal Fixpoint Cooperation without Löb's Theorem
Andrew_Critch · 2023-02-05T00:58:40.975Z · comments (32)
Pretraining Language Models with Human Preferences
Tomek Korbak (tomek-korbak) · 2023-02-21T17:57:09.774Z · comments (18)
"Rationalist Discourse" Is Like "Physicist Motors"
Zack_M_Davis · 2023-02-26T05:58:29.249Z · comments (152)
Evaluations (of new AI Safety researchers) can be noisy
LawrenceC (LawChan) · 2023-02-05T04:15:02.117Z · comments (10)
Hashing out long-standing disagreements seems low-value to me
So8res · 2023-02-16T06:20:00.899Z · comments (34)
Recommendation: Bug Bounties and Responsible Disclosure for Advanced ML Systems
Vaniver · 2023-02-17T20:11:39.255Z · comments (11)
In Defense of Chatbot Romance
Kaj_Sotala · 2023-02-11T14:30:05.696Z · comments (52)
There are (probably) no superhuman Go AIs: strong human players beat the strongest AIs
Taran · 2023-02-19T12:25:52.212Z · comments (33)
A proposed method for forecasting transformative AI
Matthew Barnett (matthew-barnett) · 2023-02-10T19:34:01.358Z · comments (21)
There are no coherence theorems
Dan H (dan-hendrycks) · 2023-02-20T21:25:48.478Z · comments (114)
One-layer transformers aren’t equivalent to a set of skip-trigrams
Buck · 2023-02-17T17:26:13.819Z · comments (10)
GPT-175bee
Adam Scherlis (adam-scherlis) · 2023-02-08T18:58:01.364Z · comments (13)
On Investigating Conspiracy Theories
Zvi · 2023-02-20T12:50:00.891Z · comments (38)
The public supports regulating AI for safety
Zach Stein-Perlman · 2023-02-17T04:10:03.307Z · comments (9)
The Open Agency Model
Eric Drexler · 2023-02-22T10:35:12.316Z · comments (18)
Bing chat is the AI fire alarm
Ratios · 2023-02-17T06:51:51.551Z · comments (62)
GPT-4 Predictions
Stephen McAleese (stephen-mcaleese) · 2023-02-17T23:20:24.696Z · comments (27)
SolidGoldMagikarp II: technical details and more recent findings
mwatkins · 2023-02-06T19:09:01.406Z · comments (45)
A Way To Be Okay
[DEACTIVATED] Duncan Sabien (Duncan_Sabien) · 2023-02-19T20:27:10.061Z · comments (36)
Conflict Theory of Bounded Distrust
Zack_M_Davis · 2023-02-12T05:30:30.760Z · comments (29)
I don't think MIRI "gave up"
Raemon · 2023-02-03T00:26:07.552Z · comments (64)
[link] Sam Altman: "Planning for AGI and beyond"
LawrenceC (LawChan) · 2023-02-24T20:28:00.430Z · comments (54)
Cyborg Periods: There will be multiple AI transitions
Jan_Kulveit · 2023-02-22T16:09:04.858Z · comments (9)
Don't accelerate problems you're trying to solve
Andrea_Miotti (AndreaM) · 2023-02-15T18:11:30.595Z · comments (26)
H5N1
Zvi · 2023-02-13T12:50:00.694Z · comments (1)