LessWrong 2.0 Reader

Contra Hofstadter on GPT-3 Nonsense
rictic · 2022-06-15T21:53:30.646Z · comments (24)
Thoughts on the impact of RLHF research
paulfchristiano · 2023-01-25T17:23:16.402Z · comments (101)
Announcing Balsa Research
Zvi · 2022-09-25T22:50:00.626Z · comments (64)
The shard theory of human values
Quintin Pope (quintin-pope) · 2022-09-04T04:28:11.752Z · comments (66)
An Observation of Vavilov Day
Elizabeth (pktechgirl) · 2022-01-03T21:10:02.107Z · comments (42)
The Feeling of Idea Scarcity
johnswentworth · 2022-12-31T17:34:04.306Z · comments (22)
Deep Deceptiveness
So8res · 2023-03-21T02:51:52.794Z · comments (58)
[link] More information about the dangerous capability evaluations we did with GPT-4 and Claude.
Beth Barnes (beth-barnes) · 2023-03-19T00:25:39.707Z · comments (54)
Editing Advice for LessWrong Users
JustisMills · 2022-04-11T16:32:17.530Z · comments (14)
You Don't Exist, Duncan
[DEACTIVATED] Duncan Sabien (Duncan_Sabien) · 2023-02-02T08:37:01.049Z · comments (107)
UFO Betting: Put Up or Shut Up
RatsWrongAboutUAP · 2023-06-13T04:05:32.652Z · comments (207)
My Clients, The Liars
ymeskhout · 2024-03-05T21:06:36.669Z · comments (85)
Policy discussions follow strong contextualizing norms
Richard_Ngo (ricraz) · 2023-04-01T23:51:36.588Z · comments (61)
Introduction to abstract entropy
Alex_Altair · 2022-10-20T21:03:02.486Z · comments (78)
Self-driving car bets
paulfchristiano · 2023-07-29T18:10:01.112Z · comments (41)
Lessons On How To Get Things Right On The First Try
johnswentworth · 2023-06-19T23:58:09.605Z · comments (56)
[link] Sum-threshold attacks
TsviBT · 2023-09-08T17:13:37.044Z · comments (52)
(briefly) RaDVaC and SMTM, two things we should be doing
Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2022-01-12T06:20:35.555Z · comments (79)
[link] AGI in sight: our look at the game board
Andrea_Miotti (AndreaM) · 2023-02-18T22:17:44.364Z · comments (135)
AGI Safety FAQ / all-dumb-questions-allowed thread
Aryeh Englander (alenglander) · 2022-06-07T05:47:13.350Z · comments (526)
[link] ARC's first technical report: Eliciting Latent Knowledge
paulfchristiano · 2021-12-14T20:09:50.209Z · comments (90)
Replacing Karma with Good Heart Tokens (Worth $1!)
Ben Pace (Benito) · 2022-04-01T09:31:34.332Z · comments (173)
Catching the Eye of Sauron
Casey B. (Zahima) · 2023-04-07T00:40:46.556Z · comments (68)
Announcing MIRI’s new CEO and leadership team
Gretta Duleba (gretta-duleba) · 2023-10-10T19:22:11.821Z · comments (52)
Brute Force Manufactured Consensus is Hiding the Crime of the Century
Roko · 2024-02-03T20:36:59.806Z · comments (156)
What do ML researchers think about AI in 2022?
KatjaGrace · 2022-08-04T15:40:05.024Z · comments (33)
How I buy things when Lightcone wants them fast
jacobjacob · 2022-09-26T05:02:09.003Z · comments (21)
Recursive Middle Manager Hell
Raemon · 2023-01-01T04:33:29.942Z · comments (45)
MIRI 2024 Mission and Strategy Update
Malo (malo) · 2024-01-05T00:20:54.169Z · comments (44)
[link] AI presidents discuss AI alignment agendas
TurnTrout · 2023-09-09T18:55:37.931Z · comments (22)
Announcing Apollo Research
Marius Hobbhahn (marius-hobbhahn) · 2023-05-30T16:17:19.767Z · comments (11)
CFAR Takeaways: Andrew Critch
Raemon · 2024-02-14T01:37:03.931Z · comments (62)
What are the results of more parental supervision and less outdoor play?
juliawise · 2023-11-25T12:52:29.986Z · comments (30)
Elements of Rationalist Discourse
Rob Bensinger (RobbBB) · 2023-02-12T07:58:42.479Z · comments (47)
Ways I Expect AI Regulation To Increase Extinction Risk
1a3orn · 2023-07-04T17:32:48.047Z · comments (32)
Lessons learned from talking to >100 academics about AI safety
Marius Hobbhahn (marius-hobbhahn) · 2022-10-10T13:16:38.036Z · comments (17)
Thoughts on responsible scaling policies and regulation
paulfchristiano · 2023-10-24T22:21:18.341Z · comments (33)
Moses and the Class Struggle
lsusr · 2022-04-01T11:55:04.911Z · comments (26)
Modern Transformers are AGI, and Human-Level
abramdemski · 2024-03-26T17:46:19.373Z · comments (89)
ProjectLawful.com: Eliezer's latest story, past 1M words
Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2022-05-11T06:18:02.738Z · comments (112)
ChatGPT can learn indirect control
Raymond D · 2024-03-21T21:11:06.649Z · comments (23)
What I would do if I wasn’t at ARC Evals
LawrenceC (LawChan) · 2023-09-05T19:19:36.830Z · comments (8)
Believing In
AnnaSalamon · 2024-02-08T07:06:13.072Z · comments (49)
Launching Lightspeed Grants (Apply by July 6th)
habryka (habryka4) · 2023-06-07T02:53:29.227Z · comments (41)
[link] Actually, Othello-GPT Has A Linear Emergent World Representation
Neel Nanda (neel-nanda-1) · 2023-03-29T22:13:14.878Z · comments (24)
[link] Cultivating a state of mind where new ideas are born
Henrik Karlsson (henrik-karlsson) · 2023-07-27T09:16:42.566Z · comments (18)
[link] Introducing AI Lab Watch
Zach Stein-Perlman · 2024-04-30T17:00:12.652Z · comments (25)
[link] Orthogonal: A new agent foundations alignment organization
Tamsin Leake (carado-1) · 2023-04-19T20:17:14.174Z · comments (4)
Natural Abstractions: Key claims, Theorems, and Critiques
LawrenceC (LawChan) · 2023-03-16T16:37:40.181Z · comments (20)
What it's like to dissect a cadaver
Alok Singh (OldManNick) · 2022-11-10T06:40:05.776Z · comments (24)