LessWrong 2.0 Reader

← previous page (newer posts) · next page (older posts) →

Embedded Interactive Predictions on LessWrong
Amandango · 2020-11-20T18:35:32.089Z · comments (88)
Common misconceptions about OpenAI
Jacob_Hilton · 2022-08-25T14:02:26.257Z · comments (154)
Jailbreaking ChatGPT on Release Day
Zvi · 2022-12-02T13:10:00.860Z · comments (77)
Book Review: Going Infinite
Zvi · 2023-10-24T15:00:02.251Z · comments (113)
My Clients, The Liars
ymeskhout · 2024-03-05T21:06:36.669Z · comments (85)
AI companies aren't really using external evaluators
Zach Stein-Perlman · 2024-05-24T16:01:21.184Z · comments (15)
Dark Matters
Diffractor · 2021-03-14T23:36:58.884Z · comments (23)
Concentration of Force
Duncan Sabien (Deactivated) (Duncan_Sabien) · 2021-11-06T08:20:18.991Z · comments (23)
A Quick Guide to Confronting Doom
Ruby · 2022-04-13T19:30:48.580Z · comments (33)
Refusal in LLMs is mediated by a single direction
Andy Arditi (andy-arditi) · 2024-04-27T11:13:06.235Z · comments (93)
The Plan - 2022 Update
johnswentworth · 2022-12-01T20:43:50.516Z · comments (37)
Slow motion videos as AI risk intuition pumps
Andrew_Critch · 2022-06-14T19:31:13.616Z · comments (41)
[link] Sum-threshold attacks
TsviBT · 2023-09-08T17:13:37.044Z · comments (55)
Natural Abstractions: Key claims, Theorems, and Critiques
LawrenceC (LawChan) · 2023-03-16T16:37:40.181Z · comments (23)
An Observation of Vavilov Day
Elizabeth (pktechgirl) · 2022-01-03T21:10:02.107Z · comments (42)
Contra Hofstadter on GPT-3 Nonsense
rictic · 2022-06-15T21:53:30.646Z · comments (24)
Announcing Balsa Research
Zvi · 2022-09-25T22:50:00.626Z · comments (64)
AI Control: Improving Safety Despite Intentional Subversion
Buck · 2023-12-13T15:51:35.982Z · comments (21)
Introduction to abstract entropy
Alex_Altair · 2022-10-20T21:03:02.486Z · comments (78)
[link] Precognition
jasoncrawford · 2021-06-14T00:38:29.791Z · comments (35)
[link] Explore More: A Bag of Tricks to Keep Your Life on the Rails
Shoshannah Tekofsky (DarkSym) · 2024-09-28T21:38:52.256Z · comments (15)
Self-driving car bets
paulfchristiano · 2023-07-29T18:10:01.112Z · comments (44)
[link] More information about the dangerous capability evaluations we did with GPT-4 and Claude.
Beth Barnes (beth-barnes) · 2023-03-19T00:25:39.707Z · comments (54)
A whirlwind tour of Ethereum finance
cata · 2021-03-02T09:36:23.477Z · comments (52)
Editing Advice for LessWrong Users
JustisMills · 2022-04-11T16:32:17.530Z · comments (14)
ProjectLawful.com: Eliezer's latest story, past 1M words
Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2022-05-11T06:18:02.738Z · comments (112)
Ways I Expect AI Regulation To Increase Extinction Risk
1a3orn · 2023-07-04T17:32:48.047Z · comments (32)
(briefly) RaDVaC and SMTM, two things we should be doing
Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2022-01-12T06:20:35.555Z · comments (79)
Policy discussions follow strong contextualizing norms
Richard_Ngo (ricraz) · 2023-04-01T23:51:36.588Z · comments (61)
Believing In
AnnaSalamon · 2024-02-08T07:06:13.072Z · comments (51)
[link] Zoe Curzi's Experience with Leverage Research
Ilverin the Stupid and Offensive (Ilverin) · 2021-10-13T04:44:49.020Z · comments (261)
SAE feature geometry is outside the superposition hypothesis
jake_mendel · 2024-06-24T16:07:14.604Z · comments (17)
AGI Safety FAQ / all-dumb-questions-allowed thread
Aryeh Englander (alenglander) · 2022-06-07T05:47:13.350Z · comments (526)
You are not too "irrational" to know your preferences.
DaystarEld · 2024-11-26T15:01:42.996Z · comments (50)
[link] AGI in sight: our look at the game board
Andrea_Miotti (AndreaM) · 2023-02-18T22:17:44.364Z · comments (135)
What are the results of more parental supervision and less outdoor play?
juliawise · 2023-11-25T12:52:29.986Z · comments (31)
Fun with +12 OOMs of Compute
Daniel Kokotajlo (daniel-kokotajlo) · 2021-03-01T13:30:13.603Z · comments (86)
Credibility of the CDC on SARS-CoV-2
Elizabeth (pktechgirl) · 2020-03-07T02:00:00.452Z · comments (119)
interpreting GPT: the logit lens
nostalgebraist · 2020-08-31T02:47:08.426Z · comments (37)
[link] ARC's first technical report: Eliciting Latent Knowledge
paulfchristiano · 2021-12-14T20:09:50.209Z · comments (90)
[link] "How could I have thought that faster?"
mesaoptimizer · 2024-03-11T10:56:17.884Z · comments (32)
[link] Cultivating a state of mind where new ideas are born
Henrik Karlsson (henrik-karlsson) · 2023-07-27T09:16:42.566Z · comments (20)
[link] Introducing AI Lab Watch
Zach Stein-Perlman · 2024-04-30T17:00:12.652Z · comments (30)
Replacing Karma with Good Heart Tokens (Worth $1!)
Ben Pace (Benito) · 2022-04-01T09:31:34.332Z · comments (173)
Whole Brain Emulation: No Progress on C. elegans After 10 Years
niconiconi · 2021-10-01T21:44:37.397Z · comments (87)
Moses and the Class Struggle
lsusr · 2022-04-01T11:55:04.911Z · comments (26)
How I buy things when Lightcone wants them fast
jacobjacob · 2022-09-26T05:02:09.003Z · comments (21)
What Do GDP Growth Curves Really Mean?
johnswentworth · 2021-10-07T21:58:15.121Z · comments (64)
Announcing MIRI’s new CEO and leadership team
Gretta Duleba (gretta-duleba) · 2023-10-10T19:22:11.821Z · comments (52)
AGI Safety and Alignment at Google DeepMind: A Summary of Recent Work
Rohin Shah (rohinmshah) · 2024-08-20T16:22:45.888Z · comments (33)