LessWrong 2.0 Reader

A few thoughts on my self-study for alignment research
Thomas Kehrenberg (thomas-kehrenberg) · 2022-12-30T22:05:58.859Z · comments (0)
Christmas Microscopy
jefftk (jkaufman) · 2022-12-30T21:10:01.937Z · comments (0)
What "upside" of AI?
False Name (False Name, Esq.) · 2022-12-30T20:58:49.165Z · comments (5)
Evidence on recursive self-improvement from current ML
beren · 2022-12-30T20:53:22.462Z · comments (12)
[question] Is ChatGPT TAI?
Amal (asta-vista) · 2022-12-30T19:44:50.508Z · answers+comments (5)
My thoughts on OpenAI's alignment plan
Akash (akash-wasil) · 2022-12-30T19:33:15.019Z · comments (3)
Beyond Rewards and Values: A Non-dualistic Approach to Universal Intelligence
Akira Pyinya · 2022-12-30T19:05:24.664Z · comments (4)
10 Years of LessWrong
JohnBuridan · 2022-12-30T17:15:17.498Z · comments (2)
Chatbots as a Publication Format
derek shiller (derek-shiller) · 2022-12-30T14:11:21.015Z · comments (6)
Human sexuality as an interesting case study of alignment
beren · 2022-12-30T13:37:20.176Z · comments (26)
The Twitter Files: Covid Edition
Zvi · 2022-12-30T13:30:01.073Z · comments (2)
Worldly Positions archive, briefly with private drafts
KatjaGrace · 2022-12-30T12:20:05.430Z · comments (0)
Models Don't "Get Reward"
Sam Ringer · 2022-12-30T10:37:11.798Z · comments (61)
The hyperfinite timeline
Alok Singh (OldManNick) · 2022-12-30T09:30:06.483Z · comments (6)
Reactive devaluation: Bias in Evaluating AGI X-Risks
Remmelt (remmelt-ellen) · 2022-12-30T09:02:58.450Z · comments (9)
Things I carry almost every day, as of late December 2022
DanielFilan · 2022-12-30T07:40:01.261Z · comments (9)
More ways to spot abysses
KatjaGrace · 2022-12-30T06:30:06.301Z · comments (1)
Language models are nearly AGIs but we don't notice it because we keep shifting the bar
philosophybear · 2022-12-30T05:15:15.625Z · comments (13)
[link] Progress links and tweets, 2022-12-29
jasoncrawford · 2022-12-30T04:54:51.905Z · comments (0)
Announcing The Filan Cabinet
DanielFilan · 2022-12-30T03:10:00.494Z · comments (2)
[question] Effective Evil Causes?
Ulisse Mini (ulisse-mini) · 2022-12-30T02:56:31.459Z · answers+comments (2)
But is it really in Rome? An investigation of the ROME model editing technique
jacquesthibs (jacques-thibodeau) · 2022-12-30T02:40:36.713Z · comments (1)
A Year of AI Increasing AI Progress
ThomasW (ThomasWoodside) · 2022-12-30T02:09:39.458Z · comments (3)
Why not spend more time looking at human alignment?
ajc586 (Adrian Cable) · 2022-12-30T00:22:13.666Z · comments (3)
Why and how to write things on the Internet
benkuhn · 2022-12-29T22:40:04.636Z · comments (2)
[link] Friendly and Unfriendly AGI are Indistinguishable
ErgoEcho · 2022-12-29T22:13:00.434Z · comments (4)
200 COP in MI: Looking for Circuits in the Wild
Neel Nanda (neel-nanda-1) · 2022-12-29T20:59:53.267Z · comments (5)
Thoughts on the implications of GPT-3, two years ago and NOW [here be dragons, we're swimming, flying and talking with them]
Bill Benzon (bill-benzon) · 2022-12-29T20:05:31.062Z · comments (0)
Covid 12/29/22: Next Up is XBB.1.5
Zvi · 2022-12-29T18:20:00.943Z · comments (4)
Entrepreneurship ETG Might Be Better Than 80k Thought
Xodarap · 2022-12-29T17:51:13.412Z · comments (0)
Internal Interfaces Are a High-Priority Interpretability Target
Thane Ruthenis · 2022-12-29T17:49:27.450Z · comments (6)
CFP for Rebellion and Disobedience in AI workshop
Ram Rachum (ram@rachum.com) · 2022-12-29T16:08:05.035Z · comments (0)
My scorched-earth policy on New Year’s resolutions
PatrickDFarley · 2022-12-29T14:45:47.126Z · comments (2)
Don't feed the void. She is fat enough!
Johannes C. Mayer (johannes-c-mayer) · 2022-12-29T14:18:44.526Z · comments (0)
[question] Is there any unified resource on Eliezer's fatigue?
Johannes C. Mayer (johannes-c-mayer) · 2022-12-29T14:04:53.488Z · answers+comments (2)
Logical Probability of Goldbach’s Conjecture: Provable Rule or Coincidence?
avturchin · 2022-12-29T13:37:45.130Z · comments (15)
Where do you get your capabilities from?
tailcalled · 2022-12-29T11:39:05.449Z · comments (27)
[link] The commercial incentive to intentionally train AI to deceive us
Derek M. Jones (Derek-Jones) · 2022-12-29T11:30:28.267Z · comments (1)
Infinite necklace: the line as a circle
Alok Singh (OldManNick) · 2022-12-29T10:41:58.268Z · comments (2)
Privacy Tradeoffs
jefftk (jkaufman) · 2022-12-29T03:40:01.463Z · comments (1)
Against John Searle, Gary Marcus, the Chinese Room thought experiment and its world
philosophybear · 2022-12-29T03:26:12.485Z · comments (43)
Large Language Models Suggest a Path to Ems
anithite (obserience) · 2022-12-29T02:20:01.753Z · comments (2)
[question] Book recommendations for the history of ML?
Eleni Angelou (ea-1) · 2022-12-28T23:50:55.512Z · answers+comments (2)
Rock-Paper-Scissors Can Be Weird
winwonce · 2022-12-28T23:12:11.329Z · comments (3)
200 COP in MI: The Case for Analysing Toy Language Models
Neel Nanda (neel-nanda-1) · 2022-12-28T21:07:03.838Z · comments (3)
200 Concrete Open Problems in Mechanistic Interpretability: Introduction
Neel Nanda (neel-nanda-1) · 2022-12-28T21:06:53.853Z · comments (0)
Effective ways to find love?
anonymoususer · 2022-12-28T20:46:23.247Z · comments (7)
Classical logic based on propositions-as-subsingleton-types
Thomas Kehrenberg (thomas-kehrenberg) · 2022-12-28T20:16:37.723Z · comments (0)
In Defense of Wrapper-Minds
Thane Ruthenis · 2022-12-28T18:28:25.868Z · comments (38)
[question] What is the best way to approach Expected Value calculations when payoffs are highly skewed?
jmh · 2022-12-28T14:42:51.169Z · answers+comments (16)