LessWrong 2.0 Reader



Predictions for shard theory mechanistic interpretability results
TurnTrout · 2023-03-01T05:16:48.043Z · comments (10)
[link] the QACI alignment plan: table of contents
Tamsin Leake (carado-1) · 2023-03-21T20:22:00.865Z · comments (1)
On the FLI Open Letter
Zvi · 2023-03-30T16:00:00.716Z · comments (11)
Parasitic Language Games: maintaining ambiguity to hide conflict while burning the commons
Hazard · 2023-03-12T05:25:26.496Z · comments (16)
AI #4: Introducing GPT-4
Zvi · 2023-03-21T14:00:01.161Z · comments (32)
Introducing Leap Labs, an AI interpretability startup
Jessica Rumbelow (jessica-cooper) · 2023-03-06T16:16:22.182Z · comments (11)
"Publish or Perish" (a quick note on why you should try to make your work legible to existing academic communities)
David Scott Krueger (formerly: capybaralet) (capybaralet) · 2023-03-18T19:01:54.199Z · comments (48)
LLM Modularity: The Separability of Capabilities in Large Language Models
NickyP (Nicky) · 2023-03-26T21:57:03.445Z · comments (3)
Truth and Advantage: Response to a draft of "AI safety seems hard to measure"
So8res · 2023-03-22T03:36:02.945Z · comments (9)
Selective, Corrective, Structural: Three Ways of Making Social Systems Work
Said Achmiz (SaidAchmiz) · 2023-03-05T08:45:45.615Z · comments (13)
[link] New blog: Planned Obsolescence
Ajeya Cotra (ajeya-cotra) · 2023-03-27T19:46:25.429Z · comments (7)
[link] Nobody’s on the ball on AGI alignment
leopold · 2023-03-29T17:40:36.250Z · comments (37)
AI #5: Level One Bard
Zvi · 2023-03-30T23:00:00.690Z · comments (9)
RLHF does not appear to differentially cause mode-collapse
Arthur Conmy (arthur-conmy) · 2023-03-20T15:39:45.353Z · comments (9)
Abstracts should be either Actually Short™, or broken into paragraphs
Raemon · 2023-03-24T00:51:56.449Z · comments (27)
Learn the mathematical structure, not the conceptual structure
Adam Shai (adam-shai) · 2023-03-01T22:24:19.451Z · comments (35)
Practical Pitfalls of Causal Scrubbing
Jérémy Scheurer (JerrySch) · 2023-03-27T07:47:31.309Z · comments (17)
reflections on lockdown, two years out
mingyuan · 2023-03-01T06:58:38.176Z · comments (9)
[link] Google's PaLM-E: An Embodied Multimodal Language Model
SandXbox (PandaFusion) · 2023-03-07T04:11:18.183Z · comments (7)
Contract Fraud
jefftk (jkaufman) · 2023-03-01T03:10:01.047Z · comments (10)
[link] The epistemic virtue of scope matching
jasoncrawford · 2023-03-15T13:31:39.602Z · comments (15)
Shell games
TsviBT · 2023-03-19T10:43:44.184Z · comments (8)
The Kids are Not Okay
Zvi · 2023-03-08T13:30:01.032Z · comments (43)
The 0.2 OOMs/year target
Cleo Nardo (strawberry calm) · 2023-03-30T18:15:40.735Z · comments (24)
Yudkowsky on AGI risk on the Bankless podcast
Rob Bensinger (RobbBB) · 2023-03-13T00:42:22.694Z · comments (5)
$500 Bounty/Contest: Explain Infra-Bayes In The Language Of Game Theory
johnswentworth · 2023-03-25T17:29:51.498Z · comments (7)
[link] continue working on hard alignment! don't give up!
Tamsin Leake (carado-1) · 2023-03-24T00:14:35.607Z · comments (45)
[question] Are there specific books that it might slightly help alignment to have on the internet?
AnnaSalamon · 2023-03-29T05:08:28.364Z · answers+comments (25)
How to Support Someone Who is Struggling
David Zeller · 2023-03-11T18:52:25.060Z · comments (13)
Sunlight is yellow parallel rays plus blue isotropic light
Thomas Kehrenberg (thomas-kehrenberg) · 2023-03-01T17:58:02.706Z · comments (4)
Success without dignity: a nearcasting story of avoiding catastrophe by luck
HoldenKarnofsky · 2023-03-14T19:23:15.558Z · comments (9)
You Can’t Predict a Game of Pinball
Jeffrey Heninger (jeffrey-heninger) · 2023-03-30T00:40:05.280Z · comments (12)
Microsoft Research Paper Claims Sparks of Artificial Intelligence in GPT-4
Zvi · 2023-03-24T13:20:01.241Z · comments (14)
A bunch of videos for intuition building (2x speed, skip ones that bore you)
the gears to ascension (lahwran) · 2023-03-12T00:51:39.406Z · comments (5)
Response to Tyler Cowen’s Existential risk, AI, and the inevitable turn in human history
Zvi · 2023-03-28T16:00:02.088Z · comments (27)
Imitation Learning from Language Feedback
Jérémy Scheurer (JerrySch) · 2023-03-30T14:11:56.295Z · comments (3)
[link] AI Safety in a World of Vulnerable Machine Learning Systems
AdamGleave · 2023-03-08T02:40:43.139Z · comments (27)
Dealing with infinite entropy
Alex_Altair · 2023-03-01T15:01:40.400Z · comments (9)
Probabilistic Payor Lemma?
abramdemski · 2023-03-19T17:57:04.237Z · comments (7)
[link] Sparks of Artificial General Intelligence: Early experiments with GPT-4 | Microsoft Research
DragonGod · 2023-03-23T05:45:12.004Z · comments (23)
AI #2
Zvi · 2023-03-02T14:50:01.078Z · comments (18)
Tabooing "Frame Control"
Raemon · 2023-03-19T23:33:10.154Z · comments (41)
[question] What happened to the OpenPhil OpenAI board seat?
ChristianKl · 2023-03-15T16:59:06.390Z · answers+comments (2)
Plan for mediocre alignment of brain-like [model-based RL] AGI
Steven Byrnes (steve2152) · 2023-03-13T14:11:32.747Z · comments (24)
[link] Japan AI Alignment Conference
Chris Scammell (chris-scammell) · 2023-03-10T06:56:56.983Z · comments (7)
Transcript: NBC Nightly News: AI ‘race to recklessness’ w/ Tristan Harris, Aza Raskin
WilliamKiely · 2023-03-23T01:04:15.338Z · comments (4)
[link] Sam Altman on GPT-4, ChatGPT, and the Future of AI | Lex Fridman Podcast #367
Gabe M (gabe-mukobi) · 2023-03-25T19:08:55.249Z · comments (4)
[link] The Prospect of an AI Winter
Erich_Grunewald · 2023-03-27T20:55:35.619Z · comments (24)
Sydney can play chess and kind of keep track of the board state
Erik Jenner (ejenner) · 2023-03-03T09:39:52.439Z · comments (19)
Why do we assume there is a "real" shoggoth behind the LLM? Why not masks all the way down?
Robert_AIZI · 2023-03-09T17:28:43.259Z · comments (48)