LessWrong 2.0 Reader


Idealized Agents Are Approximate Causal Mirrors (+ Radical Optimism on Agent Foundations)
Thane Ruthenis · 2023-12-22T20:19:13.865Z · comments (13)
[link] OpenAI: Preparedness framework
Zach Stein-Perlman · 2023-12-18T18:30:10.153Z · comments (23)
[link] How LDT helps reduce the AI arms race
Tamsin Leake (carado-1) · 2023-12-10T16:21:44.409Z · comments (13)
Update on Chinese IQ-related gene panels
Lao Mein (derpherpize) · 2023-12-14T10:12:21.212Z · comments (7)
Some for-profit AI alignment org ideas
Eric Ho (eh42) · 2023-12-14T14:23:20.654Z · comments (19)
[Valence series] 2. Valence & Normativity
Steven Byrnes (steve2152) · 2023-12-07T16:43:49.919Z · comments (4)
Flagging Potentially Unfair Parenting
jefftk (jkaufman) · 2023-12-26T12:40:05.099Z · comments (1)
Meetup Tip: Heartbeat Messages
Screwtape · 2023-12-07T17:18:33.582Z · comments (4)
Finding Sparse Linear Connections between Features in LLMs
Logan Riggs (elriggs) · 2023-12-09T02:27:42.456Z · comments (5)
[link] We're all in this together
Tamsin Leake (carado-1) · 2023-12-05T13:57:46.270Z · comments (65)
Don't Share Information Exfohazardous on Others' AI-Risk Models
Thane Ruthenis · 2023-12-19T20:09:06.244Z · comments (11)
AI #42: The Wrong Answer
Zvi · 2023-12-14T14:50:05.086Z · comments (6)
Out-of-distribution Bioattacks
jefftk (jkaufman) · 2023-12-02T12:20:05.626Z · comments (15)
[link] Funding case: AI Safety Camp
Remmelt (remmelt-ellen) · 2023-12-12T09:08:18.911Z · comments (5)
METR is hiring!
Beth Barnes (beth-barnes) · 2023-12-26T21:00:50.625Z · comments (1)
Complex systems research as a field (and its relevance to AI Alignment)
Nora_Ammann · 2023-12-01T22:10:25.801Z · comments (9)
[Valence series] 3. Valence & Beliefs
Steven Byrnes (steve2152) · 2023-12-11T20:21:30.570Z · comments (6)
Balsa Update and General Thank You
Zvi · 2023-12-12T20:30:03.980Z · comments (8)
E.T. Jaynes Probability Theory: The logic of Science I
Jan Christian Refsgaard (jan-christian-refsgaard) · 2023-12-27T23:47:52.579Z · comments (20)
Originality vs. Correctness
alkjash · 2023-12-06T18:51:49.531Z · comments (16)
[link] shoes with springs
bhauth · 2023-12-30T21:46:55.319Z · comments (6)
[link] Are There Examples of Overhang for Other Technologies?
Jeffrey Heninger (jeffrey-heninger) · 2023-12-13T21:48:08.954Z · comments (50)
AI Safety Chatbot
markov (markovial) · 2023-12-21T14:06:48.981Z · comments (11)
The LessWrong 2022 Review: Review Phase
RobertM (T3t) · 2023-12-22T03:23:49.635Z · comments (7)
[link] Talk: "AI Would Be A Lot Less Alarming If We Understood Agents"
johnswentworth · 2023-12-17T23:46:32.814Z · comments (3)
Measurement tampering detection as a special case of weak-to-strong generalization
ryan_greenblatt · 2023-12-23T00:05:55.357Z · comments (10)
The Best of Don’t Worry About the Vase
Zvi · 2023-12-13T12:50:02.510Z · comments (4)
Some negative steganography results
Fabien Roger (Fabien) · 2023-12-09T20:22:52.323Z · comments (5)
[link] Google Gemini Announced
Jacob G-W (g-w1) · 2023-12-06T16:14:07.192Z · comments (22)
[link] the micro-fulfillment cambrian explosion
bhauth · 2023-12-04T01:15:34.342Z · comments (5)
AI #44: Copyright Confrontation
Zvi · 2023-12-28T14:30:10.237Z · comments (13)
[link] In Defense of Epistemic Empathy
Kevin Dorst · 2023-12-27T16:27:06.320Z · comments (19)
2022 (and All Time) Posts by Pingback Count
Raemon · 2023-12-16T21:17:00.572Z · comments (14)
Weak-to-Strong Generalization: Eliciting Strong Capabilities With Weak Supervision
leogao · 2023-12-16T05:39:10.558Z · comments (5)
AI #43: Functional Discoveries
Zvi · 2023-12-21T15:50:04.442Z · comments (26)
Pseudonymity and Accusations
jefftk (jkaufman) · 2023-12-21T19:20:19.944Z · comments (20)
[link] Meditations on Mot
Richard_Ngo (ricraz) · 2023-12-04T00:19:19.522Z · comments (11)
On OpenAI’s Preparedness Framework
Zvi · 2023-12-21T14:00:05.144Z · comments (4)
Will 2024 be very hot? Should we be worried?
A.H. (AlfredHarwood) · 2023-12-29T11:22:50.200Z · comments (12)
The Shortest Path Between Scylla and Charybdis
Thane Ruthenis · 2023-12-18T20:08:34.995Z · comments (8)
Gemini 1.0
Zvi · 2023-12-07T14:40:05.243Z · comments (7)
Anthropical Paradoxes are Paradoxes of Probability Theory
Ape in the coat · 2023-12-06T08:16:26.846Z · comments (18)
Goal-Completeness is like Turing-Completeness for AGI
Liron · 2023-12-19T18:12:29.947Z · comments (26)
n of m ring signatures
DanielFilan · 2023-12-04T20:00:06.580Z · comments (7)
Bounty: Diverse hard tasks for LLM agents
Beth Barnes (beth-barnes) · 2023-12-17T01:04:05.460Z · comments (31)
On ‘Responsible Scaling Policies’ (RSPs)
Zvi · 2023-12-05T16:10:06.310Z · comments (3)
AI #41: Bring in the Other Gemini
Zvi · 2023-12-07T15:10:05.552Z · comments (16)
[link] Contra Scott on Abolishing the FDA
Maxwell Tabarrok (maxwell-tabarrok) · 2023-12-15T14:00:17.247Z · comments (3)
Some open-source dictionaries and dictionary learning infrastructure
Sam Marks (samuel-marks) · 2023-12-05T06:05:21.903Z · comments (7)
Environmental allergies are curable? (Sublingual immunotherapy)
Chipmonk · 2023-12-26T19:05:08.880Z · comments (10)