LessWrong 2.0 Reader

What I Would Do If I Were Working On AI Governance
johnswentworth · 2023-12-08T06:43:42.565Z · comments (9)
You're optimized for IGF (Source: trust me bro)
dkirmani · 2023-12-08T05:32:04.150Z · comments (2)
[link] Whither Prison Abolition?
MadHatter · 2023-12-08T05:27:26.985Z · comments (0)
Class consciousness for those against the class system
TekhneMakre · 2023-12-08T01:02:49.613Z · comments (1)
Building selfless agents to avoid instrumental self-preservation.
blallo · 2023-12-07T18:59:24.531Z · comments (0)
Does Chat-GPT display ‘Scope Insensitivity’?
callum · 2023-12-07T18:58:43.276Z · comments (0)
LLM keys - A Proposal of a Solution to Prompt Injection Attacks
Peter Hroššo (peter-hrosso) · 2023-12-07T17:36:23.311Z · comments (2)
Meetup Tip: Heartbeat Messages
Screwtape · 2023-12-07T17:18:33.582Z · comments (1)
[Valence series] 2. Valence & Normativity
Steven Byrnes (steve2152) · 2023-12-07T16:43:49.919Z · comments (1)
[link] AISN #27: Defensive Accelerationism, A Retrospective On The OpenAI Board Saga, And A New AI Bill From Senators Thune And Klobuchar
aogara (Aidan O'Gara) · 2023-12-07T15:59:11.622Z · comments (0)
AI #41: Bring in the Other Gemini
Zvi · 2023-12-07T15:10:05.552Z · comments (7)
Simplicity arguments for scheming (Section 4.3 of "Scheming AIs")
Joe Carlsmith (joekc) · 2023-12-07T15:05:54.267Z · comments (1)
Results from the Turing Seminar hackathon
Charbel-Raphaël (charbel-raphael-segerie) · 2023-12-07T14:50:38.377Z · comments (0)
Gemini 1.0
Zvi · 2023-12-07T14:40:05.243Z · comments (6)
Random Musings on Theory of Impact for Activation Vectors
Chris_Leong · 2023-12-07T13:07:08.215Z · comments (0)
[question] Is AlphaGo actually a consequentialist utility maximizer?
faul_sname · 2023-12-07T12:41:05.132Z · answers+comments (6)
[link] (Report) Evaluating Taiwan's Tactics to Safeguard its Semiconductor Assets Against a Chinese Invasion
Gauraventh (aryangauravyadav) · 2023-12-07T11:50:59.543Z · comments (4)
Would AIs trapped in the Metaverse pine to enter the real world and would the ramifications cause trouble?
ProfessorFalken · 2023-12-07T10:17:44.732Z · comments (0)
[link] The GiveWiki’s Top Picks in AI Safety for the Giving Season of 2023
Dawn Drescher (Telofy) · 2023-12-07T09:23:05.018Z · comments (4)
Language Model Memorization, Copyright Law, and Conditional Pretraining Alignment
RogerDearnaley (roger-d-1) · 2023-12-07T06:14:13.816Z · comments (0)
Reflective consistency, randomized decisions, and the dangers of unrealistic thought experiments
Radford Neal · 2023-12-07T03:33:16.149Z · comments (13)
[question] For fun: How long can you hold your breath?
Yoyo Yuan (yoyo-yuan) · 2023-12-06T23:36:11.320Z · answers+comments (2)
Mathematics As Physics
Nox ML · 2023-12-06T22:27:54.140Z · comments (4)
The counting argument for scheming (Sections 4.1 and 4.2 of "Scheming AIs")
Joe Carlsmith (joekc) · 2023-12-06T19:28:19.393Z · comments (0)
On Trust
johnswentworth · 2023-12-06T19:19:07.680Z · comments (22)
Originality vs. Correctness
alkjash · 2023-12-06T18:51:49.531Z · comments (10)
Proposal for improving the global online discourse through personalised comment ordering on all websites
Roman Leventov · 2023-12-06T18:51:37.645Z · comments (8)
[link] Google Gemini Announced
g-w1 · 2023-12-06T16:14:07.192Z · comments (21)
Based Beff Jezos and the Accelerationists
Zvi · 2023-12-06T16:00:08.380Z · comments (29)
Bucket Brigade: Likely End-of-Life
jefftk (jkaufman) · 2023-12-06T15:30:06.871Z · comments (0)
[link] Why Yudkowsky is wrong about "covalently bonded equivalents of biology"
titotal (lombertini) · 2023-12-06T14:09:15.402Z · comments (34)
[link] Metaculus Launches Chinese AI Chips Tournament, Supporting Institute for AI Policy and Strategy Research
ChristianWilliams · 2023-12-06T11:26:15.790Z · comments (1)
Minimal Viable Paradise: How do we get The Good Future(TM)?
Nathan Young · 2023-12-06T09:24:09.699Z · comments (0)
Anthropical Paradoxes are Paradoxes of Probability Theory
Ape in the coat · 2023-12-06T08:16:26.846Z · comments (16)
Digital humans vs merge with AI? Same or different?
Nathan Helm-Burger (nathan-helm-burger) · 2023-12-06T04:56:38.261Z · comments (10)
EA Infrastructure Fund's Plan to Focus on Principles-First EA
Linch · 2023-12-06T03:24:55.844Z · comments (0)
[link] In defence of Helen Toner, Adam D'Angelo, and Tasha McCauley
mrtreasure · 2023-12-06T02:02:32.004Z · comments (2)
Some quick thoughts on "AI is easy to control"
Mikhail Samin (mikhail-samin) · 2023-12-06T00:58:53.681Z · comments (9)
ACX Corvallis, OR
kenakofer · 2023-12-06T00:23:25.706Z · comments (0)
Multinational corporations as optimizers: a case for reaching across the aisle
sudo-nym · 2023-12-06T00:14:35.831Z · comments (0)
[question] How do you feel about LessWrong these days? [Open feedback thread]
jacobjacob · 2023-12-05T20:54:42.317Z · answers+comments (140)
Critique-a-Thon of AI Alignment Plans
Iknownothing · 2023-12-05T20:50:07.661Z · comments (3)
Arguments for/against scheming that focus on the path SGD takes (Section 3 of "Scheming AIs")
Joe Carlsmith (joekc) · 2023-12-05T18:48:12.917Z · comments (0)
[link] In defence of Helen Toner, Adam D'Angelo, and Tasha McCauley (OpenAI post)
mrtreasure · 2023-12-05T18:40:19.740Z · comments (2)
Studying The Alien Mind
Quentin FEUILLADE--MONTIXI (quentin-feuillade-montixi) · 2023-12-05T17:27:28.049Z · comments (8)
Deep Forgetting & Unlearning for Safely-Scoped LLMs
scasper · 2023-12-05T16:48:18.177Z · comments (22)
On ‘Responsible Scaling Policies’ (RSPs)
Zvi · 2023-12-05T16:10:06.310Z · comments (3)
[link] We're all in this together
Tamsin Leake (carado-1) · 2023-12-05T13:57:46.270Z · comments (58)
A Socratic dialogue with my student
lsusr · 2023-12-05T09:31:05.266Z · comments (7)
Neural uncertainty estimation review article (for alignment)
Charlie Steiner · 2023-12-05T08:01:32.723Z · comments (1)