LessWrong 2.0 Reader

Standard SAEs Might Be Incoherent: A Choosing Problem & A “Concise” Solution
Kola Ayonrinde (kola-ayonrinde) · 2024-10-30T22:50:45.642Z · comments (0)
[link] Generic advice caveats
Saul Munn (saul-munn) · 2024-10-30T21:03:07.185Z · comments (1)
I turned decision theory problems into memes about trolleys
Tapatakt · 2024-10-30T20:13:29.589Z · comments (20)
AI as a powerful meme, via CGP Grey
TheManxLoiner · 2024-10-30T18:31:58.544Z · comments (6)
[question] How might language influence how an AI "thinks"?
bodry (plosique) · 2024-10-30T17:41:04.460Z · answers+comments (0)
Motivation control
Joe Carlsmith (joekc) · 2024-10-30T17:15:50.881Z · comments (7)
Updating the NAO Simulator
jefftk (jkaufman) · 2024-10-30T13:50:06.908Z · comments (0)
Occupational Licensing Roundup #1
Zvi · 2024-10-30T11:00:04.516Z · comments (11)
Three Notions of "Power"
johnswentworth · 2024-10-30T06:10:08.326Z · comments (43)
Introduction to Choice set Misspecification in Reward Inference
Rahul Chand (rahul-chand) · 2024-10-29T22:57:34.310Z · comments (0)
Gothenburg LW/ACX meetup
Stefan (stefan-1) · 2024-10-29T20:40:22.754Z · comments (0)
[link] The Alignment Trap: AI Safety as Path to Power
crispweed · 2024-10-29T15:21:26.545Z · comments (17)
Housing Roundup #10
Zvi · 2024-10-29T13:50:09.416Z · comments (2)
[Intuitive self-models] 7. Hearing Voices, and Other Hallucinations
Steven Byrnes (steve2152) · 2024-10-29T13:36:16.325Z · comments (2)
Review: “The Case Against Reality”
David Gross (David_Gross) · 2024-10-29T13:13:29.643Z · comments (9)
A Poem Is All You Need: Jailbreaking ChatGPT, Meta & More
Sharat Jacob Jacob (sharat-jacob-jacob) · 2024-10-29T12:41:30.337Z · comments (0)
Searching for phenomenal consciousness in LLMs: Perceptual reality monitoring and introspective confidence
EuanMcLean (euanmclean) · 2024-10-29T12:16:18.448Z · comments (7)
AI #87: Staying in Character
Zvi · 2024-10-29T07:10:08.212Z · comments (3)
A path to human autonomy
Nathan Helm-Burger (nathan-helm-burger) · 2024-10-29T03:02:42.475Z · comments (12)
D&D.Sci Coliseum: Arena of Data Evaluation and Ruleset
aphyer · 2024-10-29T01:21:03.075Z · comments (12)
[link] Gwern: Why So Few Matt Levines?
kave · 2024-10-29T01:07:27.564Z · comments (10)
[link] October 2024 Progress in Guaranteed Safe AI
Quinn (quinn-dougherty) · 2024-10-28T23:34:51.689Z · comments (0)
5 homegrown EA projects, seeking small donors
Austin Chen (austin-chen) · 2024-10-28T23:24:25.745Z · comments (4)
How might we solve the alignment problem? (Part 1: Intro, summary, ontology)
Joe Carlsmith (joekc) · 2024-10-28T21:57:12.063Z · comments (5)
Enhancing Mathematical Modeling with LLMs: Goals, Challenges, and Evaluations
ozziegooen · 2024-10-28T21:44:42.352Z · comments (0)
[link] AI & wisdom 3: AI effects on amortised optimisation
L Rudolf L (LRudL) · 2024-10-28T21:08:56.604Z · comments (0)
[link] AI & wisdom 2: growth and amortised optimisation
L Rudolf L (LRudL) · 2024-10-28T21:07:39.449Z · comments (0)
[link] AI & wisdom 1: wisdom, amortised optimisation, and AI
L Rudolf L (LRudL) · 2024-10-28T21:02:51.215Z · comments (0)
[link] Finishing The SB-1047 Documentary In 6 Weeks
Michaël Trazzi (mtrazzi) · 2024-10-28T20:17:47.465Z · comments (5)
[link] Towards the Operationalization of Philosophy & Wisdom
Thane Ruthenis · 2024-10-28T19:45:07.571Z · comments (2)
Quantitative Trading Bootcamp [Nov 6-10]
Ricki Heicklen (bayesshammai) · 2024-10-28T18:39:58.480Z · comments (0)
Winners of the Essay competition on the Automation of Wisdom and Philosophy
AI Impacts (AI Imacts) · 2024-10-28T17:10:04.272Z · comments (3)
[link] Miles Brundage: Finding Ways to Credibly Signal the Benignness of AI Development and Deployment is an Urgent Priority
Zach Stein-Perlman · 2024-10-28T17:00:18.660Z · comments (3)
[question] somebody explain the word "epistemic" to me
KvmanThinking (avery-liu) · 2024-10-28T16:40:24.275Z · answers+comments (8)
~80 Interesting Questions about Foundation Model Agent Safety
RohanS · 2024-10-28T16:37:04.713Z · comments (4)
[link] AI Safety Newsletter #43: White House Issues First National Security Memo on AI Plus, AI and Job Displacement, and AI Takes Over the Nobels
Corin Katzke (corin-katzke) · 2024-10-28T16:03:39.258Z · comments (0)
[link] Death notes - 7 thoughts on death
Nathan Young · 2024-10-28T15:01:13.532Z · comments (1)
SAEs you can See: Applying Sparse Autoencoders to Clustering
Robert_AIZI · 2024-10-28T14:48:16.744Z · comments (0)
Bridging the VLM and mech interp communities for multimodal interpretability
Sonia Joseph (redhat) · 2024-10-28T14:41:41.969Z · comments (5)
[link] How Likely Are Various Precursors of Existential Risk?
NunoSempere (Radamantis) · 2024-10-28T13:27:31.620Z · comments (4)
[link] Care Doesn't Scale
stavros · 2024-10-28T11:57:38.742Z · comments (1)
Your memory eventually drives confidence in each hypothesis to 1 or 0
Crazy philosopher (commissar Yarrick) · 2024-10-28T09:00:27.084Z · comments (6)
San Francisco ACX Meetup “First Saturday”
Nate Sternberg (nate-sternberg) · 2024-10-28T05:05:36.757Z · comments (0)
[link] Nerdtrition: simple diets via spreadsheet abuse
dkl9 · 2024-10-27T21:45:15.117Z · comments (0)
AGI Fermi Paradox
jrincayc (nerd_gatherer) · 2024-10-27T20:14:54.490Z · comments (2)
Substituting Talkbox for Breath Controller
jefftk (jkaufman) · 2024-10-27T19:10:03.768Z · comments (0)
Open Source Replication of Anthropic’s Crosscoder paper for model-diffing
Connor Kissane (ckkissane) · 2024-10-27T18:46:21.316Z · comments (4)
Hiring a writer to co-author with me (Spencer Greenberg for ClearerThinking.org)
spencerg · 2024-10-27T17:34:50.479Z · comments (0)
Interview with Bill O’Rourke - Russian Corruption, Putin, Applied Ethics, and More
JohnGreer · 2024-10-27T17:11:28.891Z · comments (0)
[link] On Shifgrethor
JustisMills · 2024-10-27T15:30:13.688Z · comments (18)