LessWrong 2.0 Reader

[link] Concrete benefits of making predictions
Jonny Spicer (jonnyspicer) · 2024-10-17T14:23:17.613Z · comments (5)
If all trade is voluntary, then what is "exploitation?"
Darmani · 2024-12-27T11:21:30.036Z · comments (61)
Context-dependent consequentialism
Jeremy Gillen (jeremy-gillen) · 2024-11-04T09:29:24.310Z · comments (6)
People aren't properly calibrated on FrontierMath
cakubilo · 2024-12-23T19:35:44.467Z · comments (4)
Balancing Label Quantity and Quality for Scalable Elicitation
Alex Mallen (alex-mallen) · 2024-10-24T16:49:00.939Z · comments (1)
Two Weeks Without Sweets
jefftk (jkaufman) · 2024-12-31T03:30:02.003Z · comments (0)
Incentive design and capability elicitation
Joe Carlsmith (joekc) · 2024-11-12T20:56:05.088Z · comments (0)
1. Meet the Players: Value Diversity
Allison Duettmann (allison-duettmann) · 2025-01-02T19:00:52.696Z · comments (2)
[link] A progress policy agenda
jasoncrawford · 2024-12-19T18:42:37.327Z · comments (1)
Call for evaluators: Participate in the European AI Office workshop on general-purpose AI models and systemic risks
Tom DAVID (tom-david) · 2024-11-27T02:54:16.263Z · comments (0)
Why Aligning an LLM is Hard, and How to Make it Easier
RogerDearnaley (roger-d-1) · 2025-01-23T06:44:04.048Z · comments (2)
Quantum without complication
Optimization Process · 2025-01-16T08:53:11.347Z · comments (2)
[link] What I expected from this site: A LessWrong review
Nathan Young · 2024-12-20T11:27:39.683Z · comments (5)
Theory of Change for AI Safety Camp
Linda Linsefors · 2025-01-22T22:07:10.664Z · comments (3)
Mini Go: Gateway Game
jefftk (jkaufman) · 2025-01-14T03:30:02.020Z · comments (1)
[link] Safety tax functions
owencb · 2024-10-20T14:08:38.099Z · comments (0)
AI Safety Seed Funding Network - Join as a Donor or Investor
Alexandra Bos (AlexandraB) · 2024-12-16T19:30:43.812Z · comments (0)
Extending control evaluations to non-scheming threats
joshc (joshua-clymer) · 2025-01-12T01:42:54.614Z · comments (1)
Compositionality and Ambiguity: Latent Co-occurrence and Interpretable Subspaces
Matthew A. Clarke (Antigone) · 2024-12-20T15:16:51.857Z · comments (0)
Aligning AI Safety Projects with a Republican Administration
Deric Cheng (deric-cheng) · 2024-11-21T22:12:27.502Z · comments (1)
Renormalization Redux: QFT Techniques for AI Interpretability
Lauren Greenspan (LaurenGreenspan) · 2025-01-18T03:54:28.652Z · comments (12)
[link] Our new video about goal misgeneralization, plus an apology
Writer · 2025-01-14T14:07:21.648Z · comments (0)
You can validly be seen and validated by a chatbot
Kaj_Sotala · 2024-12-20T12:00:03.015Z · comments (3)
[question] Why are there no interesting (1D, 2-state) quantum cellular automata?
Optimization Process · 2024-11-26T00:11:37.833Z · answers+comments (13)
[Cross-post] Every Bay Area "Walled Compound"
davekasten · 2025-01-23T15:05:08.629Z · comments (3)
Agents don't have to be aligned to help us achieve an indefinite pause.
Hastings (hastings-greer) · 2025-01-25T18:51:03.523Z · comments (0)
The new ruling philosophy regarding AI
Mitchell_Porter · 2024-11-11T13:28:24.476Z · comments (0)
[link] AI & wisdom 1: wisdom, amortised optimisation, and AI
L Rudolf L (LRudL) · 2024-10-28T21:02:51.215Z · comments (0)
Per Tribalismum ad Astra
Martin Sustrik (sustrik) · 2025-01-19T06:50:07.763Z · comments (5)
Gratitudes: Rational Thanks Giving
Seth Herd · 2024-11-29T03:09:47.410Z · comments (2)
Disagreement on AGI Suggests It’s Near
tangerine · 2025-01-07T20:42:43.456Z · comments (15)
Acknowledging Background Information with P(Q|I)
JenniferRM · 2024-12-24T18:50:25.323Z · comments (8)
[link] Why Recursion Pharmaceuticals abandoned cell painting for brightfield imaging
Abhishaike Mahajan (abhishaike-mahajan) · 2024-11-05T14:51:41.310Z · comments (1)
NYC Congestion Pricing: Early Days
Zvi · 2025-01-14T14:00:07.445Z · comments (0)
Distinguishing ways AI can be "concentrated"
Matthew Barnett (matthew-barnett) · 2024-10-21T22:21:13.666Z · comments (2)
[link] Arithmetic Models: Better Than You Think
kqr · 2024-10-26T09:42:07.185Z · comments (4)
Two flavors of computational functionalism
EuanMcLean (euanmclean) · 2024-11-25T10:47:04.584Z · comments (9)
Option control
Joe Carlsmith (joekc) · 2024-11-04T17:54:03.073Z · comments (0)
[question] Which things were you surprised to learn are metaphors?
Gordon Seidoh Worley (gworley) · 2024-11-22T03:46:02.845Z · answers+comments (18)
Corrigibility's Desirability is Timing-Sensitive
RobertM (T3t) · 2024-12-26T22:24:17.435Z · comments (4)
Is AI Alignment Enough?
Aram Panasenco (panasenco) · 2025-01-10T18:57:48.409Z · comments (6)
[link] Our Digital and Biological Children
Eneasz · 2024-10-24T18:36:38.719Z · comments (0)
Trading Candy
jefftk (jkaufman) · 2024-11-01T01:10:08.024Z · comments (4)
First Solo Bus Ride
jefftk (jkaufman) · 2024-12-03T12:20:02.344Z · comments (1)
Concrete Methods for Heuristic Estimation on Neural Networks
Oliver Daniels (oliver-daniels-koch) · 2024-11-14T05:07:55.240Z · comments (0)
[link] Impact in AI Safety Now Requires Specific Strategic Insight
MiloSal (milosal) · 2024-12-29T00:40:53.780Z · comments (1)
There aren't enough smart people in biology doing something boring
Abhishaike Mahajan (abhishaike-mahajan) · 2024-10-21T15:52:04.482Z · comments (13)
[link] Human-AI Complementarity: A Goal for Amplified Oversight
rishubjain · 2024-12-24T09:57:55.111Z · comments (3)
the Daydication technique
chaosmage · 2024-10-18T21:47:46.448Z · comments (0)
Standard SAEs Might Be Incoherent: A Choosing Problem & A “Concise” Solution
Kola Ayonrinde (kola-ayonrinde) · 2024-10-30T22:50:45.642Z · comments (0)