LessWrong 2.0 Reader

Scaling Sparse Feature Circuit Finding to Gemma 9B
Diego Caples (diego-caples) · 2025-01-10T11:08:11.999Z · comments (4)
Human takeover might be worse than AI takeover
Tom Davidson (tom-davidson-1) · 2025-01-10T16:53:27.043Z · comments (11)
[link] Recommendations for Technical AI Safety Research Directions
Sam Marks (samuel-marks) · 2025-01-10T19:34:04.920Z · comments (1)
MATS mentor selection
DanielFilan · 2025-01-10T03:12:52.141Z · comments (6)
On Dwarkesh Patel’s 4th Podcast With Tyler Cowen
Zvi · 2025-01-10T13:50:05.563Z · comments (0)
[link] NAO Updates, January 2025
jefftk (jkaufman) · 2025-01-10T03:37:36.698Z · comments (0)
Is AI Alignment Enough?
Aram Panasenco (panasenco) · 2025-01-10T18:57:48.409Z · comments (3)
Dmitry's Koan
Dmitry Vaintrob (dmitry-vaintrob) · 2025-01-10T04:27:30.346Z · comments (0)
Beliefs and state of mind into 2025
RussellThor · 2025-01-10T22:07:01.060Z · comments (7)
The Alignment Mapping Program: Forging Independent Thinkers in AI Safety - A Pilot Retrospective
Alvin Ånestrand (alvin-anestrand) · 2025-01-10T16:22:16.905Z · comments (0)
We need a universal definition of 'agency' and related words
CstineSublime · 2025-01-11T03:22:56.623Z · comments (0)
[question] AI for medical care for hard-to-treat diseases?
CronoDAS · 2025-01-10T23:55:39.902Z · answers+comments (0)
[question] What are some scenarios where an aligned AGI actually helps humanity, but many/most people don't like it?
RomanS · 2025-01-10T18:13:11.900Z · answers+comments (2)
[link] AI Forecasting Benchmark: Congratulations to Q4 Winners + Q1 Practice Questions Open
ChristianWilliams · 2025-01-10T03:02:05.856Z · comments (0)
Activation Magnitudes Matter On Their Own: Insights from Language Model Distributional Analysis
Matt Levinson · 2025-01-10T06:53:02.228Z · comments (0)
[question] How do you decide to phrase predictions you ask of others? (and how do you make your own?)
CstineSublime · 2025-01-10T02:44:26.737Z · answers+comments (0)
Have frontier AI systems surpassed the self-replicating red line?
nsage (wheelspawn) · 2025-01-11T05:31:31.672Z · comments (0)
[question] Is Musk still net-positive for humanity?
mikbp · 2025-01-10T09:34:42.630Z · answers+comments (10)
Deleted
Yanling Guo (yanling-guo) · 2025-01-10T01:36:47.950Z · comments (0)