LessWrong 2.0 Reader


[question] Should I fundraise for open source search engine?
samuelshadrach (xpostah) · 2025-03-23T13:04:16.149Z · answers+comments (2)
[link] Privateers Reborn: Cyber Letters of Marque
arealsociety (shane-zabel) · 2025-03-23T03:39:25.990Z · comments (2)
Beware nerfing AI with opinionated human-centric sensors
Haotian (haotian-huang) · 2025-03-23T01:09:16.770Z · comments (0)
Reframing AI Safety as a Neverending Institutional Challenge
scasper · 2025-03-23T00:13:48.614Z · comments (12)
The Dangerous Illusion of AI Deterrence: Why MAIM Isn’t Rational
mc1soft · 2025-03-22T22:55:02.355Z · comments (0)
Dayton, Ohio, ACX Meetup
Lunawarrior · 2025-03-22T19:45:55.510Z · comments (0)
[Replication] Crosscoder-based Stage-Wise Model Diffing
annas (annasoli) · 2025-03-22T18:35:19.003Z · comments (0)
The Principle of Satisfying Foreknowledge
Randall Reams (randall-reams) · 2025-03-22T18:20:27.998Z · comments (0)
[question] Urgency in the ITN framework
Shaïman · 2025-03-22T18:16:07.900Z · answers+comments (2)
Transhumanism and AI: Toward Prosperity or Extinction?
Shaïman · 2025-03-22T18:16:07.868Z · comments (2)
Tied Crosscoders: Explaining Chat Behavior from Base Model
Santiago Aranguri (aranguri) · 2025-03-22T18:07:21.751Z · comments (0)
Dusty Hands and Geo-arbitrage
Tomás B. (Bjartur Tómas) · 2025-03-22T16:05:30.364Z · comments (3)
100+ concrete projects and open problems in evals
Marius Hobbhahn (marius-hobbhahn) · 2025-03-22T15:21:40.970Z · comments (1)
Do models say what they learn?
Andy Arditi (andy-arditi) · 2025-03-22T15:19:18.800Z · comments (12)
AGI Morality and Why It Is Unlikely to Emerge as a Feature of Superintelligence
funnyfranco · 2025-03-22T12:06:55.723Z · comments (9)
2025 Q3 Pivotal Research Fellowship: Applications Open
Tobias H (clearthis) · 2025-03-22T10:54:35.492Z · comments (0)
[link] Good Research Takes are Not Sufficient for Good Strategic Takes
Neel Nanda (neel-nanda-1) · 2025-03-22T10:13:38.257Z · comments (28)
Grammatical Roles and Social Roles: A Structural Analogy
Lucien (lucien) · 2025-03-22T07:44:07.383Z · comments (0)
Legibility
lsusr · 2025-03-22T06:54:35.259Z · comments (22)
Why Were We Wrong About China and AI? A Case Study in Failed Rationality
thedudeabides · 2025-03-22T05:13:52.181Z · comments (38)
A Short Diatribe on Hidden Assertions.
Eggs (donald-sampson) · 2025-03-22T03:14:37.577Z · comments (2)
Transformer Attention’s High School Math Mistake
Max Ma (max-ma) · 2025-03-22T00:16:34.082Z · comments (1)
[link] Making Sense of President Trump’s Annexation Obsession
Annapurna (jorge-velez) · 2025-03-21T21:10:30.872Z · comments (3)
How I force LLMs to generate correct code
claudio · 2025-03-21T14:40:19.211Z · comments (7)
Prospects for Alignment Automation: Interpretability Case Study
Jacob Pfau (jacob-pfau) · 2025-03-21T14:05:51.528Z · comments (4)
[link] Epoch AI released a GATE Scenario Explorer
Lee.aao (leonid-artamonov) · 2025-03-21T13:57:09.172Z · comments (0)
They Took MY Job?
Zvi · 2025-03-21T13:30:38.507Z · comments (4)
Silly Time
jefftk (jkaufman) · 2025-03-21T12:30:08.560Z · comments (2)
[link] Towards a scale-free theory of intelligent agency
Richard_Ngo (ricraz) · 2025-03-21T01:39:42.251Z · comments (24)
[question] Any mistakes in my understanding of Transformers?
Kallistos · 2025-03-21T00:34:30.667Z · answers+comments (7)
[link] A Critique of “Utility”
Zero Contradictions · 2025-03-20T23:21:41.900Z · comments (10)
Intention to Treat
Alicorn · 2025-03-20T20:01:19.456Z · comments (4)
[link] Anthropic: Progress from our Frontier Red Team
UnofficialLinkpostBot (LinkpostBot) · 2025-03-20T19:12:02.151Z · comments (3)
Everything's An Emergency
omnizoid · 2025-03-20T17:12:23.006Z · comments (0)
Non-Consensual Consent: The Performance of Choice in a Coercive World
Alex_Steiner · 2025-03-20T17:12:16.302Z · comments (4)
Minor interpretability exploration #4: LayerNorm and the learning coefficient
Rareș Baron · 2025-03-20T16:18:04.801Z · comments (0)
[question] How far along Metr's law can AI start automating or helping with alignment research?
Christopher King (christopher-king) · 2025-03-20T15:58:08.369Z · answers+comments (21)
Human alignment
Lucien (lucien) · 2025-03-20T15:52:22.081Z · comments (2)
[question] Seeking: more Sci Fi micro reviews
Yair Halberstadt (yair-halberstadt) · 2025-03-20T14:31:37.597Z · answers+comments (0)
AI #108: Straight Line on a Graph
Zvi · 2025-03-20T13:50:00.983Z · comments (5)
[link] What is an alignment tax?
Vishakha (vishakha-agrawal) · 2025-03-20T13:06:58.087Z · comments (0)
Longtermist Implications of the Existence Neutrality Hypothesis
Maxime Riché (maxime-riche) · 2025-03-20T12:20:40.661Z · comments (2)
You don't have to be "into EA" to attend EAG(x) Conferences
gergogaspar (gergo-gaspar) · 2025-03-20T10:44:12.271Z · comments (0)
Defense Against The Super-Worms
viemccoy · 2025-03-20T07:24:56.975Z · comments (1)
Socially Graceful Degradation
Screwtape · 2025-03-20T04:03:41.213Z · comments (9)
Daniel Dennett, the Unity of Consciousness, and Animal Minds
stormykat (sandypawbs) · 2025-03-20T03:43:38.108Z · comments (0)
Apply to MATS 8.0!
Ryan Kidd (ryankidd44) · 2025-03-20T02:17:58.018Z · comments (4)
Improved visualizations of METR Time Horizons paper.
LDJ (luigi-d) · 2025-03-19T23:36:52.771Z · comments (4)
Is CCP authoritarianism good for building safe AI?
Hruss (henry-russell) · 2025-03-19T23:13:36.397Z · comments (0)
The case against "The case against AI alignment"
KvmanThinking (avery-liu) · 2025-03-19T22:40:33.812Z · comments (0)