LessWrong 2.0 Reader

[question] Superintelligence Strategy: A Pragmatic Path to… Doom?
Mr Beastly (mr-beastly) · 2025-03-19T22:30:50.796Z · answers+comments (0)
SHIFT relies on token-level features to de-bias Bias in Bios probes
Tim Hua · 2025-03-19T21:29:15.974Z · comments (2)
Janet must die
Shmi (shminux) · 2025-03-19T20:35:09.768Z · comments (3)
[question] Why am I getting downvoted on Lesswrong?
Oxidize · 2025-03-19T18:32:47.243Z · answers+comments (14)
[link] Forecasting AI Futures Resource Hub
Alvin Ånestrand (alvin-anestrand) · 2025-03-19T17:26:28.059Z · comments (0)
[link] TBC episode w Dave Kasten from Control AI on AI Policy
Eneasz · 2025-03-19T17:09:50.841Z · comments (0)
Prioritizing threats for AI control
ryan_greenblatt · 2025-03-19T17:09:45.044Z · comments (2)
The Illusion of Transparency as a Trust-Building Mechanism
Priyanka Bharadwaj (priyanka-bharadwaj) · 2025-03-19T17:09:05.830Z · comments (0)
How Do We Govern AI Well?
kaime (khalid-ali) · 2025-03-19T17:08:49.601Z · comments (0)
[link] METR: Measuring AI Ability to Complete Long Tasks
Zach Stein-Perlman · 2025-03-19T16:00:54.874Z · comments (104)
Why I think AI will go poorly for humanity
Alek Westover (alek-westover) · 2025-03-19T15:52:18.373Z · comments (0)
The principle of genomic liberty
TsviBT · 2025-03-19T14:27:57.175Z · comments (51)
Going Nova
Zvi · 2025-03-19T13:30:01.293Z · comments (14)
Equations Mean Things
abstractapplic · 2025-03-19T08:16:35.312Z · comments (10)
[link] Elite Coordination via the Consensus of Power
Richard_Ngo (ricraz) · 2025-03-19T06:56:44.825Z · comments (15)
What I am working on right now and why: representation engineering edition
Lukasz G Bartoszcze (lukasz-g-bartoszcze) · 2025-03-18T22:37:45.363Z · comments (0)
Boots theory and Sybil Ramkin
philh · 2025-03-18T22:10:08.855Z · comments (17)
[link] Schmidt Sciences Technical AI Safety RFP on Inference-Time Compute – Deadline: April 30
Ryan Gajarawala (ryan-gajarawala) · 2025-03-18T18:05:34.757Z · comments (0)
PRISM: Perspective Reasoning for Integrated Synthesis and Mediation (Interactive Demo)
Anthony Diamond (anthony-diamond) · 2025-03-18T18:03:26.804Z · comments (0)
Subspace Rerouting: Using Mechanistic Interpretability to Craft Adversarial Attacks against Large Language Models
Le magicien quantique · 2025-03-18T17:55:07.016Z · comments (1)
[link] Progress links and short notes, 2025-03-18
jasoncrawford · 2025-03-18T17:14:35.365Z · comments (0)
The Convergent Path to the Stars
Maxime Riché (maxime-riche) · 2025-03-18T17:09:37.046Z · comments (0)
[link] Sapir-Whorf Ego Death
Jonathan Moregård (JonathanMoregard) · 2025-03-18T16:57:21.437Z · comments (7)
[link] Smelling Nice is Good, Actually
Gordon Seidoh Worley (gworley) · 2025-03-18T16:54:43.324Z · comments (8)
[link] A Taxonomy of Jobs Deeply Resistant to TAI Automation
Deric Cheng (deric-cheng) · 2025-03-18T16:25:55.562Z · comments (0)
Why Are The Human Sciences Hard? Two New Hypotheses
Aydin Mohseni (aydin-mohseni) · 2025-03-18T15:45:52.239Z · comments (14)
Go home GPT-4o, you’re drunk: emergent misalignment as lowered inhibitions
Stuart_Armstrong · 2025-03-18T14:48:54.762Z · comments (12)
[question] What is the theory of change behind writing papers about AI safety?
Kajus · 2025-03-18T12:51:31.405Z · answers+comments (1)
OpenAI #11: America Action Plan
Zvi · 2025-03-18T12:50:03.880Z · comments (3)
I changed my mind about orca intelligence
Towards_Keeperhood (Simon Skade) · 2025-03-18T10:15:29.860Z · comments (24)
[question] Is Peano arithmetic trying to kill us? Do we care?
Q Home · 2025-03-18T08:22:27.761Z · answers+comments (2)
Do What the Mammals Do
CrimsonChin · 2025-03-18T03:57:56.083Z · comments (6)
What Actually Matters Until We Reach the Singularity
Lexius (Convalexius) · 2025-03-18T02:17:16.144Z · comments (0)
Meaning as a cognitive substitute for survival instincts: A thought experiment
Ovidijus Šimkus (ovidijus-simkus) · 2025-03-18T01:53:52.411Z · comments (0)
Against Yudkowsky's evolution analogy for AI x-risk [unfinished]
Fiora Sunshine (Fiora from Rosebloom) · 2025-03-18T01:41:06.453Z · comments (18)
An "AI researcher" has written a paper on optimizing AI architecture and optimized a language model to several orders of magnitude more efficiency.
Y B (y-b) · 2025-03-18T01:15:34.589Z · comments (1)
LessOnline 2025: Early Bird Tickets On Sale
Ben Pace (Benito) · 2025-03-18T00:22:02.653Z · comments (4)
Feedback loops for exercise (VO2Max)
Elizabeth (pktechgirl) · 2025-03-18T00:10:06.827Z · comments (9)
FrontierMath Score of o3-mini Much Lower Than Claimed
YafahEdelman (yafah-edelman-1) · 2025-03-17T22:41:06.527Z · comments (7)
Proof-of-Concept Debugger for a Small LLM
Peter Lai (peter-lai) · 2025-03-17T22:27:52.386Z · comments (0)
Effectively Communicating with DC Policymakers
PolicyTakes · 2025-03-17T22:11:56.197Z · comments (0)
[link] Mind the Gap
Bridgett Kay (bridgett-kay) · 2025-03-17T21:59:35.113Z · comments (0)
EIS XV: A New Proof of Concept for Useful Interpretability
scasper · 2025-03-17T20:05:30.580Z · comments (2)
[link] Sentinel's Global Risks Weekly Roundup #11/2025. Trump invokes Alien Enemies Act, Chinese invasion barges deployed in exercise.
NunoSempere (Radamantis) · 2025-03-17T19:34:01.850Z · comments (3)
Claude Sonnet 3.7 (often) knows when it’s in alignment evaluations
Nicholas Goldowsky-Dill (nicholas-goldowsky-dill) · 2025-03-17T19:11:00.813Z · comments (7)
Things Look Bleak for White-Collar Jobs Due to AI Acceleration
Declan Molony (declan-molony) · 2025-03-17T17:03:35.585Z · comments (0)
[link] Three Types of Intelligence Explosion
rosehadshar · 2025-03-17T14:47:46.696Z · comments (8)
An Advent of Thought
Kaarel (kh) · 2025-03-17T14:21:08.765Z · comments (8)
Interested in working from a new Boston AI Safety Hub?
agucova · 2025-03-17T13:42:19.509Z · comments (0)
Other Civilizations Would Recover 84+% of Our Cosmic Resources - A Challenge to Extinction Risk Prioritization
Maxime Riché (maxime-riche) · 2025-03-17T13:12:09.770Z · comments (0)