LessWrong 2.0 Reader

← previous page (newer posts) · next page (older posts) →

[link] Gradual Disempowerment: Simplified
Annapurna (jorge-velez) · 2025-02-22T16:59:39.072Z · comments (1)
[link] DeepSeek Made it Even Harder for US AI Companies to Ever Reach Profitability
garrison · 2025-02-19T21:02:42.879Z · comments (1)
The GDM AGI Safety+Alignment Team is Hiring for Applied Interpretability Research
Arthur Conmy (arthur-conmy) · 2025-02-24T02:17:12.991Z · comments (0)
[link] Metaculus Q4 AI Benchmarking: Bots Are Closing The Gap
Molly (hickman-santini) · 2025-02-19T22:42:39.055Z · comments (0)
Deep sparse autoencoders yield interpretable features too
Armaan A. Abraham (armaanabraham) · 2025-02-23T05:46:59.189Z · comments (0)
[link] Published report: Pathways to short TAI timelines
Zershaaneh Qureshi (zershaaneh-qureshi) · 2025-02-20T22:10:12.276Z · comments (0)
The case for corporal punishment
Yair Halberstadt (yair-halberstadt) · 2025-02-23T15:05:28.149Z · comments (1)
What makes a theory of intelligence useful?
Cole Wyeth (Amyr) · 2025-02-20T19:22:29.725Z · comments (0)
Call for Applications: XLab Summer Research Fellowship
JoNeedsSleep (joanna-j-1) · 2025-02-18T19:19:20.155Z · comments (0)
[link] The Geometry of Linear Regression versus PCA
criticalpoints · 2025-02-23T21:01:33.415Z · comments (2)
[link] Are SAE features from the Base Model still meaningful to LLaVA?
Shan23Chen (shan-chen) · 2025-02-18T22:16:14.449Z · comments (2)
Talking to laymen about AI development
David Steel · 2025-02-17T18:42:23.289Z · comments (0)
A fable on AI x-risk
bgaesop · 2025-02-18T20:15:24.933Z · comments (2)
[link] Progress links and short notes, 2025-02-17
jasoncrawford · 2025-02-17T19:18:29.422Z · comments (0)
THE ARCHIVE
Jason Reid (jason-reid) · 2025-02-17T01:12:41.486Z · comments (0)
Make Superintelligence Loving
Davey Morse (davey-morse) · 2025-02-21T06:07:17.235Z · comments (9)
[link] The Dilemma’s Dilemma
James Stephen Brown (james-brown) · 2025-02-19T23:50:47.485Z · comments (8)
What new x- or s-risk fieldbuilding organisations would you like to see? An EOI form. (FBB #3)
gergogaspar (gergo-gaspar) · 2025-02-17T12:39:09.196Z · comments (0)
Intelligence Is Jagged
Adam Train (aetrain) · 2025-02-19T07:08:46.444Z · comments (1)
Intelligence as Privilege Escalation
Cole Wyeth (Amyr) · 2025-02-23T19:31:27.604Z · comments (0)
There are a lot of upcoming retreats/conferences between March and July (2025)
gergogaspar (gergo-gaspar) · 2025-02-18T09:30:30.258Z · comments (0)
[link] Neural Scaling Laws Rooted in the Data Distribution
aribrill (Particleman) · 2025-02-20T21:22:10.306Z · comments (0)
Fun, endless art debates v. morally charged art debates that are intrinsically endless
danielechlin · 2025-02-21T04:44:22.712Z · comments (2)
AIS Berlin, events, opportunities and the flipped gameboard - Fieldbuilders Newsletter, February 2025
gergogaspar (gergo-gaspar) · 2025-02-17T14:16:31.834Z · comments (0)
Safe Distillation With a Powerful Untrusted AI
Alek Westover (alek-westover) · 2025-02-20T03:14:04.893Z · comments (1)
[link] Sparse Autoencoder Features for Classifications and Transferability
Shan23Chen (shan-chen) · 2025-02-18T22:14:12.994Z · comments (0)
Closed-ended questions aren't as hard as you think
electroswing · 2025-02-19T03:53:11.855Z · comments (0)
[link] Pre-ASI: The case for an enlightened mind, capital, and AI literacy in maximizing the good life
Noahh (noah-jackson) · 2025-02-21T00:03:47.922Z · comments (5)
[link] Linguistic Imperialism in AI: Enforcing Human-Readable Chain-of-Thought
Lukas Petersson (lukas-petersson-1) · 2025-02-21T15:45:00.146Z · comments (0)
Claude 3.5 Sonnet (New)'s AGI scenario
Nathan Young · 2025-02-17T18:47:04.669Z · comments (2)
[link] AISN #48: Utility Engineering and EnigmaEval
Corin Katzke (corin-katzke) · 2025-02-18T19:15:16.751Z · comments (0)
Permanent properties of things are a self-fulfilling prophecy
YanLyutnev (YanLutnev) · 2025-02-19T00:08:20.776Z · comments (0)
Build a Metaculus Forecasting Bot in 30 Minutes: A Practical Guide
ChristianWilliams · 2025-02-22T03:52:14.753Z · comments (0)
Moral gauge theory: A speculative suggestion for AI alignment
James Diacoumis (james-diacoumis) · 2025-02-23T11:42:31.083Z · comments (2)
[link] Demonstrating specification gaming in reasoning models
Matrice Jacobine · 2025-02-20T19:26:20.563Z · comments (0)
Undesirable Conclusions and Origin Adjustment
Jerdle (daniel-amdurer) · 2025-02-19T18:35:23.732Z · comments (0)
Transformer Dynamics: a neuro-inspired approach to MechInterp
guitchounts · 2025-02-22T21:33:23.855Z · comments (0)
[link] New LLM Scaling Law
wrmedford · 2025-02-19T20:21:17.475Z · comments (0)
[question] Does human (mis)alignment pose a significant and imminent existential threat?
jr · 2025-02-23T10:03:40.269Z · answers+comments (3)
Workshop: Interpretability in LLMs Using Geometric and Statistical Methods
Karthik Viswanathan (vkarthik095) · 2025-02-22T09:39:26.446Z · comments (0)
[link] Modularity and assembly: AI safety via thinking smaller
D Wong (d-nell) · 2025-02-20T00:58:39.714Z · comments (0)
[link] Forecasting Uncontrolled Spread of AI
Alvin Ånestrand (alvin-anestrand) · 2025-02-22T13:05:57.171Z · comments (0)
On Static Space-Like Nature of Intelligence & Superintelligence
ank · 2025-02-22T00:12:36.263Z · comments (0)
Biological humans collectively exert at most 400 gigabits/s of control over the world.
benwr · 2025-02-20T23:44:06.509Z · comments (1)
[question] Why do we have the NATO logo?
KvmanThinking (avery-liu) · 2025-02-19T22:59:41.755Z · answers+comments (4)
AI alignment for mental health supports
hiki_t · 2025-02-24T04:21:42.379Z · comments (0)
An Alternate History of the Future, 2025-2040
Mr Beastly (mr-beastly) · 2025-02-24T05:53:25.521Z · comments (0)
Recursive Cognitive Refinement (RCR): A Self-Correcting Approach for LLM Hallucinations
mxTheo · 2025-02-22T21:32:50.832Z · comments (0)
Poll on AI opinions.
Niclas Kupper (niclas-kupper) · 2025-02-23T22:39:09.027Z · comments (0)
Places of Loving Grace [Story]
ank · 2025-02-18T23:49:18.580Z · comments (0)