LessWrong 2.0 Reader

Crime and Punishment #1
Zvi · 2025-04-21T15:30:06.420Z · comments (10)
Cautions about LLMs in Human Cognitive Loops
Alice Blair (Diatom) · 2025-03-02T19:53:10.253Z · comments (9)
AI #104: American State Capacity on the Brink
Zvi · 2025-02-20T14:50:06.375Z · comments (9)
Notable runaway-optimiser-like LLM failure modes on Biologically and Economically aligned AI safety benchmarks for LLMs with simplified observation format
Roland Pihlakas (roland-pihlakas) · 2025-03-16T23:23:30.989Z · comments (6)
LessOnline 2025: Early Bird Tickets On Sale
Ben Pace (Benito) · 2025-03-18T00:22:02.653Z · comments (4)
They Took MY Job?
Zvi · 2025-03-21T13:30:38.507Z · comments (4)
[link] Three Types of Intelligence Explosion
rosehadshar · 2025-03-17T14:47:46.696Z · comments (8)
[link] Existing Safety Frameworks Imply Unreasonable Confidence
Joe Rogero · 2025-04-10T16:31:50.240Z · comments (2)
We need (a lot) more rogue agent honeypots
Ozyrus · 2025-03-23T22:24:52.785Z · comments (12)
Boots theory and Sybil Ramkin
philh · 2025-03-18T22:10:08.855Z · comments (17)
On Writing #1
Zvi · 2025-03-04T13:30:06.103Z · comments (2)
AI #113: The o3 Era Begins
Zvi · 2025-04-24T13:40:06.043Z · comments (4)
Grok Grok
Zvi · 2025-02-24T14:20:08.877Z · comments (2)
ParaScopes: Do Language Models Plan the Upcoming Paragraph?
NickyP (Nicky) · 2025-02-21T16:50:20.745Z · comments (2)
The Rise of Hyperpalatability
Jack (jack-3) · 2025-04-02T20:18:04.407Z · comments (10)
Worries About AI Are Usually Complements Not Substitutes
Zvi · 2025-04-25T20:00:03.421Z · comments (3)
Announcing EXP: Experimental Summer Workshop on Collective Cognition
Jan_Kulveit · 2025-03-15T20:14:47.972Z · comments (2)
[Cross-post] Every Bay Area "Walled Compound"
davekasten · 2025-01-23T15:05:08.629Z · comments (3)
Extended analogy between humans, corporations, and AIs.
Daniel Kokotajlo (daniel-kokotajlo) · 2025-02-13T00:03:13.956Z · comments (2)
2024 was the year of the big battery, and what that means for solar power
transhumanist_atom_understander · 2025-02-01T06:27:39.082Z · comments (1)
[question] Will LLM agents become the first takeover-capable AGIs?
Seth Herd · 2025-03-02T17:15:37.056Z · answers+comments (10)
Any-Benefit Mindset and Any-Reason Reasoning
silentbob · 2025-03-15T17:10:14.682Z · comments (9)
Meditation and Reduced Sleep Need
niplav · 2025-04-04T14:42:54.792Z · comments (8)
Scaffolding Skills
Screwtape · 2025-04-18T17:39:25.634Z · comments (8)
[link] Forecasting time to automated superhuman coders [AI 2027 Timelines Forecast]
elifland · 2025-04-10T23:10:23.063Z · comments (0)
Call for Collaboration: Renormalization for AI safety 
Lauren Greenspan (LaurenGreenspan) · 2025-03-31T21:01:56.500Z · comments (0)
Operator
Zvi · 2025-01-28T20:00:08.374Z · comments (1)
System 2 Alignment
Seth Herd · 2025-02-13T19:17:56.868Z · comments (0)
[link] SuperBabies podcast with Gene Smith
Eneasz · 2025-02-19T19:36:49.852Z · comments (1)
[link] Introducing MASK: A Benchmark for Measuring Honesty in AI Systems
Richard Ren (RichardR) · 2025-03-05T22:56:46.155Z · comments (5)
Split Personality Training: Revealing Latent Knowledge Through Personality-Shift Tokens
Florian_Dietz · 2025-03-10T16:07:45.215Z · comments (3)
[link] AI Can't Write Good Fiction
JustisMills · 2025-03-12T06:11:57.786Z · comments (19)
[link] Forecasting Frontier Language Model Agent Capabilities
Govind Pimpale (govind-pimpale) · 2025-02-24T16:51:32.022Z · comments (0)
Can SAE steering reveal sandbagging?
jordine · 2025-04-15T12:33:41.264Z · comments (3)
[link] Well-foundedness as an organizing principle of healthy minds and societies
Richard_Ngo (ricraz) · 2025-04-07T00:31:34.098Z · comments (7)
ARENA 5.0 - Call for Applicants
JamesH (AtlasOfCharts) · 2025-01-30T13:18:27.052Z · comments (2)
AI #106: Not so Fast
Zvi · 2025-03-06T15:40:05.919Z · comments (4)
Why Are The Human Sciences Hard? Two New Hypotheses
Aydin Mohseni (aydin-mohseni) · 2025-03-18T15:45:52.239Z · comments (14)
[link] Hunting for AI Hackers: LLM Agent Honeypot
Reworr R (reworr-reworr) · 2025-02-12T20:29:32.269Z · comments (0)
Austin Chen on Winning, Risk-Taking, and FTX
Elizabeth (pktechgirl) · 2025-04-07T19:00:08.039Z · comments (3)
Writing experiments and the banana escape valve
Dmitry Vaintrob (dmitry-vaintrob) · 2025-01-23T13:11:24.215Z · comments (1)
Avoid the Counterargument Collapse
marknm · 2025-03-26T03:19:58.655Z · comments (3)
Everything I Know About Semantics I Learned From Music Notation
J Bostock (Jemist) · 2025-03-09T18:09:11.789Z · comments (2)
Subversion Strategy Eval: Can language models statelessly strategize to subvert control protocols?
Alex Mallen (alex-mallen) · 2025-03-24T17:55:59.358Z · comments (0)
More Fun With GPT-4o Image Generation
Zvi · 2025-04-03T02:10:02.317Z · comments (3)
Reasons-based choice and cluelessness
JesseClifton · 2025-02-07T22:21:47.232Z · comments (0)
FLAKE-Bench: Outsourcing Awkwardness in the Age of AI
annas (annasoli) · 2025-04-01T17:08:25.092Z · comments (0)
OpenAI rewrote its Preparedness Framework
Zach Stein-Perlman · 2025-04-15T20:00:50.614Z · comments (1)
[link] Meta: Frontier AI Framework
Zach Stein-Perlman · 2025-02-03T22:00:17.103Z · comments (2)
DeepSeek: Lemon, It’s Wednesday
Zvi · 2025-01-29T15:00:07.914Z · comments (0)