LessWrong 2.0 Reader

Can LLMs learn Steganographic Reasoning via RL?
robert mccarthy (robert-mccarthy) · 2025-04-11T16:33:03.378Z · comments (0)

Understanding Trust - Overview Presentations
abramdemski · 2025-04-16T18:05:39.792Z · comments (0)

Anti-memes: x-risk edition
WillPetillo · 2025-04-10T23:35:30.756Z · comments (0)

[link] Taxonomy of possibility
dkl9 · 2025-04-09T04:24:09.439Z · comments (1)

Quarter Inch Cables are Devious
jefftk (jkaufman) · 2025-04-05T02:40:05.054Z · comments (4)

Can I learn language faster? Or, perhaps, can I memorize the foreign words and recall them faster?
jmh · 2025-04-11T00:01:25.530Z · comments (6)

Crash scenario 1: Rapidly mobilise for a 2025 AI crash
Remmelt (remmelt-ellen) · 2025-04-11T06:54:47.974Z · comments (4)

What alignment-relevant abilities might Terence Tao lack?
Towards_Keeperhood (Simon Skade) · 2025-04-07T19:44:18.620Z · comments (2)

Kamelo: A Rule-Based Constructed Language for Universal, Logical Communication
Saif Khan (saif-khan) · 2025-04-16T18:44:00.139Z · comments (7)

Calling Bullshit - the Cheatsheet
Niklas Lehmann · 2025-04-12T11:43:23.822Z · comments (3)

[link] METR’s preliminary evaluation of o3 and o4-mini
Christopher King (christopher-king) · 2025-04-16T20:23:00.285Z · comments (2)

[link] The Russell Conjugation Illuminator
TimmyM (timmym) · 2025-04-17T19:33:06.924Z · comments (1)

[link] Should AIs be Encouraged to Cooperate?
PeterMcCluskey · 2025-04-15T21:57:06.096Z · comments (2)

Moonlight Reflected
Jacob Falkovich (Jacobian) · 2025-04-07T15:35:11.708Z · comments (0)

The world according to ChatGPT
Richard_Kennaway · 2025-04-07T13:44:43.781Z · comments (0)

Theories of Impact for Causality in AI Safety
alexisbellot (alexis-1) · 2025-04-11T20:16:37.571Z · comments (1)

What does Yann LeCun think about AGI? A summary of his talk, "Mathematical Obstacles on the Way to Human-Level AI"
Adam Jones (domdomegg) · 2025-04-05T12:21:25.024Z · comments (0)

A Talmudic Rationalist Cautionary Tale
Noah Birnbaum (daniel-birnbaum) · 2025-04-15T04:11:16.972Z · comments (1)

[question] How likely are the USA to decay and how will it influence the AI development?
StanislavKrym · 2025-04-12T04:42:27.604Z · answers+comments (0)

[link] Can LLM-based models do model-based planning?
jylin04 · 2025-04-16T12:38:00.793Z · comments (1)

Host Keys and SSHing to EC2
jefftk (jkaufman) · 2025-04-17T15:10:29.139Z · comments (2)

[link] Announcing Progress Conference 2025
jasoncrawford · 2025-04-17T17:12:44.191Z · comments (0)

What are good safety standards for open source AIs from China?
ChristianKl · 2025-04-12T13:06:16.663Z · comments (2)

[link] Telescoping
za3k (lispalien) · 2025-04-16T17:05:52.392Z · comments (1)

Misinformation is the default, and information is the government telling you your tap water is safe to drink
danielechlin · 2025-04-07T22:28:18.158Z · comments (2)

Coupling for Decouplers — Intro
Jacob Falkovich (Jacobian) · 2025-04-07T15:12:26.892Z · comments (0)

The Mirror Problem in AI: Why Language Models Say Whatever You Want
RobT · 2025-04-15T18:40:02.793Z · comments (2)

[link] Grounded Ghosts in the Machine - Friston Blankets, Mirror Neurons, and the Quest for Cooperative AI
Davidmanheim · 2025-04-10T10:15:54.880Z · comments (0)

Risers for Foot Percussion
jefftk (jkaufman) · 2025-04-15T11:10:08.577Z · comments (0)

What empirical research directions has Eliezer commented positively on?
Chris_Leong · 2025-04-15T08:53:41.677Z · comments (1)

Nuanced Models for the Influence of Information
ozziegooen · 2025-04-10T18:28:34.082Z · comments (0)

[link] Human-level is not the limit
Vishakha (vishakha-agrawal) · 2025-04-16T08:33:15.498Z · comments (2)

[link] Paper Highlights, March '25
gasteigerjo · 2025-04-07T20:17:42.944Z · comments (0)

MATS is hiring!
Ryan Kidd (ryankidd44) · 2025-04-08T20:45:15.280Z · comments (0)

Linkpost to a Summary of "Imagining and building wise machines: The centrality of AI metacognition" by Johnson, Karimi, Bengio, et al.
Chris_Leong · 2025-04-10T11:54:37.484Z · comments (0)

[Research sprint] Single-model crosscoder feature ablation and steering
Thomas Read (thjread) · 2025-04-06T14:42:30.357Z · comments (0)

Breaking down the MEAT of Alignment
JasonBrown · 2025-04-07T08:47:22.080Z · comments (2)

Commitment Races are a technical problem ASI can easily solve
Knight Lee (Max Lee) · 2025-04-12T22:22:47.790Z · comments (6)

An Optimistic 2027 Timeline
Yitz (yitz) · 2025-04-06T16:39:36.554Z · comments (13)

Mass Exposure Paradox
max-sixty · 2025-04-16T20:18:00.492Z · comments (0)

I Have No Mouth but I Must Speak
Jack (jack-3) · 2025-04-05T07:42:54.424Z · comments (8)

[link] EA Reflections on my Military Career
TomGardiner (HorusXVI) · 2025-04-10T19:01:42.844Z · comments (0)

The Three Boxes: A Simple Model for Spreading Ideas
JohnGreer · 2025-04-10T17:15:55.163Z · comments (0)

Some OthelloGPT Circuits
Alfred Wong (alfred-wong) · 2025-04-15T18:41:36.216Z · comments (0)

[link] AISN #51: AI Frontiers
Corin Katzke (corin-katzke) · 2025-04-15T16:01:56.701Z · comments (1)

You Are Not a Thought Experiment
Jacob Falkovich (Jacobian) · 2025-04-07T15:27:42.956Z · comments (0)

Arguing all sides with ChatGPT 4.5
Richard_Kennaway · 2025-04-07T13:10:11.562Z · comments (0)

$500 bounty for best short-form fiction about our near future world; $100 for recommending winning piece: new “Art of Near Future World” quarterly art project
Ramon Gonzalez (ramon-gonzalez) · 2025-04-15T00:46:10.637Z · comments (0)

[link] Distributed whistleblowing
samuelshadrach (xpostah) · 2025-04-12T06:36:05.952Z · comments (5)

Gamify life from BayesianMind
P. João (gabriel-brito) · 2025-04-16T16:17:49.284Z · comments (1)

Recent comments

viliam on 8 PRIME SKILLS - A simplified construction from MaxEnt Informational Efficiency in 4 questions

That's too abstract; I have no idea what it is supposed to mean or how it is supposed to be used.

niplav on shortplav

Hm, good point. I'll amend the previous post.

viliam on shortplav

When I think about a good business idea, but end up doing nothing, I often later find out that someone else did it.

viliam on Kamelo: A Rule-Based Constructed Language for Universal, Logical Communication

How would a language like this survive a change in ontology? You take a category and split it into 5 subcategories. What if two years later you find out that a sixth subcategory exists?

If you update the language, you would have to rewrite all existing texts. The problem would not be that they contain archaic words -- it would be that all the words are still used, but now they mean something different.

Seemingly similar words (for example, one formed by prepending a single syllable to a long word or a sentence) will have wildly different meanings.

shankar-sivarajan on AI-enabled coups: a small group could use AI to seize power

It's called "defensive democracy," and is standard practice in most of Europe.

jiro on A Dissent on Honesty

Advocating for more lying seems like especially bad advice to give to people with poor social skills, because they lack the skills to detect if they’re succeeding at learning how to lie or if they’re just burning what little social capital they have for no gain.

I think the advice works better as "if it's a social situation, and the situation calls for what you consider to be a lie, don't let that stop you." You do not have to tell someone that you're not feeling fine when they ask how you're doing. You do not need to tell them that actually the color they painted their house in is really ugly. And you certainly shouldn't go to a job interview, get asked for your biggest weakness, and actually state your biggest weakness.

If someone reads the advice and thinks "Lying, that's an idea! I'll use it every time I can" they've overcorrected by far too much.

viliam on Doing Prioritization Better

I think this article would be much better with many specific examples. (If that would make it too long, just split it into a series of articles.)

alphaandomega on The Russell Conjugation Illuminator

The 2k-character input limit is rather restrictive, albeit understandable. Giving these instructions to an existing LLM (I used Gemini 2.5 Pro) gives longer, better results without the need for a dedicated tool.

viliam on Gamify life from BayesianMind

I agree. Any punishment in a system has the side effect of punishing you for using the system.

The second suggestion is an interesting one. It would probably work better if you had an AI watching you constantly and summarizing your daily activities. If doing some seemingly unimportant X predictably makes you more likely to do some desirable Y later, you want to know about it. But if you write your diary manually, there is a chance that you won't notice X, or won't consider it important enough to mention.

hleumas on The Bell Curve of Bad Behavior

"I wonder if this lurch happens at the two meter mark in countries that use the metric system?"

No way. First, we use centimeters, so it's 195 cm, not 1.95 m.

Second, 2 m is crazy high. You pity people over 2 m for their terrible life in a society that is not accustomed to that height; you don’t envy them.