LessWrong 2.0 Reader


[link] AI 2027: What Superintelligence Looks Like
Daniel Kokotajlo (daniel-kokotajlo) · 2025-04-03T16:23:44.619Z · comments (204)

LessWrong has been acquired by EA
habryka (habryka4) · 2025-04-01T13:09:11.153Z · comments (45)

[link] Will Jesus Christ return in an election year?
Eric Neyman (UnexpectedValues) · 2025-03-24T16:50:53.019Z · comments (45)

VDT: a solution to decision theory
L Rudolf L (LRudL) · 2025-04-01T21:04:09.509Z · comments (25)

Policy for LLM Writing on LessWrong
jimrandomh · 2025-03-24T21:41:30.965Z · comments (65)

[link] Recent AI model progress feels mostly like bullshit
lc · 2025-03-24T19:28:43.450Z · comments (79)

[link] Playing in the Creek
Hastings (hastings-greer) · 2025-04-10T17:39:28.883Z · comments (6)

Why Have Sentence Lengths Decreased?
Arjun Panickssery (arjun-panickssery) · 2025-04-03T17:50:29.962Z · comments (67)

[link] Tracing the Thoughts of a Large Language Model
Adam Jermyn (adam-jermyn) · 2025-03-27T17:20:02.162Z · comments (22)

[link] Thoughts on AI 2027
Max Harms (max-harms) · 2025-04-09T21:26:23.926Z · comments (48)

Why Should I Assume CCP AGI is Worse Than USG AGI?
Tomás B. (Bjartur Tómas) · 2025-04-19T14:47:52.167Z · comments (60)

Short Timelines Don't Devalue Long Horizon Research
Vladimir_Nesov · 2025-04-09T00:42:07.324Z · comments (23)

Surprising LLM reasoning failures make me think we still need qualitative breakthroughs for AGI
Kaj_Sotala · 2025-04-15T15:56:19.466Z · comments (48)

Accountability Sinks
Martin Sustrik (sustrik) · 2025-04-22T05:00:02.617Z · comments (5)

[link] Conceptual Rounding Errors
Jan_Kulveit · 2025-03-26T19:00:31.549Z · comments (15)

[link] Frontier AI Models Still Fail at Basic Physical Tasks: A Manufacturing Case Study
Adam Karvonen (karvonenadam) · 2025-04-14T17:38:02.918Z · comments (41)

Alignment Faking Revisited: Improved Classifiers and Open Source Extensions
John Hughes (john-hughes) · 2025-04-08T17:32:55.315Z · comments (19)

OpenAI #12: Battle of the Board Redux
Zvi · 2025-03-31T15:50:02.156Z · comments (1)

Training AGI in Secret would be Unsafe and Unethical
Daniel Kokotajlo (daniel-kokotajlo) · 2025-04-18T12:27:35.795Z · comments (14)

The Pando Problem: Rethinking AI Individuality
Jan_Kulveit · 2025-03-28T21:03:28.374Z · comments (13)

AI-enabled coups: a small group could use AI to seize power
Tom Davidson (tom-davidson-1) · 2025-04-16T16:51:29.561Z · comments (17)

Ctrl-Z: Controlling AI Agents via Resampling
Aryan Bhatt (abhatt349) · 2025-04-16T16:21:23.781Z · comments (0)

Learned pain as a leading cause of chronic pain
SoerenMind · 2025-04-09T11:57:58.523Z · comments (13)

Downstream applications as validation of interpretability progress
Sam Marks (samuel-marks) · 2025-03-31T01:35:02.722Z · comments (3)

Three Months In, Evaluating Three Rationalist Cases for Trump
Arjun Panickssery (arjun-panickssery) · 2025-04-18T08:27:27.257Z · comments (24)

[link] Negative Results for SAEs On Downstream Tasks and Deprioritising SAE Research (GDM Mech Interp Team Progress Update #2)
lewis smith (lsgos) · 2025-03-26T19:07:48.710Z · comments (15)

[link] Explaining British Naval Dominance During the Age of Sail
Arjun Panickssery (arjun-panickssery) · 2025-03-28T05:47:28.561Z · comments (5)

New Cause Area Proposal
CallumMcDougall (TheMcDouglas) · 2025-04-01T07:12:34.360Z · comments (4)

AI 2027: Responses
Zvi · 2025-04-08T12:50:02.197Z · comments (3)

AI 2027 is a Bet Against Amdahl's Law
snewman · 2025-04-21T03:09:40.751Z · comments (44)

Among Us: A Sandbox for Agentic Deception
7vik (satvik-golechha) · 2025-04-05T06:24:49.000Z · comments (4)

Research Notes: Running Claude 3.7, Gemini 2.5 Pro, and o3 on Pokémon Red
Julian Bradshaw · 2025-04-21T03:52:34.759Z · comments (16)

How training-gamers might function (and win)
Vivek Hebbar (Vivek) · 2025-04-11T21:26:18.669Z · comments (5)

Third-wave AI safety needs sociopolitical thinking
Richard_Ngo (ricraz) · 2025-03-27T00:55:30.548Z · comments (23)

Impact, agency, and taste
benkuhn · 2025-04-19T21:10:06.960Z · comments (1)

The Lizardman and the Black Hat Bobcat
Screwtape · 2025-04-06T19:02:01.238Z · comments (13)

How I talk to those above me
Maxwell Peterson (maxwell-peterson) · 2025-03-30T06:54:59.869Z · comments (13)

Show, not tell: GPT-4o is more opinionated in images than in text
Daniel Tan (dtch1997) · 2025-04-02T08:51:02.571Z · comments (41)

How To Believe False Things
Eneasz · 2025-04-02T16:28:29.055Z · comments (10)

[link] ASI existential risk: Reconsidering Alignment as a Goal
habryka (habryka4) · 2025-04-15T19:57:42.547Z · comments (14)

One-shot steering vectors cause emergent misalignment, too
Jacob Dunefsky (jacob-dunefsky) · 2025-04-14T06:40:41.503Z · comments (6)

$500 Bounty Problem: Are (Approximately) Deterministic Natural Latents All You Need?
johnswentworth · 2025-04-21T20:19:30.808Z · comments (4)

A Slow Guide to Confronting Doom
Ruby · 2025-04-06T02:10:56.483Z · comments (20)

Is Gemini now better than Claude at Pokémon?
Julian Bradshaw · 2025-04-19T23:34:43.298Z · comments (12)

Keltham's Lectures in Project Lawful
Morpheus · 2025-04-01T10:39:47.973Z · comments (4)

You will crash your car in front of my house within the next week
Richard Korzekwa (Grothor) · 2025-04-01T21:43:21.472Z · comments (6)

Mistral Large 2 (123B) exhibits alignment faking
Marc Carauleanu (Marc-Everin Carauleanu) · 2025-03-27T15:39:02.176Z · comments (4)

What Makes an AI Startup "Net Positive" for Safety?
jacquesthibs (jacques-thibodeau) · 2025-04-18T20:33:22.682Z · comments (23)

Announcing ILIAD2: ODYSSEY
Alexander Gietelink Oldenziel (alexander-gietelink-oldenziel) · 2025-04-03T17:01:06.004Z · comments (1)

Why does LW not put much more focus on AI governance and outreach?
Severin T. Seehrich (sts) · 2025-04-12T14:24:54.197Z · comments (31)


Recent comments

katalina-hernandez on The EU Is Asking for Feedback on Frontier AI Regulation (Open to Global Experts)—This Post Breaks Down What’s at Stake for AI Safety

OpenAI, Anthropic and Google DeepMind are already the main signatories to these Codes of Practice.

So, whatever is agreed / negotiated is what will impact frontier AI companies. That is the problem.

I'd love to see specific criticisms from you on sections 3, 4 or 5 of this post! I am happy to provide feedback myself based on useful suggestions that come up in this thread.

gum1h0x on gum1h0x's Shortform

Most benchmarks face the inherent problem of Goodhart's law: as soon as they become a target metric, efforts converge on optimizing for the benchmark itself, potentially diverging from the capabilities it was meant to measure.

towards_keeperhood on Introduction to Representing Sentences as Logical Statements

Btw, I think my explanation of why to not have objects for events was not very good. I think I can explain it a bit better now. If you think that would be useful to you, lmk.

artemium on The EU Is Asking for Feedback on Frontier AI Regulation (Open to Global Experts)—This Post Breaks Down What’s at Stake for AI Safety

In an ideal world, global enforcement of AI regulation might make sense. However, in reality, I see little value in EU-specific regulations like these. They are unlikely to impact frontier AI companies such as OpenAI, Anthropic, Google DeepMind, xAI, and DeepSeek, all of which are based outside the EU. These firms might accept the cost of exiting the EU market if regulations become too burdensome.

While the EU market is significant, in a fast-takeoff, winner-takes-all AI race (as outlined in the AI-2027 forecast), market access alone may not sway these companies’ safety policies. Worse, such regulations could backfire, locking the EU out of advanced AI models and crippling its competitiveness. This could deter other nations from adopting similar rules, further isolating the EU.

As an EU citizen, I view the game theory in an "AGI-soon" world as follows:

Alignment Hard
EU imposes strict AI regulations → Frontier companies exit the EU or withhold their latest models, continuing the AI race → Unaligned AI emerges, potentially catastrophic for all, including Europeans. Regulations prove futile.

Alignment Easy
EU imposes strict AI regulations → Frontier companies exit the EU, continuing the AI race → Aligned AI creates a utopia elsewhere (e.g., the US), while the EU lags, stuck in a technological "stone age."

Both scenarios are grim for Europe.

I could be mistaken, but the current US administration and leaders of top AI labs seem fully committed to a cutthroat AGI race, as articulated in situational awareness narratives. They appear prepared to go to extraordinary lengths to maintain supremacy, undeterred by EU demands. Their primary constraints are compute and, soon, energy - not money! If AI becomes a national security priority, access to near-infinite resources could render EU market losses a minor inconvenience. Notably, the comprehensive AI-2027 forecast barely mentions Europe, underscoring its diminishing relevance.

For the EU to remain significant, I see two viable strategies:

Full integration with US AI efforts, securing a guarantee of equal benefits from aligned superintelligence. This could also give EU AI safety labs a seat at the table for alignment discussions.
Develop an autonomous EU AI leader, excelling in capabilities and alignment research to negotiate with the US and China as an equal. This would demand a drastic policy shift, massive investment in data centers and nuclear power, and deregulation, likely unrealistic in the short term.

yonge on D&D.Sci Tax Day: Adventurers and Assessments

The tax is always the same for the same set of monster parts so no randomness is involved.

I then looked for entries where only one type of part was present. With the exception of the heads this gave some obvious formulas:
When only eyes are present no tax is paid
When only heads are present tax is 2.8 for 1, 8.4 for 2, 21 for 3, and 29.4 for 4.
When only skulls are present tax is the number of skulls
When only hands are present the tax is 0.2 times the number of hands.
When only horns are present and their number is < 5 the tax is 1.4*number of horns, and 1.75*number of horns when >= 5 are present.
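
The single-part rules above can be written down as a small lookup. This is only a sketch of the rules as stated: the function name `single_part_tax` and the part labels are hypothetical, and the head-only figures are copied as a table because no formula for them is given. Head counts above 4 are not covered by the quoted figures, so the sketch refuses to guess.

```python
# Head-only taxes as quoted in the comment; counts above 4 are not given there.
HEAD_ONLY_TAX = {1: 2.8, 2: 8.4, 3: 21.0, 4: 29.4}


def single_part_tax(part: str, count: int) -> float:
    """Tax for a haul containing only one type of monster part (hypothetical helper)."""
    if count < 1:
        raise ValueError("count must be at least 1")
    if part == "eye":
        return 0.0  # eyes alone are untaxed
    if part == "head":
        if count not in HEAD_ONLY_TAX:
            raise ValueError("no head-only figure quoted for this count")
        return HEAD_ONLY_TAX[count]
    if part == "skull":
        return float(count)  # 1 per skull
    if part == "hand":
        return 0.2 * count
    if part == "horn":
        # The per-horn rate steps up once there are 5 or more horns.
        rate = 1.4 if count < 5 else 1.75
        return rate * count
    raise ValueError(f"unknown part: {part}")
```

For example, `single_part_tax("horn", 5)` uses the higher 1.75 rate, while `single_part_tax("horn", 4)` still uses 1.4.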

Next I looked for records where only two types of parts were present, but with the following exceptions it didn't give anything obvious:
When only skulls and hands are present the tax rate is #SKULL + 0.2*#HAND
When only horns and hands are present the tax rate is:
1.4*#HORN + 0.4*#HAND provided the total tax bill is less than 6
else when horns < 5 and the total tax is < 18: 2.1*#HORN + 0.6*#HAND
else when horns < 5: 2.8*#HORN + 0.8*#HAND
When horns >= 5 : 1.75*#HORN + 0.5*#HAND

After looking at the data a lot I was then able to find the following formulas when skulls, horns, and hands were all present:
1.4*#HORN + 0.4*#HAND + 2*#SKULL provided the result is < 6
Else 1.75*#HORN + 0.5*#HAND + 2.5*#SKULL provided there are at least 5 horns
Else 2.1*#HORN + 0.6*#HAND + 3*#SKULL provided the result is less than 18
Else 2.8*#HORN + 0.8*#HAND + 4*#SKULL provided the result is less than 40
Else 3.5*#HORN + 1*#HAND + 5*#SKULL
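
The tiers above can be sketched as a chain of checks in Python. Two assumptions are baked in: "the result" in each condition is read as the value of that tier's own formula, and the tiers are tried in the order listed. The name `skull_horn_hand_tax` and its argument order are hypothetical.

```python
def skull_horn_hand_tax(skull: int, horn: int, hand: int) -> float:
    """Tiered tax when skulls, horns, and hands are all present (hypothetical helper)."""

    def tier(horn_rate: float, hand_rate: float, skull_rate: float) -> float:
        # Every tier is linear in the three part counts; only the rates change.
        return horn_rate * horn + hand_rate * hand + skull_rate * skull

    lowest = tier(1.4, 0.4, 2.0)
    if lowest < 6:
        return lowest
    if horn >= 5:
        # Having 5 or more horns selects its own rate, checked before the remaining tiers.
        return tier(1.75, 0.5, 2.5)
    middle = tier(2.1, 0.6, 3.0)
    if middle < 18:
        return middle
    upper = tier(2.8, 0.8, 4.0)
    if upper < 40:
        return upper
    return tier(3.5, 1.0, 5.0)
```

With `skull=0` the first four tiers coincide with the horns-and-hands rules quoted earlier, though the formulas here are only stated for records where all three parts are present.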

Eyes and particularly heads seem to introduce a lot of extra complexity.

The best record I could find with 4 eyes and 4 heads had 4 eyes, 4 heads and 1 hand, so I tried to give these to 1 adventurer, and then allocate the rest amongst the remaining 3 according to these formulas. However the result was worse than the best I could find by looking up the tax for various combinations in the datafile. I will therefore use this as my entry if I can't work out what is going on with the eyes/heads.
Adventurer 1: EYE(1)HEAD(1)SKULL(5)HORN(6)HAND(2)TAX: 23
Adventurer 2: EYE(1)HEAD(1)SKULL(0)HORN(1)HAND(0)TAX: 0
Adventurer 3: EYE(1)HEAD(1)SKULL(0)HORN(0)HAND(3)TAX: 0
Adventurer 4: EYE(1)HEAD(1)SKULL(0)HORN(0)HAND(3)TAX: 0
Total estimated tax is 23

gergogaspar on Why Experienced Professionals Fail to Land High-Impact Roles (FBB #5)

I fixed this at some point in the meantime, thanks for flagging!

aidan-o-gara on aog's Shortform

Yeah I think that’d be reasonable too. You could talk about these clusters at many different levels of granularity, and there are tons I haven’t named.

luck-1 on Moral patienthood of simulated minds allows uncountabe infinity of value on finite hardware

You're correct that this is what happens at one of the abstraction layers. But the choice of that layer is pretty arbitrary. By abstraction layers:

L1: hypervisor interface: uncountably many VMs

L2: hypervisor implementation: countably many VMs

L3: semiconductors: no VMs, only high and low signals

L4: electrons: no high and low signals, only electromagnetic fields

So yes, on L2 the number of VMs is finite. But why should morality count what happens on L2 and not on L1, L3, or L4? This is too arbitrary.

guive on Three Months In, Evaluating Three Rationalist Cases for Trump

I agree with your broader point, but it's actually more than 10,000 people per year.

katalina-hernandez on The EU Is Asking for Feedback on Frontier AI Regulation (Open to Global Experts)—This Post Breaks Down What’s at Stake for AI Safety

It will probably be lengthy but thank you very much for contributing! DM me if you come across any "legal" question about the AI Act :).