LessWrong 2.0 Reader

[link] AI 2027: What Superintelligence Looks Like
Daniel Kokotajlo (daniel-kokotajlo) · 2025-04-03T16:23:44.619Z · comments (208)

How to Make Superbabies
GeneSmith · 2025-02-19T20:39:38.971Z · comments (332)

[link] How AI Takeover Might Happen in 2 Years
joshc (joshua-clymer) · 2025-02-07T17:10:10.530Z · comments (137)

A Bear Case: My Predictions Regarding AI Progress
Thane Ruthenis · 2025-03-05T16:41:37.639Z · comments (155)

LessWrong has been acquired by EA
habryka (habryka4) · 2025-04-01T13:09:11.153Z · comments (45)

Emergent Misalignment: Narrow finetuning can produce broadly misaligned LLMs
Jan Betley (jan-betley) · 2025-02-25T17:39:31.059Z · comments (91)

[link] Will Jesus Christ return in an election year?
Eric Neyman (UnexpectedValues) · 2025-03-24T16:50:53.019Z · comments (45)

VDT: a solution to decision theory
L Rudolf L (LRudL) · 2025-04-01T21:04:09.509Z · comments (26)

Policy for LLM Writing on LessWrong
jimrandomh · 2025-03-24T21:41:30.965Z · comments (65)

[link] Recent AI model progress feels mostly like bullshit
lc · 2025-03-24T19:28:43.450Z · comments (79)

[link] Playing in the Creek
Hastings (hastings-greer) · 2025-04-10T17:39:28.883Z · comments (6)

Murder plots are infohazards
Chris Monteiro (chris-topher) · 2025-02-13T19:15:09.749Z · comments (44)

[link] Good Research Takes are Not Sufficient for Good Strategic Takes
Neel Nanda (neel-nanda-1) · 2025-03-22T10:13:38.257Z · comments (28)

So You Want To Make Marginal Progress...
johnswentworth · 2025-02-07T23:22:19.825Z · comments (42)

Arbital has been imported to LessWrong
RobertM (T3t) · 2025-02-20T00:47:33.983Z · comments (30)

Why Have Sentence Lengths Decreased?
Arjun Panickssery (arjun-panickssery) · 2025-04-03T17:50:29.962Z · comments (78)

[link] METR: Measuring AI Ability to Complete Long Tasks
Zach Stein-Perlman · 2025-03-19T16:00:54.874Z · comments (104)

Accountability Sinks
Martin Sustrik (sustrik) · 2025-04-22T05:00:02.617Z · comments (21)

[link] Tracing the Thoughts of a Large Language Model
Adam Jermyn (adam-jermyn) · 2025-03-27T17:20:02.162Z · comments (22)

[link] Trojan Sky
Richard_Ngo (ricraz) · 2025-03-11T03:14:00.681Z · comments (39)

[link] A History of the Future, 2025-2040
L Rudolf L (LRudL) · 2025-02-17T12:03:58.355Z · comments (41)

Why Should I Assume CCP AGI is Worse Than USG AGI?
Tomás B. (Bjartur Tómas) · 2025-04-19T14:47:52.167Z · comments (71)

[link] Thoughts on AI 2027
Max Harms (max-harms) · 2025-04-09T21:26:23.926Z · comments (49)

[link] Power Lies Trembling: a three-book review
Richard_Ngo (ricraz) · 2025-02-22T22:57:59.720Z · comments (24)

[link] Why Did Elon Musk Just Offer to Buy Control of OpenAI for $100 Billion?
garrison · 2025-02-11T00:20:41.421Z · comments (8)

Eliezer's Lost Alignment Articles / The Arbital Sequence
Ruby · 2025-02-20T00:48:10.338Z · comments (9)

“Sharp Left Turn” discourse: An opinionated review
Steven Byrnes (steve2152) · 2025-01-28T18:47:04.395Z · comments (26)

Why White-Box Redteaming Makes Me Feel Weird
Zygi Straznickas (nonagon) · 2025-03-16T18:54:48.078Z · comments (34)

Will alignment-faking Claude accept a deal to reveal its misalignment?
ryan_greenblatt · 2025-01-31T16:49:47.316Z · comments (28)

Intention to Treat
Alicorn · 2025-03-20T20:01:19.456Z · comments (4)

[link] To Understand History, Keep Former Population Distributions In Mind
Arjun Panickssery (arjun-panickssery) · 2025-04-23T04:51:26.936Z · comments (7)

[link] OpenAI: Detecting misbehavior in frontier reasoning models
Daniel Kokotajlo (daniel-kokotajlo) · 2025-03-11T02:17:21.026Z · comments (25)

Catastrophe through Chaos
Marius Hobbhahn (marius-hobbhahn) · 2025-01-31T14:19:08.399Z · comments (17)

Claude Sonnet 3.7 (often) knows when it’s in alignment evaluations
Nicholas Goldowsky-Dill (nicholas-goldowsky-dill) · 2025-03-17T19:11:00.813Z · comments (7)

So how well is Claude playing Pokémon?
Julian Bradshaw · 2025-03-07T05:54:45.357Z · comments (74)

[link] On the Rationality of Deterring ASI
Dan H (dan-hendrycks) · 2025-03-05T16:11:37.855Z · comments (34)

Short Timelines Don't Devalue Long Horizon Research
Vladimir_Nesov · 2025-04-09T00:42:07.324Z · comments (23)

[link] Gradual Disempowerment: Systemic Existential Risks from Incremental AI Development
Jan_Kulveit · 2025-01-30T17:03:45.545Z · comments (52)

Surprising LLM reasoning failures make me think we still need qualitative breakthroughs for AGI
Kaj_Sotala · 2025-04-15T15:56:19.466Z · comments (48)

I make several million dollars per year and have hundreds of thousands of followers—what is the straightest line path to utilizing these resources to reduce existential-level AI threats?
shrimpy · 2025-03-16T16:52:42.177Z · comments (25)

Reducing LLM deception at scale with self-other overlap fine-tuning
Marc Carauleanu (Marc-Everin Carauleanu) · 2025-03-13T19:09:43.620Z · comments (40)

[question] Have LLMs Generated Novel Insights?
abramdemski · 2025-02-23T18:22:12.763Z · answers+comments (36)

[link] Self-fulfilling misalignment data might be poisoning our AI models
TurnTrout · 2025-03-02T19:51:14.775Z · comments (27)

It's been ten years. I propose HPMOR Anniversary Parties.
Screwtape · 2025-02-16T01:43:14.586Z · comments (3)

[link] Jaan Tallinn's 2024 Philanthropy Overview
jaan · 2025-04-23T11:06:11.779Z · comments (6)

Statistical Challenges with Making Super IQ babies
Jan Christian Refsgaard (jan-christian-refsgaard) · 2025-03-02T20:26:22.103Z · comments (26)

[link] Conceptual Rounding Errors
Jan_Kulveit · 2025-03-26T19:00:31.549Z · comments (15)

The Sorry State of AI X-Risk Advocacy, and Thoughts on Doing Better
Thane Ruthenis · 2025-02-21T20:15:11.545Z · comments (51)

Levels of Friction
Zvi · 2025-02-10T13:10:07.224Z · comments (8)

Methods for strong human germline engineering
TsviBT · 2025-03-03T08:13:49.414Z · comments (28)

Recent comments

knight-lee on Worries About AI Are Usually Complements Not Substitutes

Are there any suggestions for how to get this message across? To all those AI x-risk disbelievers?

michaeldickens on o3 Is a Lying Liar

Huh. I knew that's how ChatGPT worked but I had assumed they would've worked out a less hacky solution by now!

ete on Jaan Tallinn's 2024 Philanthropy Overview

You've funded what looks from my vantage point to be a huge portion of the quality-adjusted attempts to avert doom, perhaps a majority. Much appreciation for stepping up for humanity.

ebenezer-dukakis on less-wronger-numb89's Shortform

Technically the point of going to college is to help you thrive in the rest of your life after college. If you believe in AI 2027, the most important thing for the rest of your life is for AI to be developed responsibly. So, maybe work on that instead of college?

I think the EU could actually be a good place to protest for an AI pause. Because the EU doesn't have national AI ambitions, and the EU is increasingly skeptical of the US, it seems to me that a bit of protesting could do a lot to raise awareness of the reckless path that the US is taking. That, in turn, could motivate the EU to apply leverage via ASML, sanctions, etc.

The only thing I'm worried about is that EU criticism of the US could create anti-EU polarization among the GOP in the US, which motivates them to be more reckless on AI. This question seems worth a lot more study.

nate-showell on This prompt (sometimes) makes ChatGPT think about terrorist organisations

Have you tried seeing how ChatGPT responds to individual lines of code from that excerpt? There might be an anomalous token in it along the lines of " petertodd".

anthonyc on Fish and Faces

I'd say that in most contexts in normal human life, (3) is the thing that makes this less of an issue for (1) and (2). If the thing I'm hearing about is real, I'll probably keep hearing about it, and from more sources. If I come across 100 new crazy-seeming ideas and decide to indulge them 1% of the time, and so do many other people, that's usually, probably enough to amplify the ones that (seem to) pan out. By the time I hear about the thing from 2, 5, or 20 sources, I will start to suspect it's worth thinking about at a higher level.

veedrac on less-wronger-numb89's Shortform

Ultimately you have to make a bet on your guesses of reality. If your modal guess is civilizational collapse in 2-3 years, skipping uni is hardly a disproportionate action, but at the same time it's not going to win you much either. Personally I'd leave the uni-or-not decision to the plausible worlds where the choice matters more, and look for some higher leverage change you can make for the rest.

snewman on AI 2027 is a Bet Against Amdahl's Law

I added up the median "Predictions for gap size" in the "How fast can the task difficulty gaps be crossed?" table, summing each set of predictions separately ("Eli", "Nikola", "FutureSearch") to get three numbers ranging from 30-75.

Does this table cover the time between now and superhuman coder? I thought it started at RE-Bench, because:

I took all of this to be in the context of the phrase, about one page back, "For each gap after RE-Bench saturation".
The earlier explanation that Method 2 is "a more complex model starting from a forecast saturation of an AI R&D benchmark (RE-Bench), and then how long it will take to go from that system to one that can handle real-world tasks at the best AGI company" [emphasis added]
The first entry in the table ("Time horizon: Achieving tasks that take humans lots of time") sounds more difficult than saturating RE-Bench.
Earlier, there's a separate discussion forecasting time to RE-Bench saturation.

But sounds like I was misinterpreting?

anthonyc on AI 2027 is a Bet Against Amdahl's Law

Exactly. More fundamentally, that is not a probability graph, it's a probability density graph, and we're not shown the line beyond 2032 but just have to assume the integral from 2100-->infinity is >10% of the integral from 0-->infinity. Infinity is far enough away that the decay doesn't even need to be all that slow for the total to be that high.

linch on You Better Mechanize

(There’s a funny aside on Thermopylae, and the limits of ‘excellent leadership,’ yes they did well but they ultimately lost. To which I would respond, they only ultimately lost because they got outflanked, but also in this case ‘good leadership’ involves a much bigger edge. A better example is, classically, Cortes, who they mention later. Who had to fight off another Spanish force and then still won. But hey.)

I know this is an aside to your aside, but as an avid Sparta-hater, I want to point out that we don't have much evidence that Spartans are good at military leadership, and indeed plenty of evidence in the other direction:

Sparta's opponents often won using clever, innovative tactics, such as at the Battle of Phyle (404 BC), the Battle of Olpae (426 BC), the Battle of Cyzicus (410 BC), the Battle of Arginusae (406 BC), Battle of Tegyra (375 BC), the Battle of Leuctra (371 BC), and the 2nd Battle of Mantinea (362 BC). The Spartans, so far as I have found in reviewing these 51 battles, never made a single creative strategic or tactical innovation. They were sometimes clever; mostly through treachery and trickery.

That said, you have to remember that pretty much all of the primary sources you read on this topic were written by Athenians for Athenians, and they are less about preserving an accurate historical record for future generations (or even propaganda/promoting internal Athenian solidarity) and more about making specific political points for their own internecine fights. So you should pay more attention to what's objectively verifiable (things like win-loss records, inputs and outputs) and less to the overall vibe that they present.