LessWrong 2.0 Reader

I'm a bit skeptical of AlphaFold 3
Oleg Trott (oleg-trott) · 2024-06-25T00:04:41.274Z · comments (14)

Research update: Towards a Law of Iterated Expectations for Heuristic Estimators
Eric Neyman (UnexpectedValues) · 2024-10-07T19:29:29.033Z · comments (2)

How well do truth probes generalise?
mishajw · 2024-02-24T14:12:19.729Z · comments (11)

[link] Detecting Genetically Engineered Viruses With Metagenomic Sequencing
jefftk (jkaufman) · 2024-06-27T14:01:34.868Z · comments (10)

Parable of the vanilla ice cream curse (and how it would prevent a car from starting!)
Mati_Roy (MathieuRoy) · 2024-12-08T06:57:45.783Z · comments (21)

The subset parity learning problem: much more than you wanted to know
Dmitry Vaintrob (dmitry-vaintrob) · 2025-01-03T09:13:59.245Z · comments (17)

[link] Re: Anthropic's suggested SB-1047 amendments
RobertM (T3t) · 2024-07-27T22:32:39.447Z · comments (13)

Natural Latents: The Concepts
johnswentworth · 2024-03-20T18:21:19.878Z · comments (18)

[link] Should you be worried about H5N1?
gw · 2024-12-05T21:11:06.996Z · comments (2)

OpenAI: Helen Toner Speaks
Zvi · 2024-05-30T21:10:02.938Z · comments (8)

There is a globe in your LLM
jacob_drori (jacobcd52) · 2024-10-08T00:43:40.300Z · comments (4)

Circling as practice for “just be yourself”
Kaj_Sotala · 2024-12-16T07:40:04.482Z · comments (5)

The Aspiring Rationalist Congregation
maia · 2024-01-10T22:52:54.298Z · comments (23)

GPT-o1
Zvi · 2024-09-16T13:40:06.236Z · comments (34)

MATS Winter 2023-24 Retrospective
utilistrutil · 2024-05-11T00:09:17.059Z · comments (28)

Decomposing the QK circuit with Bilinear Sparse Dictionary Learning
keith_wynroe · 2024-07-02T13:17:16.352Z · comments (7)

Addressing Feature Suppression in SAEs
Benjamin Wright (Benw8888) · 2024-02-16T18:32:51.927Z · comments (4)

Apply to be a Safety Engineer at Lockheed Martin!
yanni kyriacos (yanni) · 2024-03-31T21:02:08.499Z · comments (3)

[link] Anxiety vs. Depression
Sable · 2024-03-17T00:15:08.255Z · comments (35)

Rejecting Television
Declan Molony (declan-molony) · 2024-04-23T04:59:50.253Z · comments (10)

5 homegrown EA projects, seeking small donors
Austin Chen (austin-chen) · 2024-10-28T23:24:25.745Z · comments (4)

[link] The Intelligence Curse
lukedrago · 2025-01-03T19:07:43.493Z · comments (26)

[link] [Paper] Stress-testing capability elicitation with password-locked models
Fabien Roger (Fabien) · 2024-06-04T14:52:50.204Z · comments (10)

[link] What are you getting paid in?
Austin Chen (austin-chen) · 2024-07-17T19:23:04.219Z · comments (14)

Is "VNM-agent" one of several options, for what minds can grow up into?
AnnaSalamon · 2024-12-30T06:36:20.890Z · comments (49)

Self-prediction acts as an emergent regularizer
Cameron Berg (cameron-berg) · 2024-10-23T22:27:03.664Z · comments (5)

[link] Environmentalism in the United States Is Unusually Partisan
Jeffrey Heninger (jeffrey-heninger) · 2024-05-13T21:23:10.755Z · comments (26)

Scalable oversight as a quantitative rather than qualitative problem
Buck · 2024-07-06T17:42:41.325Z · comments (11)

Why you should be using a retinoid
GeneSmith · 2024-08-19T03:07:41.722Z · comments (60)

Reflections on Less Online
Error · 2024-07-07T03:49:44.534Z · comments (15)

Fluent, Cruxy Predictions
Raemon · 2024-07-10T18:00:06.424Z · comments (14)

[link] What Depression Is Like
Sable · 2024-08-27T17:43:22.549Z · comments (23)

JargonBot Beta Test
Raemon · 2024-11-01T01:05:26.552Z · comments (55)

Newsom Vetoes SB 1047
Zvi · 2024-10-01T12:20:06.127Z · comments (6)

[link] [Paper] AI Sandbagging: Language Models can Strategically Underperform on Evaluations
Teun van der Weij (teun-van-der-weij) · 2024-06-13T10:04:49.556Z · comments (10)

[link] Is Deep Learning Actually Hitting a Wall? Evaluating Ilya Sutskever's Recent Claims
garrison · 2024-11-13T17:00:01.005Z · comments (14)

AI #51: Altman’s Ambition
Zvi · 2024-02-20T19:50:07.439Z · comments (5)

Sparse Autoencoders Work on Attention Layer Outputs
Connor Kissane (ckkissane) · 2024-01-16T00:26:14.767Z · comments (9)

OpenAI o1, Llama 4, and AlphaZero of LLMs
Vladimir_Nesov · 2024-09-14T21:27:41.241Z · comments (25)

A simple case for extreme inner misalignment
Richard_Ngo (ricraz) · 2024-07-13T15:40:37.518Z · comments (41)

Actually, Power Plants May Be an AI Training Bottleneck.
Lao Mein (derpherpize) · 2024-06-20T04:41:33.567Z · comments (13)

Values Are Real Like Harry Potter
johnswentworth · 2024-10-09T23:42:24.724Z · comments (21)

Remap your caps lock key
bilalchughtai (beelal) · 2024-12-15T14:03:33.623Z · comments (17)

Some Vacation Photos
johnswentworth · 2024-01-04T17:15:01.187Z · comments (0)

Retirement Accounts and Short Timelines
jefftk (jkaufman) · 2024-02-19T18:50:05.231Z · comments (35)

Secular interpretations of core perennialist claims
zhukeepa · 2024-08-25T23:41:02.683Z · comments (33)

An Introduction To The Mandelbrot Set That Doesn't Mention Complex Numbers
Yitz (yitz) · 2024-01-17T09:48:07.930Z · comments (11)

Constructability: Plainly-coded AGIs may be feasible in the near future
Épiphanie Gédéon (joy_void_joy) · 2024-04-27T16:04:45.894Z · comments (13)

[link] Essay competition on the Automation of Wisdom and Philosophy — $25k in prizes
owencb · 2024-04-16T10:10:13.338Z · comments (12)

AI #83: The Mask Comes Off
Zvi · 2024-09-26T12:00:08.689Z · comments (20)


Recent comments

bryce-robertson on Bryce Robertson's Shortform

That's great feedback, thanks! I've gone ahead and put it under the page title for each resource.

cstinesublime on You are too dumb to understand insurance

Moriarty looks at the paper.

The switch from first person pronouns to discussing a third person with quotations is confusing.

christiankl on ChristianKl's Shortform

With Trump speaking about the US making Greenland a US territory, it would be worth speaking publicly about how the US screws its territories with the Jones Act.

It might be a way to build momentum to get rid of the Jones Act.

robert-cousineau on (The) Lightcone is nothing without its people: LW + Lighthaven's big fundraiser

I've donated 5k. LessWrong (and the people it brings together) deserves credit for the majority of my intellectual growth over the last 6 years. I cannot think of a higher signal:noise place to learn, nor can I think of a more enjoyable and growth-inducing community than the one that has grown around it.

Thank you to both those who directly work on it and those who contribute to it!

Lighthaven's wonder is self-evident.

christiankl on ChristianKl's Shortform

Drug legalisation is probably the best way to prevent fentanyl deaths. Many of the people who are addicted to fentanyl are addicted because they wanted to buy other drugs but got them laced with fentanyl. Drug legalisation will allow for quality control and end the lacing with fentanyl.

Now nitazene, which is even more potent than fentanyl, is being added and might produce similar effects, with people dying of nitazene overdoses.

If the US actually wants to prevent those deaths, drug legalisation that allows for quality control is the step forward that would work.

erich_grunewald on Disagreement on AGI Suggests It’s Near

In the New York example, it could be that when someone says “Guys, we should really buy those Broadway tickets. The trip to New York is next month already.” they prompt the response “What? I thought we were going the month after!”, hence the disagreement. If this detail had been discussed earlier, there might have been the “February trip” and the “March trip” in order to disambiguate the trip(s) to New York.

I guess I don't understand what focusing on disagreements adds. Sure, in this situation, the disagreement stems from some people thinking the trip is near (and others thinking it's further away). But we already knew that some people think AGI is near and others think it's further away! What does observing that people disagree about that stuff add?

What seems to have happened is that people at one point latched on to the concept of AGI, thinking that their interpretation was virtually the same as those of others because of its lack of definition. Again, if they had disagreed with the definition to begin with, they would have used a different word altogether. Now that some people are claiming that AGI is here or here soon, it turns out that the interpretations do in fact differ.

Yeah, I would say that as those early benchmarks ("can beat anyone at chess", etc.) are achieved without producing what "feels like" AGI, people are forced to make their intuitions concrete, or anyway reckon with their old bad operationalizations of AGI. And that naturally leads to lots of discussion around what actually constitutes AGI. But again, all this is evidence of is that those early benchmarks have been achieved without producing what "feels like" AGI. But we already knew that.

pablo_stafforini on Orienting to 3 year AGI timelines

I'm still thinking about how to hedge in case the upcoming chaos turns the market sour

Have you thought more about this? How about VIX call options?

ryan_greenblatt on How will we update about scheming?

Yes, I would count it if the CoT is total gibberish which is (steganographically) encoding reasoning.

vladimir_nesov on Is AI Hitting a Wall or Moving Faster Than Ever?

Noticing progress in long reasoning models like o3 creates a different blind spot compared to popular reporting on how scaling of pretraining is stalling out. It can appear that long reasoning models reconcile the central point of pretraining stalling out with AI progress moving fast. But plausible success of reasoning models instead suggests that pretraining will continue scaling even better than could be expected before.

Training systems were already on track to go from 50 MW, training current models for up to 1e26 FLOPs, to 150 MW in late 2024, and then 1 GW by the end of 2025, training models for up to 5e27 FLOPs in 2026, 250x the compute of the original GPT-4. But with o3, it now seems more plausible that $150bn training systems will be built in 2026-2027, training models for up to 5e28 FLOPs in 2027-2028, which is 500x the compute of the currently deployed 1e26 FLOPs models or 2500x the compute of the original GPT-4.

Scaling of pretraining is not stalling out, even without the new long reasoning paradigm. It might begin stalling out in 2026 at the earliest, but now more likely only in 2028. The issue is that the scale of training systems is not directly visible: there is a 2-year lag between decisions to build them and the observed resulting AI progress.
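
The compute multiples quoted in this comment can be checked with a few lines of arithmetic. This is only a rough sketch: the FLOP figures are taken from the comment as written, and the GPT-4 baseline is not stated there, so it is backed out here from the "5e27 FLOPs ... 250x" claim (an assumption of this sketch, along with all variable names).

```python
# Training compute figures quoted in the comment, in FLOPs.
CURRENT_MODELS = 1e26   # currently deployed models
MODELS_2026 = 5e27      # models trained on ~1 GW systems
MODELS_2027_28 = 5e28   # models trained on $150bn systems

# Assumption: the original GPT-4 baseline is inferred from the
# "5e27 FLOPs ... 250x" claim; the comment does not state it directly.
GPT4_IMPLIED = MODELS_2026 / 250


def multiple(larger: float, smaller: float) -> int:
    """Return how many times larger one compute budget is, rounded."""
    return round(larger / smaller)


print("implied GPT-4 compute (FLOPs):", f"{GPT4_IMPLIED:.0e}")
print("2027-28 vs current models:", multiple(MODELS_2027_28, CURRENT_MODELS))
print("2027-28 vs original GPT-4:", multiple(MODELS_2027_28, GPT4_IMPLIED))
```

Under that inferred baseline, the 500x and 2500x multiples in the comment are mutually consistent with the 250x figure.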

ryan_greenblatt on How will we update about scheming?

To be clear, I think a large update against neuralese is that this seems like the sort of thing that would be pretty likely to leak and I'm not aware of any public leaks. Probably this should yield more like 10% likely. I didn't think very carefully about the 25%.