LessWrong 2.0 Reader

Renormalization Roadmap
Lauren Greenspan (LaurenGreenspan) · 2025-03-31T20:34:16.352Z · comments (7)

Book Review: Affective Neuroscience
sarahconstantin · 2025-03-10T06:50:04.602Z · comments (8)

Weirdness Points
lsusr · 2025-02-28T02:23:56.508Z · comments (19)

FrontierMath Score of o3-mini Much Lower Than Claimed
YafahEdelman (yafah-edelman-1) · 2025-03-17T22:41:06.527Z · comments (7)

Fuzzing LLMs sometimes makes them reveal their secrets
Fabien Roger (Fabien) · 2025-02-26T16:48:48.878Z · comments (13)

[link] The first RCT for GLP-1 drugs and alcoholism isn't what we hoped
dynomight · 2025-02-20T22:30:07.536Z · comments (4)

Falsehoods you might believe about people who are at a rationalist meetup
Screwtape · 2025-02-01T23:32:50.398Z · comments (12)

[link] How to Corner Liars: A Miasma-Clearing Protocol
ymeskhout · 2025-02-27T17:18:36.028Z · comments (23)

[link] Softmax, Emmett Shear's new AI startup focused on "Organic Alignment"
Chipmonk · 2025-03-28T21:23:46.220Z · comments (1)

"Think it Faster" worksheet
Raemon · 2025-02-08T22:02:27.697Z · comments (8)

Tail SP 500 Call Options
sapphire (deluks917) · 2025-01-23T05:21:51.221Z · comments (28)

Not all capabilities will be created equal: focus on strategically superhuman agents
benwr · 2025-02-13T01:24:46.084Z · comments (8)

[link] Sentinel's Global Risks Weekly Roundup #11/2025. Trump invokes Alien Enemies Act, Chinese invasion barges deployed in exercise.
NunoSempere (Radamantis) · 2025-03-17T19:34:01.850Z · comments (3)

[link] How Gay is the Vatican?
rba · 2025-04-06T21:27:50.530Z · comments (32)

Escape from Alderaan I
lsusr · 2025-02-02T10:48:06.533Z · comments (2)

On OpenAI’s Safety and Alignment Philosophy
Zvi · 2025-03-05T14:00:07.302Z · comments (5)

Solving willpower seems easier than solving aging
Yair Halberstadt (yair-halberstadt) · 2025-03-23T15:25:40.861Z · comments (28)

Alignment faking CTFs: Apply to my MATS stream
joshc (joshua-clymer) · 2025-04-04T16:29:02.070Z · comments (0)

Socially Graceful Degradation
Screwtape · 2025-03-20T04:03:41.213Z · comments (9)

Map of AI Safety v2
Bryce Robertson (bryceerobertson) · 2025-04-15T13:04:40.993Z · comments (4)

On Google’s Safety Plan
Zvi · 2025-04-11T12:51:12.112Z · comments (6)

A sketch of an AI control safety case
Tomek Korbak (tomek-korbak) · 2025-01-30T17:28:47.992Z · comments (0)

Go Grok Yourself
Zvi · 2025-02-19T20:20:09.371Z · comments (2)

On polytopes
Dmitry Vaintrob (dmitry-vaintrob) · 2025-01-25T13:56:35.681Z · comments (5)

How I switched careers from software engineer to AI policy operations
Lucie Philippon (lucie-philippon) · 2025-04-13T06:37:33.507Z · comments (1)

Housing Roundup #11
Zvi · 2025-04-01T16:30:03.694Z · comments (1)

Consider showering
bohaska (Bohaska) · 2025-04-01T23:54:26.714Z · comments (16)

On DeepSeek’s r1
Zvi · 2025-01-22T19:50:17.168Z · comments (2)

What's Behind the SynBio Bust?
sarahconstantin · 2025-01-30T22:30:06.916Z · comments (8)

To be legible, evidence of misalignment probably has to be behavioral
ryan_greenblatt · 2025-04-15T18:14:53.022Z · comments (12)

[Closed] Gauging Interest for a Learning-Theoretic Agenda Mentorship Programme
Vanessa Kosoy (vanessa-kosoy) · 2025-02-16T16:24:57.654Z · comments (5)

The Manus Marketing Madness
Zvi · 2025-03-10T20:10:07.845Z · comments (0)

Do models know when they are being evaluated?
Govind Pimpale (govind-pimpale) · 2025-02-17T23:13:22.017Z · comments (3)

On Deliberative Alignment
Zvi · 2025-02-11T13:00:07.683Z · comments (1)

On MAIM and Superintelligence Strategy
Zvi · 2025-03-14T12:30:07.451Z · comments (2)

[link] Dario Amodei: On DeepSeek and Export Controls
Zach Stein-Perlman · 2025-01-29T17:15:18.986Z · comments (3)

Childhood and Education #9: School is Hell
Zvi · 2025-03-07T12:40:05.324Z · comments (36)

HPMOR Anniversary Parties: Coordination, Resources, and Discussion
Screwtape · 2025-03-11T01:30:41.177Z · comments (6)

≤10-year Timelines Remain Unlikely Despite DeepSeek and o3
Rafael Harth (sil-ver) · 2025-02-13T19:21:35.392Z · comments (59)

Reframing AI Safety as a Neverending Institutional Challenge
scasper · 2025-03-23T00:13:48.614Z · comments (12)

My "infohazards small working group" Signal Chat may have encountered minor leaks
Linch · 2025-04-02T01:03:05.311Z · comments (0)

OpenAI Responses API changes models' behavior
Jan Betley (jan-betley) · 2025-04-11T13:27:29.942Z · comments (6)

Gemini 2.5 is the New SoTA
Zvi · 2025-03-28T14:20:03.176Z · comments (1)

Notes on countermeasures for exploration hacking (aka sandbagging)
ryan_greenblatt · 2025-03-24T18:39:36.665Z · comments (6)

DeepSeek Panic at the App Store
Zvi · 2025-01-28T19:30:07.555Z · comments (14)

[link] Conference Report: Threshold 2030 - Modeling AI Economic Futures
Deric Cheng (deric-cheng) · 2025-02-24T18:56:51.682Z · comments (0)

On OpenAI’s Model Spec 2.0
Zvi · 2025-02-21T14:10:06.827Z · comments (4)

[link] You should read Hobbes, Locke, Hume, and Mill via EarlyModernTexts.com
Arjun Panickssery (arjun-panickssery) · 2025-01-30T12:35:03.564Z · comments (3)

Don't over-update on FrontierMath results
David Matolcsi (matolcsid) · 2025-03-11T20:44:04.459Z · comments (5)

AI #110: Of Course You Know…
Zvi · 2025-04-03T13:10:05.674Z · comments (9)

Recent comments

yair-halberstadt on Experimental testing: can I treat myself as a random sample?

I'm not using this as a prior; I'm using it to update my existing prior (whatever that was). I believe the posterior will be well defined, so long as the prior was.

yair-halberstadt on Experimental testing: can I treat myself as a random sample?

It would also update you towards 1600 over 2000.

elifland on AI 2027 is a Bet Against Amdahl's Law

The basic arguments are that (a) becoming fully superhuman at something which involves long-horizon agency across a diverse range of situations seems like it requires agency skills that will transfer pretty well to other domains, and (b) once AIs have superhuman data efficiency, they can pick up whatever domain knowledge they need for new tasks very quickly.

I agree we didn't justify it thoroughly in our supplement; the reason it's not justified more is that we didn't get around to it.

frank-bellamy on The US Executive vs Supreme Court Deportations Clash

> Bukele could release Abrego Garcia to the US

Could he? I don't know El Salvador's legal system very well, but I do know that just because someone is incarcerated in an American prison it does not follow that Trump could release them. The US President has no legal authority to release a person from state prison. Do we actually know that Bukele has authority to release Garcia?

> and the US could bring him back, if the administration wanted to do so. As a point of reference, the US has routinely negotiated for the return or release of citizens and even of non-citizens considered at risk in other countries.

That is one hell of an unwarranted assumption. The US does sometimes try to get people released from foreign prisons, sometimes successfully, sometimes not. For the not case, see here for an example. I don't know if the Trump administration even could get Garcia released if it wanted to. Neither do you. Neither do any of the American judges issuing orders in this case. Even if there is something that El Salvador would accept in exchange for Garcia, (1) there is necessarily a weighing to be done of the cost of that thing versus the benefit of bringing Garcia back. Judges are not in a position to do that weighing, and are not attempting to do it. (2) If El Salvador can see that Trump is under a court order to get Garcia back, that puts Trump in a terrible negotiating position. It would essentially allow El Salvador to coerce whatever it wants from our government. That would be a terrible outcome. This is why negotiation with foreign governments is a function of the executive branch, not the judicial.

elifland on AI 2027 is a Bet Against Amdahl's Law

> As a prerequisite, it will be necessary to enumerate the set of activities that are necessary for "AI R&D"

As I think you're aware, Epoch took a decent stab at this IMO here. I also spent a bunch of time thinking about all the sub-tasks involved in AI R&D early on in the scenario development. Tbh, I don't feel like it was a great use of time compared to thinking at a higher level, but perhaps I was doing it poorly or am underestimating its usefulness.

elifland on AI 2027 is a Bet Against Amdahl's Law

> What is the profile of acceleration across all tasks relating to AI R&D? What percentage of tasks are getting accelerated by 1.1x, 1.5x, 2x?

A late 2024 n=4 survey of frontier AI researchers estimated a median of a 1.15x AI R&D progress multiplier relative to no post-2022 AIs. I'd like to see bigger surveys here but FWIW my best guess is that we're already at a ~1.1x progress multiplier.
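To make the link between a per-task acceleration profile and a single overall multiplier concrete, here is a minimal Amdahl-style sketch. It assumes tasks run serially and that each task's share of baseline time is known. The function name `overall_speedup` and the example profile are hypothetical illustrations, not survey data.

```python
def overall_speedup(profile):
    """Amdahl-style overall speedup for a list of (time_fraction, speedup) pairs.

    time_fraction is each task's share of baseline (pre-acceleration) time;
    the fractions must sum to 1. Assumes tasks are strictly serial.
    """
    total = sum(fraction for fraction, _ in profile)
    if abs(total - 1.0) > 1e-9:
        raise ValueError("time fractions must sum to 1")
    # New total time is the sum of each task's time divided by its speedup.
    new_time = sum(fraction / speedup for fraction, speedup in profile)
    return 1.0 / new_time


# Hypothetical profile (assumption, not measured): half of the work is not
# accelerated at all, the rest is sped up by 1.1x, 1.5x, and 2x.
example_profile = [(0.5, 1.0), (0.3, 1.1), (0.15, 1.5), (0.05, 2.0)]
print(f"overall multiplier: {overall_speedup(example_profile):.3f}")
```

Under these made-up numbers the overall multiplier lands close to 1.1x even though some tasks are accelerated 2x, because the unaccelerated half of the work dominates the total.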

elifland on AI 2027 is a Bet Against Amdahl's Law

> Readers are likely familiar with Hofstadter's Law:
>
> It always takes longer than you expect, even when you take into account Hofstadter's Law.
>
> It's a good law. There's a reason it exists in many forms (see also the Programmer's Credo, the 90-90 rule, Murphy's Law, etc.). It is difficult to anticipate all of the complexity and potential difficulties of a project in advance, and on average this contributes to things taking longer than expected. Constructing ASI will be an extremely complex project, and the AI 2027 attempt to break it down into a fairly simple set of milestones and estimate the difficulty of each milestone seems like fertile territory for Hofstadter's Law.

One reason I don't put much weight on this for timelines forecasts is that, to the extent I might have done so before, I would have been more wrong from my current view. For example, my AGI timelines median 3 years ago was 2060ish, and since then I've updated toward an AGI median of more like 2031 due to reasons including underpredicting benchmark scores, underpredicting real-world impacts, and the model we built for AI 2027.

(wow, I didn't remember that my median 3 years ago was 2060ish, wild)

lemonhope on lemonhope's Shortform

That's true. Would be nice if I could also discuss ideas like this one here in a friendly way. Don't always have time to go to in person stuff. https://www.lesswrong.com/posts/7LFbhL5vfSbTAFLmR/family-line-selection-optimizer

Or this one https://www.lesswrong.com/posts/t8jwPBrxccf5bhMcC/aimless-ace-analyzes-active-amateur-a-micro-aaaaalignment

purple-fire on Experimental testing: can I treat myself as a random sample?

The intuition is that if we both saw bus 1546, and you guessed that there were 1546 buses and I guessed that there were 1547, you would be a little more likely to be correct but I would almost certainly be closer to the real number.

The Bayesian update isn't generally well-defined because you get a divergent mean. Your implicit prior is 1/n which is an improper prior. This is fine for deriving a posterior median, which in this case happens to be about 3,100 buses, and a posterior distribution, which in this case is a truncated zeta distribution with s=2 and k=1546. But the posterior mean does not exist.
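A small numerical sketch of the claim above, assuming the observed bus number is uniform on 1..n (likelihood 1/n) and the prior is the improper 1/n, so the posterior is proportional to 1/n^2 for n >= 1546. The function names are hypothetical, chosen for illustration.

```python
import math


def tail_mass(k):
    """Sum of 1/n^2 over n >= k, computed as zeta(2) = pi^2/6 minus the head."""
    return math.pi ** 2 / 6 - math.fsum(1.0 / (n * n) for n in range(1, k))


def posterior_median(k):
    """Smallest m with P(N <= m) >= 1/2 when the posterior is ∝ 1/n^2 on n >= k."""
    half = tail_mass(k) / 2
    cumulative = 0.0
    m = k
    while True:
        cumulative += 1.0 / (m * m)
        if cumulative >= half:
            return m
        m += 1


def truncated_mean(k, upper):
    """Posterior mean with the support cut off at `upper`.

    The numerator is a harmonic sum, so this grows like log(upper)
    and has no finite limit as upper goes to infinity.
    """
    return math.fsum(1.0 / n for n in range(k, upper + 1)) / tail_mass(k)


print("posterior median:", posterior_median(1546))
for upper in (10**4, 10**5, 10**6):
    print(f"mean truncated at {upper}: {truncated_mean(1546, upper):.0f}")
```

The median comes out at roughly twice the observed bus number, while the truncated mean keeps climbing as the cutoff is raised, which is the sense in which the posterior mean does not exist.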

barniclebarn on AI 2027 is a Bet Against Amdahl's Law

> In essence, this is saying that if the pace of progress is the product of two factors (experiment implementation time, and quality of experiment choice), then AI only needs to accelerate one factor in order to achieve an overall speedup. However, AI R&D involves a large number of heterogeneous activities, and overall progress is not simply the product of progress in each activity. Not all bottlenecks will be easily compensated for or worked around.

I agree with this.

I also think there are some engineering and infrastructure challenges in executing training runs that one would not necessarily cede to AI. This is not because ceding them would be undesirable, but because it would require a level of embodiment that is likely beyond the timeline proposed in the AI 2027 thesis. (I do agree with most of the thesis, however.)

I'm not sure there's a research basis (that I could find at least, though I am very open to correction on this point), for embodiment of AI systems (robotic bodies) being able to keep pace with algorithmic improvement.

While an AI system could likely design a new model architecture and training architecture, it is very human supply chains and technician speed that determine whether the physical training run can happen at the scales required.

Further, there are hardware challenges to large training runs of AI systems, which may not be resolvable by an AI system as readily, due to lack of exposure to those kinds of physical issues in their inherent reasoning space. (They have never opened a server during a training run, and resolved an overheat issue for instance).

Some oft-overlooked aspects of training stem from the fact that the labs tend not to own their own data centers but rather rely on cloud providers. This means they have to contend with:

- Cluster allocation: Scheduling the time on thousands of GPUs across multiple cloud providers, reserving time blocks, securing budget, etc. I can easily buy the concept of an AI system recursively self-improving on baked-in infrastructure, but its human colleagues may struggle to secure additional infrastructure for it at the same speed. I understand that in the article the model has taken over the 'day to day' operations, but I'm not sure I'd characterize a significant training run as a 'day to day' activity. This scheduling goes beyond just 'calling some colos', and potentially involves running additional power and fiber to buildings, construction schedules, etc.
- Topology: Someone has to physically lay out the training network used. This goes beyond the networking per se, and also involves actually moving hardware around, building in redundancies in the data hall (extra PDUs, etc.), running networking cable, and putting mitigations in place for transients, etc. This all requires technicians, parts, hardware, etc. Lead times for those parts in some cases exceed the proposed timeline to ASI.
- Hardware/Firmware Validation: People physically have to check the server infrastructure, the hardware and the firmware, and ensure that all of the cards are up to date, etc. When moving at speed in AI, a lot of 'second hand' or 'relocated' servers and infrastructure tend to be used. It is not a small task to catalogue all of that and place it into a DCIM framework.
- Stress Testing: Running large power loads to check thermal limits, power draw, and inter-GPU comms. Parts fail here routinely, requiring replacement, etc.
- Power: Assuming that in the proposed timeline compute remains linked to power, we are looking at a generational data center capability issue.
  - The data centers under construction now, set to come online in the 2029/2030 timeframe, will be the first to use Blackwell GPUs at scale.
  - This implies that to achieve the 2027 timeline, we would need to stretch Hoppers and existing power infrastructure far enough that these improvements emerge from existing physical hardware.
  - I do tend to agree that if we were unconstrained by power and physical infrastructure, there would be no algorithmic reason at all to believe that we could not achieve ASI by 2027; however, the infrastructure challenges are absolutely enormous.
  - Land with sufficient water and power (including natural gas lines for onsite power) isn't available. Utilities in the US are currently restricting power access to data centers by leveraging significant power tariffs and long-term take-or-pay commitments (the AEP decision in Ohio, for instance). This makes life harder for colocation providers in terms of financing and siting large infrastructure.

I believe that an AI system could well be a match for the data cleaning and validation, and even the launch and orchestration using Slurm, Kubernetes or similar, but the initial launch phase is also something that I think will be slowed by the need for human hands.

This phase results in:

- Out-of-memory errors on GPUs, which can often only be resolved by 'turning a wrench' on the server.
- Unexpected hardware failures (GPUs breaking, NVLinks breaking, network timeouts, cabling issues, fiber optic degradation, power transients, etc.). All of these require human technicians.

These errors are also insidious, because the software running the training can't tell how these failures affect which parts of the network are being trained and which aren't. This would make it challenging for an AI director to really understand what was causing issues in the desired training outcome. It also makes a runaway situation unlikely, where a model is just recursively self-improving on a rapid timeline without human input, unless it first cracked the design and mass manufacture of embodied AI workers that could move and act as quickly as it can.

A good case study on this is the woes faced by OpenAI in training GPT-4.5, where all of this came to a head, taking a training run scheduled for a month or two and stretching it over a year. OpenAI spoke very openly about this in a YouTube video they released.

What's more, at scale, if we are going to be relying on existing data centers for a model of this sophistication, we'd have the model split across multiple clusters, potentially in multiple locations. This causes latency issues, etc.

That's the part that to me is missing from the near-term timeline. I think the thesis around zones created just to build power and data centers, following ASI, seems very credible, especially with that level of infiltration of government/infrastructure.

I don't, however, see a way of getting to a model capable of ASI with current data center infrastructure, prior to the largest new campuses coming online and power running to Blackwell GPUs.