LessWrong 2.0 Reader

"Real AGI"
Seth Herd · 2024-09-13T14:13:24.124Z · comments (20)

Word Spaghetti
Gordon Seidoh Worley (gworley) · 2024-10-23T05:39:20.105Z · comments (9)

In the Name of All That Needs Saving
pleiotroth · 2024-11-07T15:26:12.252Z · comments (2)

Can Large Language Models effectively identify cybersecurity risks?
emile delcourt (emile-delcourt) · 2024-08-30T20:20:21.345Z · comments (0)

Bridging the VLM and mech interp communities for multimodal interpretability
Sonia Joseph (redhat) · 2024-10-28T14:41:41.969Z · comments (5)

[question] How great is the utility of "saving" endangered languages?
SpectrumDT · 2024-08-20T13:14:32.895Z · answers+comments (29)

[link] Should Sports Betting Be Banned?
Maxwell Tabarrok (maxwell-tabarrok) · 2024-09-21T14:13:35.404Z · comments (2)

Finding Deception in Language Models
Esben Kran (esben-kran) · 2024-08-20T09:42:13.060Z · comments (4)

[link] AlignedCut: Visual Concepts Discovery on Brain-Guided Universal Feature Space
Bogdan Ionut Cirstea (bogdan-ionut-cirstea) · 2024-09-14T23:23:26.296Z · comments (1)

Training a Sparse Autoencoder in < 30 minutes on 16GB of VRAM using an S3 cache
Louka Ewington-Pitsos (louka-ewington-pitsos) · 2024-08-24T07:39:00.057Z · comments (0)

My career exploration: Tools for building confidence
lynettebye · 2024-09-13T11:37:55.843Z · comments (0)

[link] College technical AI safety hackathon retrospective - Georgia Tech
yix (Yixiong Hao) · 2024-11-15T00:22:53.159Z · comments (0)

[link] GPT-4o Guardrails Gone: Data Poisoning & Jailbreak-Tuning
ChengCheng (ccstan99) · 2024-11-01T00:10:50.718Z · comments (0)

[link] Jonothan Gorard: The territory is isomorphic to an equivalence class of its maps
Daniel C (harper-owen) · 2024-09-07T10:04:47.840Z · comments (18)

Invitation to lead a project at AI Safety Camp (Virtual Edition, 2025)
Linda Linsefors · 2024-08-23T14:18:24.327Z · comments (2)

[question] Is this voting system strategy proof?
Donald Hobson (donald-hobson) · 2024-09-06T20:44:46.691Z · answers+comments (9)

[link] Will we ever run out of new jobs?
Kevin Kohler (KevinKohler) · 2024-08-19T15:04:03.849Z · comments (7)

[link] Why Swiss watches and Taylor Swift are AGI-proof
Kevin Kohler (KevinKohler) · 2024-09-05T13:23:27.033Z · comments (11)

Automating LLM Auditing with Developmental Interpretability
htlou · 2024-09-04T15:50:04.337Z · comments (0)

[question] Is there any rigorous work on using anthropic uncertainty to prevent situational awareness / deception?
David Scott Krueger (formerly: capybaralet) (capybaralet) · 2024-09-04T12:40:07.678Z · answers+comments (7)

OpenAI defected, but we can take honest actions
Remmelt (remmelt-ellen) · 2024-10-21T08:41:25.728Z · comments (15)

[link] some questionable space launch guns
bhauth · 2024-10-13T22:52:26.418Z · comments (0)

Is Text Watermarking a lost cause?
egor.timatkov · 2024-10-01T16:20:51.113Z · comments (13)

[link] Instruction Following without Instruction Tuning
Bogdan Ionut Cirstea (bogdan-ionut-cirstea) · 2024-09-24T13:49:09.078Z · comments (0)

[link] Four Levels of Voting Methods
hive · 2024-09-26T18:15:00.565Z · comments (3)

Appealing to the Public
jefftk (jkaufman) · 2024-10-23T19:00:07.669Z · comments (0)

Physical Therapy Sucks (but have you tried hiding it in some peanut butter?)
Declan Molony (declan-molony) · 2024-09-10T05:54:47.000Z · comments (12)

[question] Is there a CFAR handbook audio option?
FinalFormal2 · 2024-10-26T17:08:36.480Z · answers+comments (0)

Reducing global AI competition through the Commerce Control List and Immigration reform: a dual-pronged approach
Ben Smith (ben-smith) · 2024-09-03T05:28:24.549Z · comments (2)

Hiring a writer to co-author with me (Spencer Greenberg for ClearerThinking.org)
spencerg · 2024-10-27T17:34:50.479Z · comments (0)

[question] Does the "ancient wisdom" argument have any validity? If a particular teaching or tradition is old, to what extent does this make it more trustworthy?
SpectrumDT · 2024-11-04T15:20:14.822Z · answers+comments (49)

[link] My lukewarm take on GLP-1 agonists
George3d6 · 2024-08-26T12:34:27.929Z · comments (0)

Interview with Robert Kralisch on Simulators
WillPetillo · 2024-08-26T05:49:15.543Z · comments (0)

[link] Why good things often don’t lead to better outcomes
DMMF · 2024-09-19T16:37:07.778Z · comments (1)

Slave Morality: A place for every man and every man in his place
Martin Sustrik (sustrik) · 2024-09-19T04:20:04.491Z · comments (7)

Review: Dr Stone
ProgramCrafter (programcrafter) · 2024-09-29T10:35:53.175Z · comments (5)

[link] Levers for Biological Progress - A Response to "Machines of Loving Grace"
Niko_McCarty (niko-2) · 2024-11-01T16:35:08.221Z · comments (0)

LifeKeeper Diaries: Exploring Misaligned AI Through Interactive Fiction
Tristan Tran (tristan-tran) · 2024-11-09T20:58:09.182Z · comments (5)

[link] Where is the Learn Everything System?
Shoshannah Tekofsky (DarkSym) · 2024-09-27T21:30:16.379Z · comments (8)

Join a LessWrong Team for the Unaging System Challenge
Crissman · 2024-10-23T06:01:08.018Z · comments (5)

2024 NYC Secular Solstice & Megameetup
Joe Rogero · 2024-11-12T17:46:18.674Z · comments (0)

Electric Grid Cyberattack: An AI-Informed Threat Model
moonlightmaze · 2024-11-11T21:34:17.190Z · comments (0)

Announcing the Ultimate Jailbreaking Championship
InnerHufflepuff (grayswan) · 2024-09-04T00:35:31.234Z · comments (1)

New Funding Category Open in Foresight's AI Safety Grants
Allison Duettmann (allison-duettmann) · 2024-11-06T22:59:41.065Z · comments (0)

Two arguments against longtermist thought experiments
momom2 (amaury-lorin) · 2024-11-02T10:22:11.311Z · comments (5)

[link] Pronouns are Annoying
ymeskhout · 2024-09-18T13:30:04.620Z · comments (21)

Current Attitudes Toward AI Provide Little Data Relevant to Attitudes Toward AGI
Seth Herd · 2024-11-12T18:23:53.533Z · comments (2)

The deepest atheist: Sam Altman
Trey Edwin (Paolo Vivaldi) · 2024-10-10T03:27:34.465Z · comments (2)

Inverse Problems In Everyday Life
silentbob · 2024-10-15T11:42:30.276Z · comments (2)

[link] The Ap Distribution
criticalpoints · 2024-08-24T21:45:35.029Z · comments (3)

Recent comments

akash-wasil on Lao Mein's Shortform

and recently founded another AI company

Potentially a hot take, but I feel like xAI's contributions to race dynamics (at least thus far) have been relatively trivial. I am usually skeptical of the whole "I need to start an AI company to have a seat at the table", but I do imagine that Elon owning an AI company strengthens his voice. And I think his AI-related comms have mostly been used to (a) raise awareness about AI risk, (b) raise concerns about OpenAI/Altman, and (c) endorse SB1047 [which he did even faster and less ambiguously than Anthropic].

The counterargument here is that maybe if xAI was in 1st place, Elon's positions would shift. I find this plausible, but I also find it plausible that Musk (a) actually cares a lot about AI safety, (b) doesn't trust the other players in the race, and (c) is more likely to use his influence to help policymakers understand AI risk than any of the other lab CEOs.

alexander-gietelink-oldenziel on Alexander Gietelink Oldenziel's Shortform

[this is a draft. I strongly welcome comments]

The Latent Military Realities of the Coming Taiwan Crisis

A blockade of Taiwan seems significantly more likely than a full-scale invasion. The US's non-intervention in Ukraine suggests similar restraint might occur with Taiwan.

Nevertheless, Metaculus predicts a 65% chance of a US military response to a Chinese invasion and separately gives 20-50% for some kind of Chinese military intervention by 2035. Let us imagine that the worst comes to pass and China and the United States are engaged in a hot war.

China's national memory of the 'century of humiliation' deeply shapes its modern strategic thinking. How many Westerners could faithfully recount the events of the Opium Wars? How many have even heard of the Boxer Rebellion, the Eight-Nation Alliance, or the Taiping Rebellion? Yet these events are core curriculum in Chinese education.

Chinese revanchism toward the West enjoys broad public support. Because the CCP represses public opinion, the popularity of this view is likely understated. According to polling, CCP officials actually hold more dovish views than the general public.

As other pieces of evidence: historically, the Boxer Rebellion was a grassroots phenomenon. Movies depicting conflict between China and America consistently draw large audiences and positive reception. China has an absolutely minuscule number of foreigners per capita; this number fell after the pandemic and has never rebounded.

China is the only nuclear power that has explicitly disavowed a nuclear first strike. It currently has a remarkably small nuclear stockpile (~200 warheads). With the increased sensor capabilities in recent years China has become vulnerable to a US nuclear first-strike destroying her launchers before she can react. This is likely part of the reason for a major build-up of her nuclear stockpile in recent years.

It is plausible that there will be a hot war without the use of nuclear weapons. The closest historical case is of course the Korean War, the last indirect conflict between the US and China, which ended in stalemate despite massive US economic superiority. Today, that economic gap has largely closed - China's economy is 1.25x larger in PPP terms, while the US is only 40% bigger in nominal GDP.

What would a conventional US-China war look like? What can be learned from past conflicts?

The 1982 Falklands War between the UK and Argentina is the last air-naval war between near-peer powers. The gap of more than 40 years since then is comparable to the time between the US Civil War and WWI. Naval and air warfare technology advances much faster than land warfare - historically, this was tested through frequent conflicts. Today's unprecedented peace means we're largely guessing which naval technologies and doctrines will actually work. While land warfare in Ukraine looks like 'WWI with drones', naval warfare has likely seen much more dramatic changes.

Naval technology advances create bigger power gaps than land warfare. The Opium Wars showed this dramatically - British steamships simply sailed up Chinese rivers unopposed, forcing humiliating treaties on a land power.

Air warfare technology gaps may be even more extreme than naval ones. Modern F-35s achieve 20:0 kill ratios against previous-generation fighters in exercises.

The Arab-Israeli wars and the Gulf War suggest some lessons about modern air warfare. These conflicts showed that air superiority is typically won or lost very quickly: initial strikes on airbases can be decisive, and most aircraft losses happen on the ground rather than in dogfights. This remains such a concern that it’s US Air Force doctrine to rotate aircraft between airfields. More broadly, these conflicts suggest that air warfare produces more decisive, one-sided outcomes than land battles - when one side gains air superiority, the results can be devastating.

Wild Cards

Drones and the Transparent Battlefield

Drones represent warfare's future, yet both sides underinvest. While the US military has only 10,000 small drones and 400 large ones, Ukraine alone produces 1-4 million drones annually. China leads in mass-producing small drones but lacks integration doctrine. The Ukraine war revealed how modern sensors create a 'transparent battlefield' where hiding large forces is impossible. Drones might make it trivially easy to find (and even destroy) submarines and surface ships.

Submarines

Since WWI, submarines have been the kings of the sea, and it is plausible that they remain dominant. A single torpedo from a submarine can sink an aircraft carrier - in exercises, small diesel-electric submarines regularly 'sink' entire carrier groups. These submarines can hide in sonar dead zones, regions where water temperature and salinity create acoustic blind spots.

Are Aircraft Carriers obsolete?

China now sports hypersonic missiles that, at least in theory, could disable an aircraft carrier from 1500 miles or beyond. On the flip side, missile defense effectiveness has increased dramatically, and hypersonic missile effectiveness may be overstated. As a point of evidence for the remaining importance of aircraft carriers, China is building her own fleet of them.

Military Competence Wildcard:

Peace means we don't know the true combat effectiveness of either military. Authoritarian militaries often suffer from corruption and incompetence - Chinese troops have been caught loading missile launchers with water instead of fuel during exercises [Comment 5: Need source]. But the US military also shows worrying signs: bureaucratic bloat, lack of recent peer conflict experience, and questions about training quality. Both militaries' actual combat effectiveness remains a major unknown. The US Navy now has more admirals than warships.

Stealth bombers and JASSM-ER

We don’t know what the dominant weapon in a real conventional 21st-century naval war between peers would be, but a plausible guess for a game-changing technology is stealth bombers and stealth missiles.

The obscene cost made the B2 stealth bombers even less popular than the ever-more-costly jet fighters, and the project was prematurely halted at 21 platforms. Despite the obscene cost, it’s plausible that the B2 and its younger cousin the B21 are worth all the money and then some.

Unlike fighters, a stealth bomber has something like ‘true stealth’. While a stealth fighter like an F35 is better thought of as a ‘low-observable’ aircraft that is difficult to target-lock by short-wave radar but easily detectable by long-wave radar, the B2 stealth bomber is hard for long-wave radar to detect too. Stealth bombers can also carry air-to-air missiles, so they may even be effective against fighters. Manoeuvrability and speed, long the defining hallmarks of fighters, have become less important with the advent of highly accurate homing missiles.

Lockheed Martin has developed the JASSM-ER, a stealth missile with a range up to 900 miles. A B2 bomber has a range of up to something like 4000 miles. For comparison, the range of fighters is something in the range of 400-1200 miles.

A single hit of a JASSM-ER is probably a mission kill on a naval vessel. A B2 can carry up to 16 of these missiles. This means that a single squadron of stealth bombers taking off from a base in Guam could potentially wipe out half a fleet in a single sortie.

***********

And of course last but not least, the greatest wildcard of them all:

AGI.

I will refrain from speculating on the military implications of AGI.

Clear China Disadvantages, US Advantages:

Amphibious assaults are inherently difficult - A full Taiwan invasion faces massive logistical hurdles. Taiwan could perhaps muster 500,000 defenders under full mobilization, requiring 1.5 million Chinese troops for a successful assault under standard military doctrine. For perspective, D-Day - history's largest amphibious invasion - landed only 133,000 troops.

China's energy vulnerability is significant - China imports 70% of its oil and 25% of its gas by sea. While Russia provides 20-40% of these imports and could increase supply, the US could severely disrupt China's energy access.

China's regional diplomacy has backfired - China has alienated virtually all its neighbours. The US has basing options in Japan, Australia, the Philippines, and across the Pacific islands.

US carrier advantage - The US operates 11 nuclear supercarriers with extensive blue-water experience. China has two smaller carriers active, one in trials, and one nuclear carrier under construction. The big question mark is whether carriers are obsolete.

US stealth bomber advantage - The US leads with 21 B2s and 100 new B21s ordered, while China's H-20 program still lags behind.

US submarine advantage - US submarines are significantly ahead technologically. Putin selling Russian submarine technology might nullify some of that advantage, as might new cheap sea drones. Geographically, it’s hard for Chinese submarines to escape the China seas unnoticed.
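
As a rough check on the amphibious-assault figures above, here is a minimal arithmetic sketch. It assumes that "standard military doctrine" refers to the common 3:1 attacker-to-defender rule of thumb, which is the ratio the 500,000 and 1.5 million figures imply; the function name is illustrative only.

```python
# Rough force-ratio arithmetic using the figures quoted in the comment.
# Assumption: "standard military doctrine" means a 3:1 attacker-to-defender ratio.

def required_attackers(defenders: int, ratio: float = 3.0) -> int:
    """Attackers needed to reach the given attacker-to-defender ratio."""
    return round(defenders * ratio)

taiwan_defenders = 500_000  # full-mobilization estimate from the comment
d_day_landed = 133_000      # D-Day landing figure from the comment

needed = required_attackers(taiwan_defenders)
multiple_of_d_day = needed / d_day_landed

print(f"Attackers needed: {needed:,}")
print(f"Multiple of the D-Day landing force: {multiple_of_d_day:.1f}x")
```

On these numbers, the required force would be roughly eleven times the size of the D-Day landing force.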

Clear China Advantages, US Disadvantages:

Geography favors China - Taiwan lies just 100 miles from mainland China while US forces must cross the Pacific. The massive Chinese Rocket Force can launch thousands of missiles from secure mainland positions.

Advanced missile capabilities - Massive conventional rocket force plus claimed hypersonic missile capabilities [Comment: find skeptic hypersonic missile video]

China has been preparing for many years - China has established numerous artificial islands with airfields throughout the region. They've successfully stolen F35 plans and are producing their own version at scale. The Chinese government has built up enormous national emergency reserves of essential resources in preparation for the (inevitable) conflict. Bringing Taiwan back into the fold has been a primary driver of policy for decades.

US shipbuilding - The US shipbuilding industry has collapsed to just 0.1% of global production, while China, South Korea, and Japan dominate with 35-40%, 25-30%, and 20-25% respectively.

abandon on Habryka's Shortform Feed

TechEmails' substack post with the same emails in a more centralized format includes citations; apparently these are mostly from Elon Musk, et al. v. Samuel Altman, et al. (2024)

abandon on dirk's Shortform

I more-or-less endorse the model described in "larger language models may disappoint you [or, an eternally unfinished draft]", and moreover I think language is an inherently lossy instrument such that the minimally-lossy model won't have perfectly learned the causal processes or whatever behind its production.

akash-wasil on Making a conservative case for alignment

I agree with many points here and have been excited about AE Studio's outreach. Quick thoughts on China/international AI governance:

I think some international AI governance proposals have some sort of "kum ba yah, we'll all just get along" flavor/tone to them, or some sort of "we should do this because it's best for the world as a whole" vibe. This isn't even Dem-coded so much as it is naive-coded, especially in DC circles.
US foreign policy is dominated primarily by concerns about US interests. Other considerations can matter, but they are not the dominant driving force. My impression is that this is true within both parties (with a few exceptions).
I think folks interested in international AI governance should study international security agreements and try to get a better understanding of relevant historical case studies. Lots of stuff to absorb from the Cold War, the Iran Nuclear Deal, US-China relations over the last several decades, etc. (I've been doing this & have found it quite helpful.)
Strong Republican leaders can still engage in bilateral/multilateral agreements that serve US interests. Recall that Reagan negotiated arms control agreements with the Soviet Union, and the (first) Trump Administration facilitated the Abraham Accords. Being "tough on China" doesn't mean "there are literally no circumstances in which I would be willing to sign a deal with China." (But there likely does have to be a clear case that the deal serves US interests, has appropriate verification methods, etc.)

matrice-jacobine on Proposing the Conditional AI Safety Treaty (linkpost TIME)

Fortunately, the existential risks posed by AI are recognized by many close to President-elect Donald Trump. His daughter Ivanka seems to see the urgency of the problem. Elon Musk, a critical Trump backer, has been outspoken about the civilizational risks for many years, and recently supported California’s legislative push to safety-test AI. Even the right-wing Tucker Carlson provided common-sense commentary when he said: “So I don’t know why we’re sitting back and allowing this to happen, if we really believe it will extinguish the human race or enslave the human race. Like, how can that be good?” For his part, Trump has expressed concern about the risks posed by AI, too.

This is a strange contrast with the rest of the article, considering both Donald and Ivanka Trump's positions are largely informed by the "situational awareness" position arguing that the US should develop AGI before China to ensure US victory over China – which is explicitly the position Tegmark and Leahy argue against (and consider existentially harmful) when they call to stop work on AGI and work on international co-operation to restrict it and develop tool AI instead.

I still see this kind of confusion between the two positions a fair bit and it is extremely strange. It's as if, back in the original Cold War, people couldn't tell the difference between anti-communist hawks and the Bulletin of the Atomic Scientists (let alone anti-war hippies) because technically they both considered the nuclear arms race to be very important for the future of humanity.

abramdemski on o1 is a bad idea

Process supervision would be helpful, particularly if the model is punished for sensitive dependence on paraphrasings of the CoT, or rewarded for producing CoT which is useful even after paraphrasing.

One of the concerns that I have is that (I believe) the LLM before the CoT-training will be sensitive to some non-human-readable patterns, similar to adversarial examples for image recognition. The CoT training will encourage any patterns which help reach the desired conclusions, whether those patterns are human-readable or not.

A lot of possible process-supervision procedures will not do anything to help prevent this. For example, rewarding CoT for being human-understandable won't help, because the reward signal there can miss the subtle patterns being used. It only avoids the outcome where the CoT looks non-human-readable; it doesn't avoid the outcome where the CoT actually contains a bunch of human-unreadable patterns.

Using paraphrasing to avoid this is heavily dependent on the strength of the paraphraser. The paraphraser needs to have really good coverage of alternative ways to convey the same semantic information, so that any non-human-readable patterns get randomized. The paraphraser itself can't be vulnerable to steering by adversarial inputs.
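
A minimal sketch of the paraphrase-robustness check discussed above, under loud assumptions: `paraphrase` and `answer_given_cot` are hypothetical stand-ins for a paraphraser model and for the LLM answering from a chain of thought. Neither is a real API, and the toy versions below exist only to make the sketch runnable.

```python
import random
from typing import Callable

def paraphrase_consistency(
    cot: str,
    answer_given_cot: Callable[[str], str],
    paraphrase: Callable[[str, random.Random], str],
    n_samples: int = 8,
    seed: int = 0,
) -> float:
    """Fraction of paraphrased CoTs that yield the same answer as the original CoT."""
    rng = random.Random(seed)
    original_answer = answer_given_cot(cot)
    matches = sum(
        answer_given_cot(paraphrase(cot, rng)) == original_answer
        for _ in range(n_samples)
    )
    return matches / n_samples

# --- Toy stand-ins (hypothetical, for illustration only) ---

SYNONYMS = {"so": ["thus", "hence"], "sum": ["total"]}

def toy_paraphrase(cot: str, rng: random.Random) -> str:
    # split() collapses runs of whitespace, which erases that surface pattern;
    # words with an entry in SYNONYMS are replaced by a randomly chosen synonym.
    return " ".join(rng.choice(SYNONYMS.get(w, [w])) for w in cot.split())

def robust_model(cot: str) -> str:
    # Reads the human-visible conclusion: the last token of the CoT.
    return cot.split()[-1]

def fragile_model(cot: str) -> str:
    # Keys on a non-human-readable pattern: a double space hidden in the CoT.
    return "4" if "  " in cot else "unknown"

cot = "2 plus 2,  so the sum is 4"  # note the double space after the comma

robust_score = paraphrase_consistency(cot, robust_model, toy_paraphrase)
fragile_score = paraphrase_consistency(cot, fragile_model, toy_paraphrase)
print(robust_score, fragile_score)
```

A score like this could serve as a reward term (or its complement as a penalty) during CoT training. As the comment stresses, it is only as informative as the paraphraser's coverage: the toy paraphraser happens to destroy the whitespace pattern, but a weak paraphraser that preserved it would give the fragile model a perfect score.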

benito on Lao Mein's Shortform

In the email above, clearly stated, is a line of reasoning that has led very competent people to work extremely hard to build potentially-omnicidal machines.

paulbecon on Beyond the Reach of God

in 2015 a Nature paper literally titled “Undecidability of the spectral gap” showed that an important physical quantity—the difference in energy between ground state and the initial excited state of a material—is formally undecidable. It is also triggered by recursion (basically, they encode questions about spectral gaps into spectral gaps). [Quoted from Erik Hoel's 2024 post on Consciousness as a Godel Sentence]

daniel-kokotajlo on 5 ways to improve CoT faithfulness

I encourage you to make this point into a blog post.