LessWrong 2.0 Reader

[link] An Interactive Shapley Value Explainer
James Stephen Brown (james-brown) · 2024-09-28T05:01:21.169Z · comments (9)

Reflections on the Metastrategies Workshop
gw · 2024-10-24T18:30:46.255Z · comments (5)

D&D Sci Coliseum: Arena of Data
aphyer · 2024-10-18T22:02:54.305Z · comments (23)

[link] IAPS: Mapping Technical Safety Research at AI Companies
Zach Stein-Perlman · 2024-10-24T20:30:41.159Z · comments (12)

[link] Programming Refusal with Conditional Activation Steering
Bruce W. Lee (bruce-lee) · 2024-09-11T20:57:08.714Z · comments (0)

Open Thread Fall 2024
habryka (habryka4) · 2024-10-05T22:28:50.398Z · comments (119)

[link] Point of Failure: Semiconductor-Grade Quartz
Annapurna (jorge-velez) · 2024-09-30T15:57:40.495Z · comments (8)

[question] Implications of China's recession on AGI development?
Eric Neyman (UnexpectedValues) · 2024-09-28T01:12:36.443Z · answers+comments (3)

Winners of the Essay competition on the Automation of Wisdom and Philosophy
owencb · 2024-10-28T17:10:04.272Z · comments (3)

[Linkpost] Play with SAEs on Llama 3
Tom McGrath · 2024-09-25T22:35:44.824Z · comments (2)

[link] Intrinsic Power-Seeking: AI Might Seek Power for Power’s Sake
TurnTrout · 2024-11-19T18:36:20.721Z · comments (5)

2025 Color Trends
sarahconstantin · 2024-10-07T21:20:03.962Z · comments (7)

instruction tuning and autoregressive distribution shift
nostalgebraist · 2024-09-05T16:53:41.497Z · comments (5)

Are we dropping the ball on Recommendation AIs?
Charbel-Raphaël (charbel-raphael-segerie) · 2024-10-23T17:48:00.000Z · comments (17)

Signaling with Small Orange Diamonds
jefftk (jkaufman) · 2024-11-07T20:20:08.026Z · comments (1)

Monthly Roundup #23: October 2024
Zvi · 2024-10-16T13:50:05.869Z · comments (13)

[link] FrontierMath: A Benchmark for Evaluating Advanced Mathematical Reasoning in AI
Tamay · 2024-11-14T06:13:22.042Z · comments (0)

[link] College technical AI safety hackathon retrospective - Georgia Tech
yix (Yixiong Hao) · 2024-11-15T00:22:53.159Z · comments (2)

Anthropic rewrote its RSP
Zach Stein-Perlman · 2024-10-15T14:25:12.518Z · comments (19)

How to use bright light to improve your life.
Nat Martin (nat-martin) · 2024-11-18T19:32:10.667Z · comments (9)

0.202 Bits of Evidence In Favor of Futarchy
niplav · 2024-09-29T21:57:59.896Z · comments (0)

[link] AISafety.info: What is the "natural abstractions hypothesis"?
Algon · 2024-10-05T12:31:14.195Z · comments (2)

[link] Generative ML in chemistry is bottlenecked by synthesis
Abhishaike Mahajan (abhishaike-mahajan) · 2024-09-16T16:31:34.801Z · comments (2)

Book Review: On the Edge: The Business
Zvi · 2024-09-25T12:20:06.230Z · comments (0)

[link] Characterizing stable regions in the residual stream of LLMs
Jett Janiak (jett) · 2024-09-26T13:44:58.792Z · comments (4)

[link] An X-Ray is Worth 15 Features: Sparse Autoencoders for Interpretable Radiology Report Generation
hugofry · 2024-10-07T08:53:14.658Z · comments (0)

AI Safety Camp 10
Robert Kralisch (nonmali-1) · 2024-10-26T11:08:09.887Z · comments (9)

Open Source Replication of Anthropic’s Crosscoder paper for model-diffing
Connor Kissane (ckkissane) · 2024-10-27T18:46:21.316Z · comments (4)

Drug development costs can range over two orders of magnitude
rossry · 2024-11-03T23:13:17.685Z · comments (0)

Compelling Villains and Coherent Values
Cole Wyeth (Amyr) · 2024-10-06T19:53:47.891Z · comments (4)

Eye contact is effortless when you’re no longer emotionally blocked on it
Chipmonk · 2024-09-27T21:47:01.970Z · comments (24)

OODA your OODA Loop
Raemon · 2024-10-11T00:50:48.119Z · comments (3)

Exploring SAE features in LLMs with definition trees and token lists
mwatkins · 2024-10-04T22:15:28.108Z · comments (5)

The murderous shortcut: a toy model of instrumental convergence
Thomas Kwa (thomas-kwa) · 2024-10-02T06:48:06.787Z · comments (0)

Free Will and Dodging Anvils: AIXI Off-Policy
Cole Wyeth (Amyr) · 2024-08-29T22:42:24.485Z · comments (12)

LASR Labs Spring 2025 applications are open!
Erin Robertson · 2024-10-04T13:44:20.524Z · comments (0)

[link] A Percentage Model of a Person
Sable · 2024-10-12T17:55:07.560Z · comments (3)

Glitch Token Catalog - (Almost) a Full Clear
Lao Mein (derpherpize) · 2024-09-21T12:22:16.403Z · comments (3)

I'm creating a deep dive podcast episode about the original Leverage Research - would you like to take part?
spencerg · 2024-09-22T14:03:22.164Z · comments (2)

A New Class of Glitch Tokens - BPE Subtoken Artifacts (BSA)
Lao Mein (derpherpize) · 2024-09-20T13:13:26.181Z · comments (7)

Distinguish worst-case analysis from instrumental training-gaming
Olli Järviniemi (jarviniemi) · 2024-09-05T19:13:34.443Z · comments (0)

COT Scaling implies slower takeoff speeds
Logan Zoellner (logan-zoellner) · 2024-09-28T16:20:00.320Z · comments (56)

My disagreements with "AGI ruin: A List of Lethalities"
Noosphere89 (sharmake-farah) · 2024-09-15T17:22:18.367Z · comments (46)

Doing Research Part-Time is Great
casualphysicsenjoyer (hatta_afiq) · 2024-11-22T19:01:15.542Z · comments (7)

[link] Big tech transitions are slow (with implications for AI)
jasoncrawford · 2024-10-24T14:25:06.873Z · comments (16)

Is the Power Grid Sustainable?
jefftk (jkaufman) · 2024-10-26T02:30:06.612Z · comments (38)

[link] My Model of Epistemology
adamShimi · 2024-08-31T17:01:45.472Z · comments (0)

Video and transcript of presentation on Otherness and control in the age of AGI
Joe Carlsmith (joekc) · 2024-10-08T22:30:38.054Z · comments (1)

Monthly Roundup #22: September 2024
Zvi · 2024-09-17T12:20:08.297Z · comments (10)

Open Problems in AIXI Agent Foundations
Cole Wyeth (Amyr) · 2024-09-12T15:38:59.007Z · comments (2)


Recent comments

christiankl on Repeal the Jones Act of 1920

If you don't think it relates to the question at hand, why did you bring up the point in the first place?

I think you are too focused on Trump (likely because the media likes to focus on Trump) and not on what a successful campaign to repeal the act would look like. It's unlikely that Trump makes it his agenda, but that's not required given that the legislature is independent from the executive.

q-home on Making a conservative case for alignment

I'll describe my general thoughts, like you did.

I think about transness in a similar way to how I think about homo/bisexuality.

1. If homo/bisexuality is outlawed, people are gonna suffer. Bad.
2. If I could erase homo/bisexuality from existence without creating suffering, I wouldn't anyway. Would be a big violation of people's freedom to choose their identity and actions (even if in practice most people don't actually "choose" to be homo/bisexual).
3. Different people have homo/bisexuality of different "strength" and form. One man might fall in love with another man, but dislike sex or even kissing. Maybe he isn't a real homosexual, if he doesn't need to prove it physically? Another man might identify as a bisexual, but be in a relationship with a woman... he doesn't get to prove his bisexuality (sexually or romantically). Maybe we shouldn't trust him unless he walks the talk? As a result of all such situations, we might have certain "inconsistencies": some people identifying as straight have done more "gay" things than people identifying as gay. My opinion on this? I think all of this is OK. Pushing for an "objective gay test" would be dystopian and suffering-inducing. I don't think it's an empirical matter (unless we choose it to be, which is a value-laden choice). Even if it was, we might be very far away from resolving it. So just respecting people's self-identification in the meantime is best, I believe. Moreover, a lot of this is very private information anyway. Less reason to try measuring it "objectively".

My thoughts about transness specifically:

1. We strive for gender equality (I hope). Which makes the concept of gender less important for society as a whole.
2. The concept of gender is additionally damaged by all the things a person can decide to do in their social/sexual life. For example, take an "assigned male at birth" (AMAB) person. AMAB can appear and behave very feminine without taking hormones. Or vice-versa (take hormones, get a pair of boobs, but present masculine). Additionally there are different degrees of medical transition and different types of sexual preferences.
3. A lot of things which make someone more or less similar to a man/woman (behavior with friends, behavior with romantic partners, behavior with sexual partners, thoughts) are private. Less reason to try measuring those "objectively".
4. I have a choice to respect people's self-identified genders or not. I decide to respect them. Not just because I care about people's feelings, but also because of points 1 & 2 & 3 and because of my general values (I show similar respect to homo/bisexuals). So I respect pronouns, but on top of that I also respect if someone identifies as a man/woman/nonbinary. I believe respect is optimal in terms of reducing suffering and adhering to human values.

When I compare your opinion to mine, most of my confusion is about two things: what exactly do you see as an empirical question? how does the answer (or its absence) affect our actions?

Zack insists that Blanchard is right, and that I fail at rationality if I disagree with him. People on Twitter and Reddit insist that Blanchard is wrong, and that I fail at being a decent human if I disagree with them. My opinion is that I have no comparative advantage at figuring out who is right and who is wrong on this topic, or maybe everyone is wrong, anyway it is an empirical question and I don't have the data. I hope that people who have more data and better education will one day sort it out, but until that happens, my position firmly remains "I don't know (and most likely neither do you), stop bothering me".

I think we need to be careful to not make a false equivalence here:

1. Trans people want us to respect their pronouns and genders.
2. I'm not very familiar with Blanchard; so far it seems to me that Blanchard's work is (a) just a typology for predicting certain correlations and (b) sometimes used to argue that trans people are mistaken about their identities/motivations.

2A is kinda tangential to 1. So is this really a case of competing theories? I think uncertainty should make one skeptical of the implications of Blanchard's work rather than make one skeptical about respecting trans people.

(Note that this is about the representatives, not the people being represented. Two trans people can have different opinions, but you are required to believe the woke one and oppose the non-woke one.) Otherwise, you are transphobic. I completely reject that.

Two homo/bisexuals can have different opinions on what "true homo/bisexuality" is, too. Some opinions can be pretty negative. Yes, that's inconvenient, but that's just an expected course of events.

Shortly: disagreement is not hate. But it often gets conflated, especially in environments that overwhelmingly contain people of one political tribe.

I feel it's just the nature of some political questions. Not in all questions, and not in all spaces, can you treat disagreement as something benign.

But if there is a person who actually feels dysphoria from not being addressed as "ve" (someone who would be triggered by calling them any of: "he", "she", or "they"), then I believe that this is between them and their psychiatrist, and I want to be left out of this game.

Agree. Also agree that lynching for accidental misgendering is bad.

(That's when you get the "attack helicopters" as an attempt to point out the absurdity of the system.)

I'm pretty sure the helicopter argument began as an argument against trans people, not as an argument against weird-ass novel pronouns.

lao-mein on Lao Mein's Shortform

I found a good summary of OpenAI's nonprofit restructuring.

will-taylor on Counting AGIs

Plateau: There may be unexpected development plateaus that come into effect at around human-level intelligence. These plateaus could be architecture-specific (scaling laws break down; getting past AGI requires something outside the deep learning paradigm) or fundamental to the nature of machine intelligence.

That doesn’t prevent any of those four things I mentioned: it doesn’t prevent (1) the AGIs escaping control and self-reproducing, nor (2) the code / weights leaking or getting stolen, nor (3) other companies reinventing the same thing, nor (4) the AGI company (or companies) having an ability to transform compute into profits at a wildly higher exchange rate than any other compute customer, and thus making unprecedented amounts of money off their existing models, and thus buying more and more compute to run more and more copies of their AGI

It doesn't prevent (1) but it does make it less likely. A 'barely general' AGI is less likely to be able to escape control than an ASI. It doesn't prevent (2). We acknowledge (3) in section IV: "We can also incorporate multiple firms or governments building AGI, by multiplying the initial AGI population by the number of such additional AGI projects. For example, 2x if we believe China and the US will be the only two projects, or 3x if we believe OpenAI, Anthropic, and DeepMind each achieve AGI." We think there are likely to be a small number of companies near the frontier, so this is likely to be a modest multiplier. Re. (4), I think ryan_b made relevant points. I would expect some portion of compute to be tied up in long-term contracts. I agree that I would expect the developer of AGI to be able to increase their access to compute over time, but it's not obvious to me how fast that would be.

Pause: Government intervention could pause frontier AI development. Such a pause could be international. It is plausible that achieving or nearly achieving an AGI system would constitute exactly the sort of catalyzing event that would inspire governments to sharply and suddenly restrict frontier AI development.

That definitely doesn’t prevent (1) or (2), and it probably doesn’t prevent (3) or (4) either depending on implementation details.

I mostly agree on this one, though again I think it makes (1) less likely for the same reason. As you say, the implementation details matter for (3) and (4), and it's not clear to me that it 'probably' wouldn't prevent them. It might be that a pause would target all companies near the frontier, in which case we could see a freeze at AGI for its developer, and near AGI for competitors.

Abstention: Many frontier AI firms appear to take the risks of advanced AI seriously, and have risk management frameworks in place (see those of Google DeepMind, OpenAI, and Anthropic). Some contain what Holden Karnofsky calls if-then commitments: “If an AI model has capability X, risk mitigations Y must be in place. And, if needed, we will delay AI deployment and/or development to ensure the mitigations can be present in time.” Commitments to pause further development may kick in at human-level capabilities. AGI firms might avoid recursive self-improvement to avoid existential or catastrophic risks.

That could be relevant to (1,2,4) with luck. As for (3), it might buy a few months, before Meta and the various other firms and projects that are extremely dismissive of the risks of advanced AI catch up to the front-runners.

Again, mostly agreed. I think it's possible that the development of AGI would precipitate a wider change in attitude towards it, including at other developers. Maybe it would be exactly what is needed to make other firms take the risks seriously. Perhaps it's more likely it would just provide a clear demonstration of a profitable path and spur further acceleration though. Again, we see (3) as a modest multiplier.

Windup: There are hard-to-reduce windup times in the production process of frontier AI models. For example, a training run for future systems may run into the hundreds of billions of dollars, consuming vast amounts of compute and taking months of processing. Other bottlenecks, like the time it takes to run ML experiments, might extend this windup period.

That doesn’t prevent any of (1,2,3,4). Again, we’re assuming the AGI already exists, and discussing how many servers will be running copies of it, and how soon. The question of training next-generation even-more-powerful AGIs is irrelevant to that question. Right?

The question of training next-generation even-more-powerful AGIs is relevant to containment, and is therefore relevant to how long a relatively stable period running a 'first generation AGI' might last. It doesn't prevent (2) and (3). It doesn't prevent (4) either, though presumably a next-gen AGI would further increase a company's ability in this regard.

jeremy-gillen on John Fisher's Shortform

These seem right, but more importantly I think it would eliminate investing in new scalable companies. Or dramatically reduce it in the 50% case. So there would be very few new companies created.

(As a side note: Maybe our response to this proposal was a bit cruel. It might have been better to just point toward some econ reading material).

sil-ver on Is the mind a program?

No software/hardware separation in the brain: empirical evidence

I feel like the evidence in this section isn't strong enough to support the conclusion. Neuroscience is like nutrition -- no one agrees on anything, and you can find real people with real degrees and reputations supporting just about any view. Especially if it's something as non-committal as "this mechanism could maybe matter". Does that really invalidate the neuron doctrine? Maybe if you don't simulate ATP, the only thing that changes is that you have gotten rid of an error source. Maybe it changes some isolated neuron firings, but the brain has enough redundancy that it basically computes the same functions.

Or even if it does have a desirable computational function, maybe it's easy to substitute with some additional code.

I feel like the required standard of evidence is to demonstrate that there's a mechanism-not-captured-by-the-neuron-doctrine that plays a major computational role, not just any computational role. (Aren't most people talking about neuroscience still basically assuming that this is not the case?)

We can expect natural selection to result in a web of contingencies between different levels of abstraction.[6]

Mhh yeah I think the plausibility argument has some merit.

lucas-teixeira on Bogdan Ionut Cirstea's Shortform

I'm curious how these claims relate to what's proposed by this paper. (note, I haven't read either in depth)

tristantrim on Seth Herd's Shortform

Do you think people would vibe with it better if it was framed "I may die, but it's a heroic sacrifice to save my home planet from may-as-well-be-an-alien-invasion"? Is it reasonable to characterize general superintelligence as an alien takeover and if it is, would people accept the characterization?

bogdan-ionut-cirstea on Bogdan Ionut Cirstea's Shortform

Claude Sonnet-3.5 New, commenting on the limited scalability of RNNs, when prompted with 'comment on what this would imply for the scalability of RNNs, refering (parts of) the post' and fed https://epoch.ai/blog/data-movement-bottlenecks-scaling-past-1e28-flop (relevant to opaque reasoning, out-of-context reasoning, scheming):

'Based on the article's discussion of data movement bottlenecks, RNNs (Recurrent Neural Networks) would likely face even more severe scaling challenges than Transformers for several reasons:

Sequential Nature: The article mentions pipeline parallelism and discusses how sequential dependencies affect scaling. From the text:

Pipeline parallelism: Different GPUs handle different sequential blocks, or layers, of a Transformer. The output of one GPU is fed to the next as an input sequentially until an input makes it through all of the blocks of the model.

RNNs have even stronger sequential dependencies than Transformers, as each token's processing depends directly on the previous token's hidden state. This would make them much harder to parallelize effectively.

Latency Wall Impact: The article notes that:

The basic reason behind the latency wall is that a bigger model typically requires more gradient steps during training, and if these have to be done in a fixed time window, this means each gradient step must take a shorter time as we scale the models we're training.

RNNs would hit this latency wall even harder because their sequential processing means they can't take advantage of some of the parallelization strategies discussed in the article, particularly tensor parallelism which requires:

after every one or two matrix multiplications (depending on the exact implementation) the GPUs need to synchronize results with each other.

Batch Size Scaling: The article suggests that one potential solution to scaling limits is:

If we can find a way to aggressively scale batch sizes together with model sizes, we can push out the latency wall.

However, RNNs are typically trained with truncated backpropagation through time, which limits how much you can effectively increase the batch size without losing long-range dependencies. This would make it harder to apply this particular solution to RNNs.

This analysis suggests that while Transformers are approaching scaling limits around 2e28 FLOP, RNNs would likely hit prohibitive scaling bottlenecks at significantly lower compute levels due to their inherently sequential nature and limited parallelization options.'

spectrumdt on Hell is wasted on the evil

or you don't really know yourself well

Why do you think that?