LessWrong 2.0 Reader

Guide to SB 1047
Zvi · 2024-08-20T13:10:07.408Z · comments (18)

The Mask Comes Off: At What Price?
Zvi · 2024-10-21T23:50:05.247Z · comments (16)

FarmKind's Illusory Offer
jefftk (jkaufman) · 2024-08-09T11:30:07.082Z · comments (5)

[link] If far-UV is so great, why isn't it everywhere?
Austin Chen (austin-chen) · 2024-10-19T18:56:58.910Z · comments (23)

If we solve alignment, do we die anyway?
Seth Herd · 2024-08-23T13:13:10.933Z · comments (68)

Adam Optimizer Causes Privileged Basis in Transformer LM Residual Stream
Diego Caples (diego-caples) · 2024-09-06T17:55:34.265Z · comments (7)

[link] Yoshua Bengio: Reasoning through arguments against taking AI safety seriously
Judd Rosenblatt (judd) · 2024-07-11T23:53:17.187Z · comments (3)

Automation collapse
Geoffrey Irving · 2024-10-21T14:50:54.500Z · comments (10)

Multiplex Gene Editing: Where Are We Now?
sarahconstantin · 2024-07-16T20:50:04.590Z · comments (6)

[link] [Paper] A is for Absorption: Studying Feature Splitting and Absorption in Sparse Autoencoders
chanind · 2024-09-25T09:31:03.296Z · comments (15)

[link] Video lectures on the learning-theoretic agenda
Vanessa Kosoy (vanessa-kosoy) · 2024-10-27T12:01:32.777Z · comments (0)

Analyzing DeepMind's Probabilistic Methods for Evaluating Agent Capabilities
Axel Højmark (hojmax) · 2024-07-22T16:17:07.665Z · comments (0)

[link] Investigating an insurance-for-AI startup
L Rudolf L (LRudL) · 2024-09-21T15:29:10.083Z · comments (0)

The King and the Golem - The Animation
Writer · 2024-11-08T18:23:10.935Z · comments (0)

Estimating Tail Risk in Neural Networks
Mark Xu (mark-xu) · 2024-09-13T20:00:06.921Z · comments (9)

What is it to solve the alignment problem?
Joe Carlsmith (joekc) · 2024-08-24T21:19:34.280Z · comments (17)

Brief notes on the Wikipedia game
Olli Järviniemi (jarviniemi) · 2024-07-14T02:28:22.473Z · comments (9)

The Hessian rank bounds the learning coefficient
Lucius Bushnaq (Lblack) · 2024-08-08T20:55:36.960Z · comments (9)

[link] GPT-4o System Card
Zach Stein-Perlman · 2024-08-08T20:30:52.633Z · comments (11)

AI #79: Ready for Some Football
Zvi · 2024-08-29T13:30:10.902Z · comments (16)

[link] Peak Human Capital
PeterMcCluskey · 2024-09-30T21:13:30.421Z · comments (3)

[link] The economics of space tethers
harsimony · 2024-08-22T16:15:22.699Z · comments (22)

o1-preview is pretty good at doing ML on an unknown dataset
Håvard Tveit Ihle (havard-tveit-ihle) · 2024-09-20T08:39:49.927Z · comments (1)

Why Large Bureaucratic Organizations?
johnswentworth · 2024-08-27T18:30:07.422Z · comments (52)

Indecision and internalized authority figures
Kaj_Sotala · 2024-07-06T10:10:02.528Z · comments (1)

Timaeus is hiring!
Jesse Hoogland (jhoogland) · 2024-07-12T23:42:28.651Z · comments (6)

[link] Open Source Automated Interpretability for Sparse Autoencoder Features
kh4dien · 2024-07-30T21:11:36.866Z · comments (1)

What and Why: Developmental Interpretability of Reinforcement Learning
Garrett Baker (D0TheMath) · 2024-07-09T14:09:40.649Z · comments (4)

An AI Race With China Can Be Better Than Not Racing
niplav · 2024-07-02T17:57:36.976Z · comments (32)

EIS XIV: Is mechanistic interpretability about to be practically useful?
scasper · 2024-10-11T22:13:51.033Z · comments (4)

Friendship is transactional, unconditional friendship is insurance
Ruby · 2024-07-17T22:52:41.967Z · comments (24)

Schelling game evaluations for AI control
Olli Järviniemi (jarviniemi) · 2024-10-08T12:01:24.389Z · comments (5)

How a chip is designed
YM (Yannick_Muehlhaeuser_duplicate0.05902100825326273) · 2024-06-28T08:04:27.392Z · comments (4)

Advice to junior AI governance researchers
Akash (akash-wasil) · 2024-07-08T19:19:07.316Z · comments (1)

Occupational Licensing Roundup #1
Zvi · 2024-10-30T11:00:04.516Z · comments (11)

[link] On Shifgrethor
JustisMills · 2024-10-27T15:30:13.688Z · comments (17)

Fear of centralized power vs. fear of misaligned AGI: Vitalik Buterin on 80,000 Hours
Seth Herd · 2024-08-05T15:38:09.682Z · comments (22)

[Intuitive self-models] 3. The Homunculus
Steven Byrnes (steve2152) · 2024-10-02T15:20:18.394Z · comments (36)

[link] Static Analysis As A Lifestyle
adamShimi · 2024-07-03T18:29:37.384Z · comments (11)

SAEs (usually) Transfer Between Base and Chat Models
Connor Kissane (ckkissane) · 2024-07-18T10:29:46.138Z · comments (0)

Book Review: On the Edge: The Fundamentals
Zvi · 2024-09-23T13:40:11.058Z · comments (3)

[Interim research report] Activation plateaus & sensitive directions in GPT2
StefanHex (Stefan42) · 2024-07-05T17:05:25.631Z · comments (2)

[Intuitive self-models] 4. Trance
Steven Byrnes (steve2152) · 2024-10-08T13:30:41.446Z · comments (6)

[link] A primer on why computational predictive toxicology is hard
Abhishaike Mahajan (abhishaike-mahajan) · 2024-08-19T17:16:37.735Z · comments (2)

AI research assistants competition 2024Q3: Tie between Elicit and You.com
Elizabeth (pktechgirl) · 2024-10-12T15:10:05.417Z · comments (2)

Another argument against utility-centric alignment paradigms
Fiora from Rosebloom · 2024-09-22T07:28:27.856Z · comments (39)

[link] AI, centralization, and the One Ring
owencb · 2024-09-13T14:00:16.126Z · comments (11)

[question] Is cybercrime really costing trillions per year?
Fabien Roger (Fabien) · 2024-09-27T08:44:07.621Z · answers+comments (28)

RTFB: California’s AB 3211
Zvi · 2024-07-30T13:10:03.853Z · comments (2)

[link] Pay-on-results personal growth: first success
Chipmonk · 2024-09-14T03:39:12.975Z · comments (5)

Recent comments

akash-wasil on Daniel Kokotajlo's Shortform

Did they have any points that you found especially helpful, surprising, or interesting? Anything you think folks in AI policy might not be thinking enough about?

(Separately, I hope to listen to these at some point & send reactions if I have any.)

daniel-v on The lying p value

I'm here to say that this is not some property specific to p-values; it is just about the credibility of the communicator.

If someone makes a bunch of errors all the time, especially ones that change their conclusions, then indeed you can't trust them. It turns out (BW11) that $\text{scientists}_{\text{published in better journals}}$ are more credible than $\text{scientists}_{\text{published in worse journals}}$, and the errors they make tend not to change the conclusions of the test (i.e., the chance of drawing a wrong conclusion from their data ("gross error" in BW11) was much lower than the headline rate). Also (admittedly I'm going out on a limb here), it is very possible that the errors that change the conclusion of a particular test do not change the overall conclusion about the general theory. E.g., if theory says X, Y, and Z should happen, and you find support for X and Y and marginal-support-now-not-significant-support-anymore for Z, the theory is still pretty intact unless you really care about using p-values in a binary fashion. If theory says X, Y, and Z should happen, and you find support for X and Y and now-not-significant-support-anymore for Z, that's more of an issue. But given how many tests are in a paper, it's also possible that theory says X, Y, and Z should happen, and you find support for X, Y, and Z, but it turns out your conclusion about W reverses, which may or may not really have something to say about your theory.

I don't think it is wise to throw the baby out with the bathwater.

eggsyntax on eggsyntax's Shortform

But I also find my own understanding to be a bit confused and in need of better sources.

Mine too, for sure.

And agreed, Chollet's points are really interesting. As much as I'm sometimes frustrated with him, I think that ARC-AGI and his willingness to (get someone to) stake substantial money on it has done a lot to clarify the discourse around LLM generality, and also makes it harder for people to move the goalposts and then claim they were never moved.

boris-kashirin on The Online Sports Gambling Experiment Has Failed

Thinking about responsible gambling, something like an up-front long-term commitment should solve a lot of problems? You have to decide right away and lock up the money you are going to spend this month, and that will separate the decision from the impulse to spend.

seth-herd on Current Attitudes Toward AI Provide Little Data Relevant to Attitudes Toward AGI

Those outcomes sound quite plausible.

I'm particularly concerned with polarization. Becoming a political football was the death knell for sensible discussion on climate change, and it could be the same for AGI x-risk. Public belief in climate change actually fell while the evidence mounted. My older post AI scares and changing public beliefs is actually mostly about polarization.

Having the debate become ideologically/politically motivated seems like it wouldn't be good. I'm still really hoping to avoid polarization on AGI x-risk. It does seem like "AI safety", concerns about bias, deepfakes, and harms from interacting with LLMs are already primarily discussed among liberals in the US.

Neither side has started really worrying about job loss, but that would tend to be the liberal side, too, since conservatives are still somewhat more free-market oriented.

While tying concerns about x-risk with calls to slow AI based on mundane harms might seem expedient, I wouldn't take that bargain if it created worse polarization.

I think this is a common attitude among the x-risk worried, especially since it's hard to predict whether a slowdown in the US AGI push would be a net good or bad thing for x-risk.

nathan-helm-burger on What program structures enable efficient induction?

For what it's worth, the human brain (including the cortex) has a fixed modularity. Long range connections are created during fetal development according to genetic rules, and can only be removed, not rerouted or added to.

I believe this is what causes the high degree of functional localization in the cortex.

shankar-sivarajan on gilch's Shortform

Is this a Lisp-to-Python transpiler?

bogdanb on Cryonics is free

You might want to know that I took a look through the site, and was curious, but I just closed the page the moment the “Calculate your contribution” form refused to show me the pricing options unless I gave it an email address.

nathan-helm-burger on eggsyntax's Shortform

I agree with your frustrations; I think his views are somewhat inconsistent and confusing. But I also find my own understanding to be a bit confused and in need of better sources.

I do think the discussion François has in this interview is interesting. He talks about the ways people have tried to apply LLMs to ARC, and I think he makes some good points about the strengths and shortcomings of LLMs on tasks like this.

jjxw on The Online Sports Gambling Experiment Has Failed

Another working job market economics paper out of Stanford attempts to measure the degree to which sports bettors are overly optimistic. Results are largely what you'd expect: people think they're breaking even when they're actually losing by ~7%, and a subset of those people have self-control problems.

Funnily enough, the way I found out about this paper was by being recruited to participate in it through a targeted ad on social media when I took a trip out to Colorado to farm sportsbook new-account sign-up bonuses.