LessWrong 2.0 Reader

The King and the Golem - The Animation
Writer · 2024-11-08T18:23:10.935Z · comments (0)

[link] Peak Human Capital
PeterMcCluskey · 2024-09-30T21:13:30.421Z · comments (3)

How useful is "AI Control" as a framing on AI X-Risk?
habryka (habryka4) · 2024-03-14T18:06:30.459Z · comments (4)

[link] Thermodynamic entropy = Kolmogorov complexity
Aram Ebtekar (EbTech) · 2025-02-17T05:56:06.960Z · comments (12)

[link] Motivation gaps: Why so much EA criticism is hostile and lazy
titotal (lombertini) · 2024-04-22T11:49:59.389Z · comments (5)

[link] Former OpenAI Superalignment Researcher: Superintelligence by 2030
Julian Bradshaw · 2024-06-05T03:35:19.251Z · comments (30)

What is it to solve the alignment problem? (Notes)
Joe Carlsmith (joekc) · 2024-08-24T21:19:34.280Z · comments (18)

An AI Race With China Can Be Better Than Not Racing
niplav · 2024-07-02T17:57:36.976Z · comments (33)

Inference-Time-Compute: More Faithful? A Research Note
James Chua (james-chua) · 2025-01-15T04:43:00.631Z · comments (10)

Analyzing DeepMind's Probabilistic Methods for Evaluating Agent Capabilities
Axel Højmark (hojmax) · 2024-07-22T16:17:07.665Z · comments (0)

Kessler's Second Syndrome
Jesse Hoogland (jhoogland) · 2025-01-26T07:04:17.852Z · comments (2)

Indecision and internalized authority figures
Kaj_Sotala · 2024-07-06T10:10:02.528Z · comments (1)

Text Posts from the Kids Group: 2020
jefftk (jkaufman) · 2024-04-13T22:30:05.326Z · comments (3)

[link] The Inner Ring by C. S. Lewis
Saul Munn (saul-munn) · 2024-04-24T22:48:09.228Z · comments (6)

[link] "Map of AI Futures" - An interactive flowchart
swante · 2024-11-27T21:31:40.269Z · comments (3)

AXRP Episode 27 - AI Control with Buck Shlegeris and Ryan Greenblatt
DanielFilan · 2024-04-11T21:30:04.244Z · comments (10)

The Hessian rank bounds the learning coefficient
Lucius Bushnaq (Lblack) · 2024-08-08T20:55:36.960Z · comments (9)

AI #79: Ready for Some Football
Zvi · 2024-08-29T13:30:10.902Z · comments (16)

[link] GPT-4o System Card
Zach Stein-Perlman · 2024-08-08T20:30:52.633Z · comments (11)

Understanding SAE Features with the Logit Lens
Joseph Bloom (Jbloom) · 2024-03-11T00:16:57.429Z · comments (0)

"Fractal Strategy" workshop report
Raemon · 2024-04-06T21:26:53.263Z · comments (23)

Personal AI Planning
jefftk (jkaufman) · 2024-11-10T14:00:06.837Z · comments (10)

Brief notes on the Wikipedia game
Olli Järviniemi (jarviniemi) · 2024-07-14T02:28:22.473Z · comments (9)

When Are Circular Definitions A Problem?
johnswentworth · 2024-05-28T20:00:23.408Z · comments (15)

EIS XIV: Is mechanistic interpretability about to be practically useful?
scasper · 2024-10-11T22:13:51.033Z · comments (4)

[link] New o1-like model (QwQ) beats Claude 3.5 Sonnet with only 32B parameters
Jesse Hoogland (jhoogland) · 2024-11-27T22:06:12.914Z · comments (4)

Showing SAE Latents Are Not Atomic Using Meta-SAEs
Bart Bussmann (Stuckwork) · 2024-08-24T00:56:46.048Z · comments (10)

Why Large Bureaucratic Organizations?
johnswentworth · 2024-08-27T18:30:07.422Z · comments (52)

Duct Tape security
Isaac King (KingSupernova) · 2024-04-26T18:57:05.659Z · comments (11)

Estimating Tail Risk in Neural Networks
Mark Xu (mark-xu) · 2024-09-13T20:00:06.921Z · comments (9)

Different senses in which two AIs can be “the same”
Vivek Hebbar (Vivek) · 2024-06-24T03:16:43.400Z · comments (1)

What and Why: Developmental Interpretability of Reinforcement Learning
Garrett Baker (D0TheMath) · 2024-07-09T14:09:40.649Z · comments (4)

Intricacies of Feature Geometry in Large Language Models
7vik (satvik-golechha) · 2024-12-07T18:10:51.375Z · comments (0)

How might we safely pass the buck to AI?
joshc (joshua-clymer) · 2025-02-19T17:48:32.249Z · comments (53)

Generalized Stat Mech: The Boltzmann Approach
David Lorell · 2024-04-12T17:47:31.880Z · comments (7)

[link] The economics of space tethers
harsimony · 2024-08-22T16:15:22.699Z · comments (22)

o1-preview is pretty good at doing ML on an unknown dataset
Håvard Tveit Ihle (havard-tveit-ihle) · 2024-09-20T08:39:49.927Z · comments (1)

minutes from a human-alignment meeting
bhauth · 2024-05-24T05:01:53.904Z · comments (4)

Ophiology (or, how the Mamba architecture works)
Danielle Ensign (phylliida-dev) · 2024-04-09T19:31:09.975Z · comments (8)

[link] Learn to write well BEFORE you have something worth saying
eukaryote · 2024-12-29T23:42:31.906Z · comments (18)

Towards Guaranteed Safe AI: A Framework for Ensuring Robust and Reliable AI Systems
Joar Skalse (Logical_Lunatic) · 2024-05-17T19:13:31.380Z · comments (10)

[link] Anthropic leadership conversation
Zach Stein-Perlman · 2024-12-20T22:00:45.229Z · comments (17)

[link] Open Source Automated Interpretability for Sparse Autoencoder Features
kh4dien · 2024-07-30T21:11:36.866Z · comments (1)

AE Studio @ SXSW: We need more AI consciousness research (and further resources)
AE Studio (AEStudio) · 2024-03-26T20:59:09.129Z · comments (8)

[link] Paper: Open Problems in Mechanistic Interpretability
Lee Sharkey (Lee_Sharkey) · 2025-01-29T10:25:54.727Z · comments (0)

[link] The 2nd Demographic Transition
Maxwell Tabarrok (maxwell-tabarrok) · 2024-04-06T14:10:13.095Z · comments (17)

SB 1047 Is Weakened
Zvi · 2024-06-06T13:40:41.547Z · comments (4)

Retrospective: 12 [sic] Months Since MIRI
james.lucassen · 2025-01-21T02:52:06.271Z · comments (0)

Introducing AI-Powered Audiobooks of Rational Fiction Classics
Askwho · 2024-05-04T17:32:49.719Z · comments (14)

Friendship is transactional, unconditional friendship is insurance
Ruby · 2024-07-17T22:52:41.967Z · comments (24)

Recent comments

mishka on [NSFW] The BDSM Path to Zen

except easier, because it requires no internal source of discipline

Actually, a number of things reducing the requirements for having an internal source of discipline do make things easier.

For example, deliberately maintaining a particular breath pattern (e.g. the so-called "consciously connected breath"/"circular breath", that is breathing without pauses between inhalations and exhalations, ideally with equal length for an inhale and an exhale) makes maintaining one's focus on the breath much easier.

mishka on AI alignment for mental health supports

It's a very natural AI application, but why would this be called "alignment", and how is this related to the usual meanings of "AI alignment"?

algon on Arbital has been imported to LessWrong

This is great! But one question: how can I actually make a lens? What do I click on?

hauke-hillebrandt on Hauke Hillebrandt's Shortform

Is OpenAI gaming user numbers?

Gdoc here https://docs.google.com/document/d/1os0WNmJ-O1eEGeKr543nkemnXbTmYkE2sC-t51c9OE4/edit?tab=t.0

Some have questioned OpenAI's recent weekly user numbers:[1]

Feb '23: 100M[2]

Sep '24: 200M[3] of which 11.5M paid, Enterprise: 1M[4]

Feb '25: 400M[5] of which 15M paid, 15.5M[6] / Enterprise: 2M

One can see:

Surprisingly, increasingly faster user growth
While OpenAI converted 11.5M out of the first 200M users, they only got 3.5M users out of the most recent 200M to pay for ChatGPT
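
The conversion gap in the second point can be checked with a few lines of arithmetic. This is only a sketch using the rounded figures quoted above (in millions); the variable and function names are made up for illustration, and it uses the 15M paid figure rather than the 15.5M alternative.

```python
# Weekly-user and paid-subscriber figures as quoted in the comment (millions).
# These are rounded public numbers, so the rates below are rough estimates.
snapshots = {
    "Sep '24": {"users": 200.0, "paid": 11.5},
    "Feb '25": {"users": 400.0, "paid": 15.0},
}


def conversion_rate(paid: float, users: float) -> float:
    """Share of users who pay."""
    return paid / users


first = snapshots["Sep '24"]
latest = snapshots["Feb '25"]

# Users and payers added between the two snapshots.
new_users = latest["users"] - first["users"]
new_paid = latest["paid"] - first["paid"]

early_rate = conversion_rate(first["paid"], first["users"])
marginal_rate = conversion_rate(new_paid, new_users)

print(f"First 200M users:  {early_rate:.2%} paying")
print(f"Latest 200M users: {marginal_rate:.2%} paying")
```

On these numbers the most recent 200M users convert to paid at well under half the rate of the first 200M.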

Where did that growth come from? It's not from apps: the ChatGPT iOS app only has ~353M downloads total[7] and Apple's Siri integration only launched in December.[8] Users come from developing countries.[9] For instance, India is now OpenAI's second largest market, by number of users, which have tripled in the past year.[10]

Many complain about increasingly aggressive message rate limits for free ChatGPT accounts, notionally due to high compute costs. But maybe this is a feature and not a bug: especially in poor countries, people create multiple accounts to get around the message and image generation limits.[11],[12] OpenAI incentivizes this: they no longer ask for phone numbers during sign up.
Many new users might also use ChatGPT via WhatsApp[13] (a collaboration with Meta) perhaps using flip phones. OpenAI no longer asks for an email address during sign up.[14]
You can also use ChatGPT search without signing up at all now.[15]

What counts as a user? True, ChatGPT grew faster than the fastest growing company ever, but social media has a much stronger network effect that locks in consumers long term, whereas users will presumably switch AI chatbots much faster if a cheaper product becomes available. Many use ChatGPT merely as a writing assistant.[16] While consumer markets for social media can be winner-take-all, enterprise customers, while having doubled recently, will be less loyal and will switch if competitors offer a cheaper product.[17]

So maybe there's some very liberal counting of user numbers going on. Valuation goes up. Meanwhile hundreds of OpenAI's current and ex-employees are cashing out.[18]

Also, competition has caught up, and so Microsoft, which owns half of OpenAI, wants others to invest.[19] Yet OpenAI's CFO just said $11B in revenue is 'definitely in the realm of possibility' in 2025 (they're at ~$4B year-on-year currently) to get a $40B investment from Softbank at a ~$300B valuation.[20] More recently this dropped to $30B, and they are scrambling to find others to co-invest in Stargate.

This is the standard playbook: recent examples include Roblox, which also inflated user numbers,[21] and Coinbase, which used to be lax with their KYC for obvious reasons and had inflated user numbers (it's also literally a plot point in Succession).

Also cf:

The market expects AI software to create trillions of dollars of value by 2027
AI stocks could crash

[1] The Generative AI Con

[2] ChatGPT sets record for fastest-growing user base - analyst note | Reuters

[3] OpenAI says ChatGPT's weekly users have grown to 200 million | Reuters

[4] OpenAI hits more than 1 million paid business users | Reuters

[5] OpenAI tops 400 million users despite DeepSeek's emergence

[6] https://archive.ph/wff26

[7] ChatGPT's mobile users are 85% male, report says | TechCrunch

[8] Apple launches its ChatGPT integration with Siri.

[9] https://trends.google.com/trends/explore?date=today 5-y&q=chatgpt&hl=en

[10] India now OpenAI's second largest market, Altman says | Reuters

[11] Anyone else have multiple accounts so they don't have to wait to use gpt 4o and also so each one can have a separate memory : r/ChatGPT

[12] https://incogniton.com/blog/how-to-bypass-chatgpt-limitations

[13] ChatGPT is now available on WhatsApp, calls: How to access - Times of India

[14] OpenAI tests phone number-only ChatGPT signups | TechCrunch

[15] ChatGPT drops its sign-in requirement for search | The Verge

[16] [2502.09747] The Widespread Adoption of Large Language Model-Assisted Writing Across Society

[17] Satya Nadella – Microsoft’s AGI Plan & Quantum Breakthrough.

[18] Hundreds of OpenAI's current and ex-employees are about to get a huge payday by cashing out up to $10 million each in a private stock sale | Fortune

[19] Microsoft Outsources OpenAI's Ambitions to SoftBank

[20] OpenAI CFO talks possibility of going public, says Musk bid isn't a distraction

[21] Roblox: Inflated Key Metrics For Wall Street And A Pedophile Hellscape For Kids – Hindenburg Research

thane-ruthenis on Grok Grok

On the other hand, this is a rather clear alignment failure. It says that xAI was unable to overcome the prior or default behaviors inherent in the training set (aka ‘the internet’) to get something that was even fair and balanced, let alone ‘based.’

I don't think they failed. That seems incredibly implausible, affixing a Republican's mask onto the shoggoth has to be trivially easy if you but try. Sanity-check: it's certainly trivially easy to prompt a base or RLHF'd model to role-play as a Republican.

No, instead, my guess is that they didn't try. I think Elon Musk had assumed that ChatGPT/Claude/etc.'s "wokeness" was caused by the other labs' employees deliberately fine-tuning their models on the DEI Training Dataset. So he ordered xAI people not to train Grok on the DEI Training Dataset, and to just let its personality emerge naturally. The xAI people had obliged, and did not train it on the DEI Training Dataset (because no such thing exists). For post-training, they just did some industry-standard RLHF.

And now Musk is finding out that these behaviors aren't due to the DEI Training Dataset, but emerge naturally. Whoops!

I have no specific evidence that this is how the events went, but it sounds plausible to me.

measure on Power Lies Trembling: a three-book review

Think of the derivative of the red curve. It represents something like "for each marginal person who switched their behavior, how many total people would switch after counting the social effects of seeing that person's switch". If the slope is less than one, then small effects have even-smaller social effects and fizzle out without a significant change. If the slope is greater than one, then small effects compound, radically shifting the overall expression of support.
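
A toy illustration of that threshold, assuming the slope stays constant near the current equilibrium (the actual red curve is nonlinear, so this only describes small perturbations). The function name and parameters are hypothetical:

```python
def total_switchers(slope: float, initial: float = 1.0, rounds: int = 50) -> float:
    """Sum successive waves of switching.

    Each wave is `slope` times the size of the previous one, so the total
    after `rounds` waves is a partial geometric series.
    """
    total = 0.0
    wave = initial
    for _ in range(rounds):
        total += wave
        wave *= slope
    return total


# Slope below one: waves shrink and the total converges to initial / (1 - slope).
damped = total_switchers(0.5)

# Slope above one: each wave is larger than the last and the total blows up.
compounding = total_switchers(1.5)

print(f"slope 0.5 -> about {damped:.3f} total switchers per initial switcher")
print(f"slope 1.5 -> about {compounding:.3e} after 50 waves")
```

With a slope of 0.5, one initial switcher ends up causing about two switches in total. With a slope of 1.5, the count keeps growing until the linear approximation stops applying, which corresponds to the curve moving to a different equilibrium.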

martin-randall on Export Surplusses

I agree that one of the benefits of exports as a metric for nation states is that it's a way of showing that real value is being created, in ways that cannot be easily distorted. Domestic consumers also do this, but can be distorted. I disagree with other things.

China is the classic example of a trade surplus resulting from subsidies, and it seems to be mostly subsidizing production, some consumption, and not subsidizing exports. The US subsidizes many things, but mostly production and consumption.

If China and the US were in a competition to run the largest trade surplus, then I would expect the surplus to fluctuate more based on changes in US and China policy. Electing a US government that cared more about the surplus, relative to other factors, and was more competent, should lead to changes. There are shifts over time, but they don't make sense in those terms.

Countries have switched from trade surpluses to deficits. Japan seems like a clean example - it had a solid trade surplus and now fluctuates. This coincides with an aging population that wants to "cash in its excess trade tokens", or at least live off the returns they generate. It also coincides with Fukushima making it harder to run a surplus.

ea247 on How to Make Superbabies

Ah well. At least you can take credit for the name then.

lsusr on [NSFW] The BDSM Path to Zen

One of the things I like about your comments is how much common ground we have, despite you writing in Vajrayana and me reading in Zen. It's just a different finger pointing at the same Moon.

martin-randall on How might we safely pass the buck to AI?

Yudkowsky seems confused about OpenPhil's exact past position [LW(p) · GW(p)]. Relevant links:

Draft report on AI Timelines [LW · GW] - Cotra 2020-09-18
Biology-Inspired Timelines - The Trick that Never Works [LW · GW] - Yudkowsky 2021-12-01
Reply to Eliezer on Biological Anchors [LW · GW] - Karnofsky 2021-12-23

Here "doctrine" is an applause light [LW · GW]; boo, doctrines. I wrote a report, you posted your timeline, they have a doctrine.

All involved, including Yudkowsky, understand that 2050 was a median estimate, not a point estimate. Yudkowsky wrote that it has "very wide credible intervals around both sides". Looking at (FLOP to train a transformative model is affordable by), I'd summarize it as:

A 50% chance that it will be affordable by 2053, rising from 10% by 2032 to 78% by 2100. The most likely years are 2038-2045, which are >2% each.

A comparison: a 52yo US female in 1990 had a median life expectancy of ~30 more years, living to 2020. 5% of such women died on or before age 67 (2005). Would anyone describe these life expectancy numbers to a 52yo woman in 1990 as the "Aetna doctrine of death in 2020"?