LessWrong 2.0 Reader

View: New · Old · Top

Restrict date range: Today · This week · This month · Last three months · This year · All time

← previous page (newer posts) · next page (older posts) →

Zvi’s 2024 In Movies
Zvi · 2025-01-13T13:40:05.488Z · comments (2)

[link] Review: Good Strategy, Bad Strategy
L Rudolf L (LRudL) · 2024-12-21T17:17:04.342Z · comments (0)

Analysis of Global AI Governance Strategies
Sammy Martin (SDM) · 2024-12-04T10:45:25.311Z · comments (10)

Claude's Constitutional Consequentialism?
1a3orn · 2024-12-19T19:53:33.254Z · comments (6)

ARENA 4.0 Impact Report
Chloe Li (chloe-li-1) · 2024-11-27T20:51:54.844Z · comments (3)

Causal inference for the home gardener
braces · 2024-11-27T17:55:52.629Z · comments (1)

[link] Two interviews with the founder of DeepSeek
Cosmia_Nebula · 2024-11-29T03:18:47.246Z · comments (1)

[link] you should probably eat oatmeal sometimes
bhauth · 2024-08-25T14:50:37.570Z · comments (32)

[link] Rowing vs steering
Saul Munn (saul-munn) · 2024-08-10T07:00:17.594Z · comments (2)

Take SCIFs, it’s dangerous to go alone
latterframe · 2024-05-01T08:02:38.067Z · comments (1)

AI #89: Trump Card
Zvi · 2024-11-07T16:30:05.684Z · comments (12)

MATS AI Safety Strategy Curriculum v2
DanielFilan · 2024-10-07T22:44:06.396Z · comments (6)

[link] Soviet comedy film recommendations
Nina Panickssery (NinaR) · 2024-06-09T23:40:58.536Z · comments (11)

Time Efficient Resistance Training
romeostevensit · 2024-10-07T15:15:44.950Z · comments (10)

Startup Success Rates Are So Low Because the Rewards Are So Large
AppliedDivinityStudies (kohaku-none) · 2024-10-10T20:22:01.557Z · comments (6)

A Robust Natural Latent Over A Mixed Distribution Is Natural Over The Distributions Which Were Mixed
johnswentworth · 2024-08-22T19:19:28.940Z · comments (4)

US Presidential Election: Tractability, Importance, and Urgency
kuhanj · 2024-05-29T23:52:22.420Z · comments (2)

[link] IAPS: Mapping Technical Safety Research at AI Companies
Zach Stein-Perlman · 2024-10-24T20:30:41.159Z · comments (13)

Debate: Get a college degree?
Ben Pace (Benito) · 2024-08-12T22:23:34.744Z · comments (14)

Australian AI Safety Forum 2024
Liam Carroll (liam-carroll) · 2024-09-27T00:40:11.451Z · comments (0)

List your AI X-Risk cruxes!
Aryeh Englander (alenglander) · 2024-04-28T18:26:19.327Z · comments (7)

[link] Characterizing stable regions in the residual stream of LLMs
Jett Janiak (jett) · 2024-09-26T13:44:58.792Z · comments (4)

Trust as a bottleneck to growing teams quickly
benkuhn · 2024-07-13T18:00:04.579Z · comments (3)

Evidential Cooperation in Large Worlds: Potential Objections & FAQ
Chi Nguyen · 2024-02-28T18:58:25.688Z · comments (5)

Goals selected from learned knowledge: an alternative to RL alignment
Seth Herd · 2024-01-15T21:52:06.170Z · comments (18)

[link] We Need Major, But Not Radical, FDA Reform
Maxwell Tabarrok (maxwell-tabarrok) · 2024-02-24T16:54:33.061Z · comments (12)

D&D.Sci Alchemy: Archmage Anachronos and the Supply Chain Issues
aphyer · 2024-06-07T19:02:06.859Z · comments (16)

Unit economics of LLM APIs
dschwarz · 2024-08-27T16:51:22.692Z · comments (0)

[link] An Interactive Shapley Value Explainer
James Stephen Brown (james-brown) · 2024-09-28T05:01:21.169Z · comments (9)

Case studies on social-welfare-based standards in various industries
HoldenKarnofsky · 2024-06-20T13:33:44.780Z · comments (0)

Housing Roundup #7
Zvi · 2024-03-04T15:00:08.192Z · comments (1)

[link] Things I learned talking to the new breed of scientific institution
Abhishaike Mahajan (abhishaike-mahajan) · 2024-08-29T14:00:14.844Z · comments (6)

Estimating efficiency improvements in LLM pre-training
Daan · 2024-01-19T19:32:45.124Z · comments (3)

Paper Summary: The Effects of Communicating Uncertainty on Public Trust in Facts and Numbers
Jeffrey Heninger (jeffrey-heninger) · 2024-07-09T16:50:05.776Z · comments (2)

[link] Post series on "Liability Law for reducing Existential Risk from AI"
Nora_Ammann · 2024-02-29T04:39:50.557Z · comments (1)

Wholesomeness and Effective Altruism
owencb · 2024-02-28T20:28:22.175Z · comments (3)

[question] What rationality failure modes are there?
Ulisse Mini (ulisse-mini) · 2024-01-19T09:12:57.924Z · answers+comments (11)

How I internalized my achievements to better deal with negative feelings
Raymond Koopmanschap · 2024-02-27T15:10:24.149Z · comments (7)

Examining Language Model Performance with Reconstructed Activations using Sparse Autoencoders
Evan Anders (evan-anders) · 2024-02-27T02:43:22.446Z · comments (16)

[link] Surgery Works Well Without The FDA
Maxwell Tabarrok (maxwell-tabarrok) · 2024-01-26T13:31:29.968Z · comments (28)

Protocol evaluations: good analogies vs control
Fabien Roger (Fabien) · 2024-02-19T18:00:09.794Z · comments (10)

Deep and obvious points in the gap between your thoughts and your pictures of thought
KatjaGrace · 2024-02-23T07:30:07.461Z · comments (6)

When fine-tuning fails to elicit GPT-3.5's chess abilities
Theodore Chapman · 2024-06-14T18:50:52.855Z · comments (3)

Formalizing the Informal (event invite)
abramdemski · 2024-09-10T19:22:53.564Z · comments (0)

Notes on Dwarkesh Patel’s Podcast with Sholto Douglas and Trenton Bricken
Zvi · 2024-04-01T19:10:12.193Z · comments (1)

[link] Programming Refusal with Conditional Activation Steering
Bruce W. Lee (bruce-lee) · 2024-09-11T20:57:08.714Z · comments (0)

[link] Jailbreak steering generalization
Sarah Ball · 2024-06-20T17:25:24.110Z · comments (4)

A Teacher vs. Everyone Else
ronak69 · 2024-03-21T17:45:35.714Z · comments (8)

(Approximately) Deterministic Natural Latents
johnswentworth · 2024-07-19T23:02:12.306Z · comments (0)

Open Problems in AIXI Agent Foundations
Cole Wyeth (Amyr) · 2024-09-12T15:38:59.007Z · comments (2)

← previous page (newer posts) · next page (older posts) →

Archive

Recent comments

shankar-sivarajan on Meta Pivots on Content Moderation

What kind of madman puts the "Very Left" side of the plot on the right?

philh on (The) Lightcone is nothing without its people: LW + Lighthaven's big fundraiser

Assuming all went well, I just donated £5,000 through the Anglo-American charity, which should become about (£5000 * 1.25 * 96% = £6000 ≈ $7300) to lightcone.

I had further questions to their how to give page, so:

You can return the forms by email, no need to post them. (I filled them in with Firefox's native "draw/write on this pdf" feature, handwriting my signature with a mouse.)
If donating by bank transfer, you send the money to "anglo-american charity limited", not "anglo-american charitable foundation".
For lightcone's contact details I asked on LW intercom. Feels rude to put someone's phone number here, so if you're doing the same as me, I'm not gonna save you that step.

robert-miles on Everywhere I Look, I See Kat Woods

You understand you can just block her on Reddit and Facebook, and move on with your life?

cole-wyeth on Timaeus is hiring researchers & engineers

What do you mean by "geometry of program synthesis"?

charlie-steiner on The quantum red pill or: They lied to you, we live in the (density) matrix

When you say there's "no such thing as a state," or "we live in a density matrix," these are statements about ontology: what exists, what's real, etc.

Density matrices use the extra representational power they have over states to encode a probability distribution over states. If we regard the probabilistic nature of measurements as something to be explained, putting the probability distribution directly into the thing we live in is what I mean by "explain with ontology."

Epistemology is about how we know stuff. If we start with a world that does not inherently have a probability distribution attached to it, but obtain a probability distribution from arguments about how we know stuff, that's "explain with epistemology."

In quantum mechanics, this would look like talking about anthropics, or what properties we want a measure to satisfy, or solomonoff induction and coding theory.

What good is it to say things are real or not? One useful application is predicting the character of physical law. If something is real, then we might expect it to interact with other things. I do not expect the probability distribution of a mixed state to interact with other things.

seth-herd on Vincent Fagot's Shortform

Whew! That's pretty intense, and pretty smart. I didn't read it all because I don't have time and I'm not in the same emotional position you're in.

I do want to say that I've thought an awful lot and researched a lot about the situation we're in with AI and as a world and a society. I feel a lot more uncertainty than you seem to about outcomes. AI and AGI are going to create very, very major changes. There's a ton of risk of different sorts, but there's also a very real chance (relative to what we reliably know) that things get way, way better after AGI, if we can align it and manage the power distribution issues. And I think there are very plausible routes to both of those.

This is primarily based on uncertainty. I've looked in depth at the arguments for pessimism about both alignment and societal power structures. They are just as incomplete and vibes-based as the arguments for optimism. There's a lot of real substance on both sides, but not enough to draw firm conclusions. Some very well informed people think we're doomed, other equally well-informed people think odds are in favor of a vastly better future. We simply don't know how this is going to turn out.

There is still time to hope, and to help.

See my If we solve alignment, do we die anyway? [LW · GW] and the other posts and sources I link there for more on all of these claims.

double on Patent Trolling to Save the World

That's fair enough and a good point.

I think that the key difference is that in the case of profitable-but-bad technologies, someone, somewhere, will probably invent them because there's great incentive to do so.

In the case of gain-of-function, if there stops being grants and the academics who do it become pariahs, then the incentive to do the gain-of-function research is gone.

legionnaire on Legionnaire's Shortform

Well that puts my concern to rest. Thanks!

the-gears-to-ascension on Daniel Tan's Shortform

Partially agreed. I've tested this a little personally; Claude successfully predicted their own success probability on some programming tasks, but was unable to report their own underlying token probabilities. The former tests weren't that good, the latter ones somewhat were okay, I asked Claude to say the same thing across 10 branches and then asked a separate thread of Claude, also downstream of the same context, to verbally predict the distribution.

legionnaire on Numberwang: LLMs Doing Autonomous Research, and a Call for Input

Would also love to take the tests. If possible you could grab human test subjects from certain areas: a less wrong group, a reddit group, etc.