LessWrong 2.0 Reader

I changed my mind about orca intelligence
Towards_Keeperhood (Simon Skade) · 2025-03-18T10:15:29.860Z · comments (24)

Equations Mean Things
abstractapplic · 2025-03-19T08:16:35.312Z · comments (10)

On (Not) Feeling the AGI
Zvi · 2025-03-25T14:30:02.215Z · comments (25)

A collection of approaches to confronting doom, and my thoughts on them
Ruby · 2025-04-06T02:11:31.271Z · comments (9)

Metacognition Broke My Nail-Biting Habit
Rafka · 2025-03-16T12:36:47.437Z · comments (20)

Interpreting Complexity
Maxwell Adam (intern) · 2025-03-14T04:52:32.103Z · comments (7)

[link] Intelsat as a Model for International AGI Governance
rosehadshar · 2025-03-13T12:58:11.692Z · comments (0)

[question] Why do many people who care about AI Safety not clearly endorse PauseAI?
humnrdble · 2025-03-30T18:06:32.426Z · answers+comments (39)

We Have No Plan for Preventing Loss of Control in Open Models
Andrew Dickson · 2025-03-10T15:35:12.597Z · comments (11)

Silly Time
jefftk (jkaufman) · 2025-03-21T12:30:08.560Z · comments (2)

Alignment faking CTFs: Apply to my MATS stream
joshc (joshua-clymer) · 2025-04-04T16:29:02.070Z · comments (0)

Tabula Bio: towards a future free of disease (& looking for collaborators)
mpoon (michael-poon) · 2025-03-23T16:30:15.523Z · comments (14)

AI #108: Straight Line on a Graph
Zvi · 2025-03-20T13:50:00.983Z · comments (5)

Notes on countermeasures for exploration hacking (aka sandbagging)
ryan_greenblatt · 2025-03-24T18:39:36.665Z · comments (4)

An Advent of Thought
Kaarel (kh) · 2025-03-17T14:21:08.765Z · comments (8)

[link] Automated Researchers Can Subtly Sandbag
gasteigerjo · 2025-03-26T19:13:26.879Z · comments (0)

Follow me on TikTok
lsusr · 2025-04-01T08:22:29.521Z · comments (8)

AI #109: Google Fails Marketing Forever
Zvi · 2025-03-27T14:50:01.825Z · comments (12)

Response to Scott Alexander on Imprisonment
Zvi · 2025-03-11T20:40:06.250Z · comments (4)

[link] Paths and waystations in AI safety
Joe Carlsmith (joekc) · 2025-03-11T18:52:57.772Z · comments (1)

Analyzing long agent transcripts (Docent)
jsteinhardt · 2025-03-24T20:49:54.472Z · comments (2)

[link] Map of all 40 copyright suits v. AI in U.S.
Remmelt (remmelt-ellen) · 2025-03-26T07:57:58.976Z · comments (3)

An overview of control measures
ryan_greenblatt · 2025-03-24T23:16:49.400Z · comments (0)

SHIFT relies on token-level features to de-bias Bias in Bios probes
Tim Hua · 2025-03-19T21:29:15.974Z · comments (2)

We need (a lot) more rogue agent honeypots
Ozyrus · 2025-03-23T22:24:52.785Z · comments (11)

They Took MY Job?
Zvi · 2025-03-21T13:30:38.507Z · comments (4)

LessOnline 2025: Early Bird Tickets On Sale
Ben Pace (Benito) · 2025-03-18T00:22:02.653Z · comments (3)

Notable runaway-optimiser-like LLM failure modes on Biologically and Economically aligned AI safety benchmarks for LLMs with simplified observation format
Roland Pihlakas (roland-pihlakas) · 2025-03-16T23:23:30.989Z · comments (6)

Announcing EXP: Experimental Summer Workshop on Collective Cognition
Jan_Kulveit · 2025-03-15T20:14:47.972Z · comments (2)

Any-Benefit Mindset and Any-Reason Reasoning
silentbob · 2025-03-15T17:10:14.682Z · comments (9)

[link] Three Types of Intelligence Explosion
rosehadshar · 2025-03-17T14:47:46.696Z · comments (8)

Meditation and Reduced Sleep Need
niplav · 2025-04-04T14:42:54.792Z · comments (7)

[link] AI Can't Write Good Fiction
JustisMills · 2025-03-12T06:11:57.786Z · comments (19)

Boots theory and Sybil Ramkin
philh · 2025-03-18T22:10:08.855Z · comments (17)

Split Personality Training: Revealing Latent Knowledge Through Personality-Shift Tokens
Florian_Dietz · 2025-03-10T16:07:45.215Z · comments (3)

Avoid the Counterargument Collapse
marknm · 2025-03-26T03:19:58.655Z · comments (3)

Everything I Know About Semantics I Learned From Music Notation
J Bostock (Jemist) · 2025-03-09T18:09:11.789Z · comments (2)

Why Are The Human Sciences Hard? Two New Hypotheses
Aydin Mohseni (aydin-mohseni) · 2025-03-18T15:45:52.239Z · comments (14)

More Fun With GPT-4o Image Generation
Zvi · 2025-04-03T02:10:02.317Z · comments (2)

The Rise of Hyperpalatability
Jack (jack-3) · 2025-04-02T20:18:04.407Z · comments (10)

FLAKE-Bench: Outsourcing Awkwardness in the Age of AI
annas (annasoli) · 2025-04-01T17:08:25.092Z · comments (0)

[link] Center on Long-Term Risk: Summer Research Fellowship 2025 - Apply Now
Tristan Cook · 2025-03-26T17:29:14.797Z · comments (0)

Is instrumental convergence a thing for virtue-driven agents?
mattmacdermott · 2025-04-02T03:59:20.064Z · comments (37)

Goodhart Typology via Structure, Function, and Randomness Distributions
JustinShovelain · 2025-03-25T16:01:08.327Z · comments (0)

More on Various AI Action Plans
Zvi · 2025-03-24T13:10:05.637Z · comments (0)

Field tests of semi-rationality in Brazilian military training
P. João (gabriel-brito) · 2025-03-12T16:14:12.590Z · comments (0)

When the Wannabe Rambo Comedian Cried
P. João (gabriel-brito) · 2025-03-31T14:47:50.660Z · comments (0)

An overview of areas of control work
ryan_greenblatt · 2025-03-25T22:02:16.178Z · comments (0)

On the Implications of Recent Results on Latent Reasoning in LLMs
Rauno Arike (rauno-arike) · 2025-03-31T11:06:23.939Z · comments (6)

Monthly Roundup #28: March 2025
Zvi · 2025-03-17T12:50:03.097Z · comments (8)

Recent comments

cwillu on Prediction Markets Are Mediocre

30% was already a ridiculously high risk, and should have already been decisive if this was the deciding issue for a voter. You don't have to know that the bullet is in your chamber to decide not to play russian roulette.

fabien-roger on Untrusted monitoring insights from watching ChatGPT play coordination games

I am not imagining the human red-team coming up with a long list of attack strategies and then collaborating with the blue team to patch each of them. I am imagining something more like the blue team having a fairly general strategy (e.g. paraphrasing), and then relying on a property like "the part of solution space the human red-team can cover is roughly as effective at beating the blue-team protocol as the AI solutions". I agree that AIs having different kinds of ideas is a reason to expect this assumption to not hold for some protocols (e.g. for human monitoring), but as discussed here, the human red-team also has a bunch of affordances that make it plausible to get somewhat conservative evals. This seems especially true for things like steganography vs paraphrasing, where I don't see strong reasons to expect AIs to excel relative to humans.

I guess a lot of this is contingent on what AGI will look like. Current AIs have mostly human-like insights. If AIs got very insane high-quality ideas (e.g. like move 37) in the relevant domains as we got closer to AGI, I think this would become more concerning.

There are other hopes like "train AIs to be good at collusion / steganography, and use that as red-team" that suffer less from the concerns discussed here, but they have other problems like exploration hacking.

I am not sure how much we disagree. Given these sorts of difficulties, both in ruling out collusion and in evaluating it, I expect that doing untrusted monitoring well enough to have it reduce risk significantly would likely require a large amount of effort.

j-bostock on Jemist's Shortform

I have added a link to the report now.

As to your point: this is one of the better arguments I've heard that welfare ranges might be similar between animals. Still, I don't think it squares well with the actual nature of the brain. Saying there's a single suffering computation would make sense if the brain was like a CPU, where one core did the thinking, but actually all of the neurons in the brain are firing at once and doing computations at the same time. So it makes much more sense to me to think that the more neurons are computing some sort of suffering, the greater the intensity of suffering.

ryan_greenblatt on Factory farming intelligent minds

I only skimmed this essay and I'm probably more sympathetic to moral patienthood of current AI systems than many, but I think this exact statement is pretty clearly wrong:

Statistically speaking, if you're an intelligent mind that came into existence in the past few years, you're probably running on a large language model.

Among beings which speak in some human language (LLMs or humans), I think most experience moments are human.

I think OpenAI generates roughly 100 billion tokens per day. Let's round up and say that reasonably smart LLMs generate or read a total of ~10 trillion tokens per day (likely an overestimate, I think). Then, let's say that 1 token is equivalent to 1 second of time (also an overestimate; I'd guess more like 10-100 tokens per second, even if I otherwise assigned similar moral weight, which I currently don't). Then, we're looking at 10 trillion seconds of experience moments per day. There are 8 billion humans and 50,000 seconds (while awake) each day, so 400 trillion seconds of experience moments. 400 trillion >> 10 trillion. So, probably the majority of experience moments (among beings which speak some human language) are from humans.
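
The estimate above can be checked with a few lines of arithmetic. This is only a sketch: every number is taken from the comment itself (including the deliberately generous 1 token = 1 second assumption), and the variable and function names are made up for illustration.

```python
# Back-of-envelope comparison of daily "experience seconds",
# using only the figures quoted in the comment above.

LLM_TOKENS_PER_DAY = 10e12      # ~10 trillion tokens/day (the comment's rounded-up guess)
SECONDS_PER_TOKEN = 1.0         # 1 token treated as 1 second (flagged as an overestimate)
HUMANS = 8e9                    # 8 billion people
AWAKE_SECONDS_PER_DAY = 50_000  # waking seconds per person per day


def experience_seconds_per_day(count, seconds_each):
    """Total experience seconds per day for `count` units of `seconds_each`."""
    return count * seconds_each


llm_seconds = experience_seconds_per_day(LLM_TOKENS_PER_DAY, SECONDS_PER_TOKEN)
human_seconds = experience_seconds_per_day(HUMANS, AWAKE_SECONDS_PER_DAY)
ratio = human_seconds / llm_seconds

print(f"LLM:   {llm_seconds:.1e} seconds/day")
print(f"Human: {human_seconds:.1e} seconds/day")
print(f"Human total is about {ratio:.0f}x the LLM total")
```

Under these inputs the human total comes out about 40 times larger. Swapping in the comment's alternative guess of 10-100 tokens per second (that is, `SECONDS_PER_TOKEN` between 0.01 and 0.1) would widen the gap further.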

j-bostock on Jemist's Shortform

Good point, edited a link to the Google Doc into the post.

esrogs on Davey Morse's Shortform

Responding to your parenthetical, the downside of that approach is that the discussion would not be recorded for posterity!

Regarding the original question, I am curious if this could work for a country whose government spending was small enough, e.g. 2-3% of GDP. Maybe the most obvious issue is that no government would be disciplined enough to keep their spending at that level. But it does seem sort of elegant otherwise.

azergante on Distinct Configurations

I liked the intro but some parts of the previous posts and this one have been confusing, for example in this post:

Second, we saw that configurations are about multiple particles. [...] And in the real universe, every configuration is about all the particles… everywhere.)

and more glaring in the previous one:

A configuration says, “a photon here, a photon there,”

Here my intuition is that we can model the world as particles, or we can use the lower-level model of the world which configurations are, but we can't mix both any way we want. These sentences feel backwards: they talk about configurations in terms of particles, and this doesn't make any sense because the model of particles should be derived from the lower-level model of configurations, not the other way around!!

Maybe someone with more knowledge in physics and philosophy of science can clarify?

daniel-amdurer on PauseAI and E/Acc Should Switch Sides

I get that this was a joke, and don't agree with the "AGI doomers should push for faster development" part, but I actually do agree that e/acc would be better off accepting a slowdown now in order to get more powerful AI later.

shash42 on METR: Measuring AI Ability to Complete Long Tasks

These results empirically resolve for me why scaling will continue to be economically rational despite logarithmic gains in (many) benchmark performance. https://www.lesswrong.com/posts/dAYemKXz4JDFQk8QE/log-linear-scaling-is-worth-the-cost-due-to-gains-in-long

michaelharrop on How to Make Superbabies

Oh my god, what a disturbingly overconfident & erroneous comment. Especially coming from someone who has been immersed in science for so many years. I recognize your name from Reddit from over 10 years ago.

Due to Brandolini's law, your comment made me look up how to block users on LessWrong, which apparently isn't possible. I now have to waste a huge amount of time debunking your egregious misinformation. I will only do it this once because, in my experience, people who exhibit this kind of behavior will continue it. So in the future I will simply refer to this exchange as evidence that you are not someone who deserves to be taken seriously or responded to.

On any evidence-based website, your comment is the type that deserves a warning and then a permanent ban if it happens again. That you've had an active account on this website for 15 years makes me want to avoid this website.

The human microbiome is irrelevant to this topic.

The topic is how to make people/babies healthier, better developed, and more intelligent. Anyone who reviews this information should be able to conclude that your statement is ridiculously false:

The microbiome is highly heritable (usual twin studies & SNP heritabilities), and it is caused by genes and the environment, as well as unstable; its direct causal effects in normal humans are minimal.

You started off with largely irrelevant statements and ended with severe misinformation. FMT (fecal microbiota transplant) studies demonstrate causation. You can look through the humanmicrobiome.info wiki or do a literature search to see how many FMT studies there are showing "non-minimal" effects. So much so that a 2020 review said they thought the results were implausible.

Some examples are the plethora of studies showing that the benefits of fasting, the ketogenic diet, and other dietary interventions, are dependent on the gut microbiome, and the benefits can be transferred via FMT. And the same goes for exercise, grip strength, and muscle mass.

We know that it is supremely irrelevant because environmental changes like antibiotics or new food or global travel which produce large changes in personal (and offspring) microbiomes do not produce large changes in intelligence (of oneself or offspring)

Firstly, this is false.

Low-dose penicillin in early life induces long-term changes in murine gut microbiota, brain cytokines and behavior (2017): https://www.nature.com/articles/ncomms15062
Antibiotics that kill gut bacteria also stop growth of new brain cells: https://www.sciencedaily.com/releases/2016/05/160519130105.htm
https://www.nature.com/articles/s41598-021-80982-6 - "These results suggest early antibiotic use may impact the gut-brain axis with the potential for consequences in early life development."
Many more: https://humanmicrobiome.info/antibiotics/#harms-of-antibiotics
The damage done by antibiotics affects the offspring as well: https://humanmicrobiome.info/maternity/#brain-function
You also have to keep in mind that antibiotic use increases the risk for diseases that are known to decrease brain function.

Secondly, antibiotics are one of the biggest threats and degraders of human health and development: https://humanmicrobiome.info/antibiotics/

germ-free humans exist and

I have read that it's not possible to make a germ-free human. What you linked to as evidence for your claim is a person living in a sterile isolator. That prevents him from exposure to new microbes, but it doesn't make him germ-free.

germ-free mice apparently even live longer

From your citation (just adding context): "The reduced early food intake and smaller body weight of adult GF rats may be the reason ad libitum fed GF rats live slightly longer". Here are some studies indicating that "germ-free" has detrimental consequences:

Various health problems: https://archive.is/1Rxak
"Germ-free animals have numerous other immunological defects that may lead to disease, which implicates a role for the microbiota in actively supporting health" https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4095778/
Behavioural and neurochemical consequences of chronic gut microbiota depletion during adulthood in the rat (2016) https://www.sciencedirect.com/science/article/abs/pii/S0306452216305127
Germ-Free Mice Exhibit Mast Cells With Impaired Functionality and Gut Homing (Feb 2019): https://www.frontiersin.org/articles/10.3389/fimmu.2019.00205/full
"Studies have characterized differences in host physiology in germ free and colonized mice, the most striking being the enlarged cecum" (2015) https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4083815/

Most of this page is meaningless mouse studies (infamous for not replicating and getting whatever result the experimenter wants and the animal model literature having huge systemic biases)

Let's take a look at why that is: https://archive.fo/Nzz1Y#selection-1735.10-1735.11 - Summary: It's largely due to microbiome differences.

Here's a quote from the OP post:

Let’s put it all together; if super-SOX works as well in humans as it does in mice, this is how you would make superbabies

So are you dismissing the entire OP post as well because of this?

While animal studies definitely have their limits, it seems extremely erroneous to essentially dismiss an entire area of study since much of the research was done in mice. A lot of mouse research isn't ethical to do on humans.

Most of this page is meaningless mouse studies, and the handful of actual human studies I see here are all garbage

You're dismissing an entire field of tens of thousands of studies. If you're right, you should be spending your time protesting such a massive waste of time and money rather than arguing with some random blog commenter.

breastfeeding where the beneficial effects disappear when controlling for just some confounds

This is false. There are cited studies there that control for confounders and still found benefits, including to intelligence.

much-touted correlations like autism

Again, there are plenty of studies showing causation. https://humanmicrobiome.info/brain/#autism

There's not a single result on this page that provides a shred of evidence for your implied thesis that microbiome interventions could, even in theory, possibly matter to 'how to make superbabies'. It doesn't.

This is such a ridiculous statement. But it'll be up to each reader to review the page and decide for themselves.

He was right. BTW, you remember what happened in 2021, right?

You linked to uBiome, which was a company that sold gut microbiome tests. Your citation of them is irrelevant to your claim that "the microbiome is a fad". If you said "microbiome testing is a fad", that would arguably be accurate, but it's a completely different claim.