LessWrong 2.0 Reader

Universal Love Integration Test: Hitler
Raemon · 2024-01-10T23:55:35.526Z · comments (65)
[link] Soft Nationalization: how the USG will control AI labs
Deric Cheng (deric-cheng) · 2024-08-27T15:11:14.601Z · comments (7)
2025 Prediction Thread
habryka (habryka4) · 2024-12-30T01:50:14.216Z · comments (18)
The Packaging and the Payload
Screwtape · 2024-11-12T03:07:37.209Z · comments (1)
[question] What could a policy banning AGI look like?
TsviBT · 2024-03-13T14:19:07.783Z · answers+comments (23)
The 2023 LessWrong Review: The Basic Ask
Raemon · 2024-12-04T19:52:40.435Z · comments (25)
[link] Bengio's Alignment Proposal: "Towards a Cautious Scientist AI with Convergent Safety Bounds"
mattmacdermott · 2024-02-29T13:59:34.959Z · comments (19)
Coherence of Caches and Agents
johnswentworth · 2024-04-01T23:04:31.320Z · comments (9)
Value fragility and AI takeover
Joe Carlsmith (joekc) · 2024-08-05T21:28:07.306Z · comments (5)
Mid-conditional love
KatjaGrace · 2024-04-17T04:00:08.341Z · comments (21)
Secular Solstice Round Up 2024
dspeyer · 2024-11-21T10:49:36.682Z · comments (15)
What is malevolence? On the nature, measurement, and distribution of dark traits
David Althaus (wallowinmaya) · 2024-10-23T08:41:33.197Z · comments (15)
My guess at Conjecture's vision: triggering a narrative bifurcation
Alexandre Variengien (alexandre-variengien) · 2024-02-06T19:10:42.690Z · comments (12)
Brief analysis of OP Technical AI Safety Funding
22tom (thomas-barnes) · 2024-10-25T19:37:41.674Z · comments (5)
AISC9 has ended and there will be an AISC10
Linda Linsefors · 2024-04-29T10:53:18.812Z · comments (4)
[Intuitive self-models] 3. The Homunculus
Steven Byrnes (steve2152) · 2024-10-02T15:20:18.394Z · comments (36)
On the CrowdStrike Incident
Zvi · 2024-07-22T12:40:05.894Z · comments (14)
[link] Claude 3.5 Sonnet
Zach Stein-Perlman · 2024-06-20T18:00:35.443Z · comments (41)
🇫🇷 Announcing CeSIA: The French Center for AI Safety
Charbel-Raphaël (charbel-raphael-segerie) · 2024-12-20T14:17:13.104Z · comments (0)
[Intuitive self-models] 4. Trance
Steven Byrnes (steve2152) · 2024-10-08T13:30:41.446Z · comments (7)
[link] Video lectures on the learning-theoretic agenda
Vanessa Kosoy (vanessa-kosoy) · 2024-10-27T12:01:32.777Z · comments (0)
Why I quit effective altruism, and why Timothy Telleen-Lawton is staying (for now)
Elizabeth (pktechgirl) · 2024-10-22T18:20:01.194Z · comments (79)
Could randomly choosing people to serve as representatives lead to better government?
John Huang · 2024-10-21T17:10:20.920Z · comments (13)
Analogies between scaling labs and misaligned superintelligent AI
scasper · 2024-02-21T19:29:39.033Z · comments (5)
Vote on Anthropic Topics to Discuss
Ben Pace (Benito) · 2024-03-06T19:43:47.194Z · comments (55)
Counting AGIs
cash (cshunter) · 2024-11-26T00:06:17.845Z · comments (19)
Mistakes people make when thinking about units
Isaac King (KingSupernova) · 2024-06-25T03:39:20.138Z · comments (14)
MATS AI Safety Strategy Curriculum
Ronny Fernandez (ronny-fernandez) · 2024-03-07T19:59:37.434Z · comments (2)
SAE-VIS: Announcement Post
CallumMcDougall (TheMcDouglas) · 2024-03-31T15:30:49.079Z · comments (8)
The case for a negative alignment tax
Cameron Berg (cameron-berg) · 2024-09-18T18:33:18.491Z · comments (20)
Introducing Transluce — A Letter from the Founders
jsteinhardt · 2024-10-23T18:10:02.526Z · comments (2)
[link] MIRI's June 2024 Newsletter
Harlan · 2024-06-14T23:02:23.721Z · comments (20)
[link] Cost, Not Sacrifice
Joe Rogero · 2024-11-20T21:32:26.281Z · comments (13)
Human study on AI spear phishing campaigns
Simon Lermen (dalasnoin) · 2025-01-03T15:11:14.765Z · comments (8)
[question] Interest in Leetcode, but for Rationality?
Gregory (gregory-eales) · 2024-10-16T17:54:25.578Z · answers+comments (20)
(Not) Derailing the LessOnline Puzzle Hunt
Error · 2024-06-04T01:28:31.688Z · comments (2)
A Simple Toy Coherence Theorem
johnswentworth · 2024-08-02T17:47:50.642Z · comments (22)
Q&A on Proposed SB 1047
Zvi · 2024-05-02T15:10:02.916Z · comments (8)
Interpreting Preference Models w/ Sparse Autoencoders
Logan Riggs (elriggs) · 2024-07-01T21:35:40.603Z · comments (12)
“Artificial General Intelligence”: an extremely brief FAQ
Steven Byrnes (steve2152) · 2024-03-11T17:49:02.496Z · comments (6)
A Gentle Introduction to Risk Frameworks Beyond Forecasting
pendingsurvival · 2024-04-11T18:03:25.605Z · comments (10)
Do sparse autoencoders find "true features"?
Demian Till · 2024-02-22T18:06:59.630Z · comments (33)
On Dwarkesh’s Podcast with OpenAI’s John Schulman
Zvi · 2024-05-21T17:30:04.332Z · comments (4)
Joshua Achiam Public Statement Analysis
Zvi · 2024-10-10T12:50:06.285Z · comments (14)
The World in 2029
Nathan Young · 2024-03-02T18:03:29.368Z · comments (37)
The One and a Half Gemini
Zvi · 2024-02-22T13:10:04.725Z · comments (4)
Companies' safety plans neglect risks from scheming AI
Zach Stein-Perlman · 2024-06-03T15:00:20.236Z · comments (4)
[link] Nick Bostrom’s new book, “Deep Utopia”, is out today
PeterH · 2024-03-27T11:24:01.401Z · comments (5)
[link] A Narrow Path: a plan to deal with AI extinction risk
Andrea_Miotti (AndreaM) · 2024-10-07T13:02:15.229Z · comments (12)
AI for Bio: State Of The Field
sarahconstantin · 2024-08-30T18:00:02.187Z · comments (2)