LessWrong 2.0 Reader

Minimal Motivation of Natural Latents
johnswentworth · 2024-10-14T22:51:58.125Z · comments (14)
Motivation control
Joe Carlsmith (joekc) · 2024-10-30T17:15:50.881Z · comments (7)
~80 Interesting Questions about Foundation Model Agent Safety
RohanS · 2024-10-28T16:37:04.713Z · comments (4)
[link] The Deep Lore of LightHaven, with Oliver Habryka (TBC episode 228)
Eneasz · 2024-12-24T22:45:50.065Z · comments (4)
Start an Upper-Room UV Installation Company?
jefftk (jkaufman) · 2024-10-19T02:00:10.691Z · comments (9)
[link] Literacy Rates Haven't Fallen By 20% Since the Department of Education Was Created
Maxwell Tabarrok (maxwell-tabarrok) · 2024-11-22T20:53:59.007Z · comments (0)
AI #97: 4
Zvi · 2025-01-02T14:10:06.505Z · comments (4)
[link] Conjecture: A Roadmap for Cognitive Software and A Humanist Future of AI
Connor Leahy (NPCollapse) · 2024-12-02T13:28:57.977Z · comments (10)
Monthly Roundup #24: November 2024
Zvi · 2024-11-18T13:20:06.086Z · comments (14)
[link] The Choice Transition
owencb · 2024-11-18T12:30:56.198Z · comments (4)
Preppers Are Too Negative on Objects
jefftk (jkaufman) · 2024-12-18T02:30:01.854Z · comments (2)
Open Thread Fall 2024
habryka (habryka4) · 2024-10-05T22:28:50.398Z · comments (186)
[link] Preference Inversion
Benquo · 2025-01-02T18:15:52.938Z · comments (35)
[link] Dangerous capability tests should be harder
LucaRighetti (Error404Dinosaur) · 2024-11-21T17:20:50.610Z · comments (3)
Claude's Constitutional Consequentialism?
1a3orn · 2024-12-19T19:53:33.254Z · comments (6)
[link] Review: Good Strategy, Bad Strategy
L Rudolf L (LRudL) · 2024-12-21T17:17:04.342Z · comments (0)
[link] Began a pay-on-results coaching experiment, made $40,300 since July
Chipmonk · 2024-12-29T21:12:02.574Z · comments (14)
MATS AI Safety Strategy Curriculum v2
DanielFilan · 2024-10-07T22:44:06.396Z · comments (6)
Practicing Bayesian Epistemology with "Two Boys" Probability Puzzles
Liron · 2025-01-02T04:42:20.362Z · comments (13)
Time Efficient Resistance Training
romeostevensit · 2024-10-07T15:15:44.950Z · comments (10)
[link] Two interviews with the founder of DeepSeek
Cosmia_Nebula · 2024-11-29T03:18:47.246Z · comments (1)
[link] IAPS: Mapping Technical Safety Research at AI Companies
Zach Stein-Perlman · 2024-10-24T20:30:41.159Z · comments (13)
AI #89: Trump Card
Zvi · 2024-11-07T16:30:05.684Z · comments (12)
Startup Success Rates Are So Low Because the Rewards Are So Large
AppliedDivinityStudies (kohaku-none) · 2024-10-10T20:22:01.557Z · comments (6)
D&D Sci Coliseum: Arena of Data
aphyer · 2024-10-18T22:02:54.305Z · comments (23)
Causal Undertow: A Work of Seed Fiction
Daniel Murfet (dmurfet) · 2024-12-08T21:41:48.132Z · comments (0)
AXRP Episode 39 - Evan Hubinger on Model Organisms of Misalignment
DanielFilan · 2024-12-01T06:00:06.345Z · comments (0)
DunCon @Lighthaven
Duncan Sabien (Deactivated) (Duncan_Sabien) · 2024-09-29T04:56:27.205Z · comments (0)
Reflections on the Metastrategies Workshop
gw · 2024-10-24T18:30:46.255Z · comments (5)
ARENA 4.0 Impact Report
Chloe Li (chloe-li-1) · 2024-11-27T20:51:54.844Z · comments (3)
Are we dropping the ball on Recommendation AIs?
Charbel-Raphaël (charbel-raphael-segerie) · 2024-10-23T17:48:00.000Z · comments (17)
[link] A car journey with conservative evangelicals - Understanding some British political-religious beliefs
Nathan Young · 2024-12-06T11:22:45.563Z · comments (8)
Trying to translate when people talk past each other
Kaj_Sotala · 2024-12-17T09:40:02.640Z · comments (12)
[link] Point of Failure: Semiconductor-Grade Quartz
Annapurna (jorge-velez) · 2024-09-30T15:57:40.495Z · comments (8)
[link] Intrinsic Power-Seeking: AI Might Seek Power for Power’s Sake
TurnTrout · 2024-11-19T18:36:20.721Z · comments (5)
[link] College technical AI safety hackathon retrospective - Georgia Tech
yix (Yixiong Hao) · 2024-11-15T00:22:53.159Z · comments (2)
2025 Color Trends
sarahconstantin · 2024-10-07T21:20:03.962Z · comments (7)
[question] What are the most interesting / challenging evals (for humans) available?
Raemon · 2024-12-27T03:05:26.831Z · answers+comments (13)
How to use bright light to improve your life.
Nat Martin (nat-martin) · 2024-11-18T19:32:10.667Z · comments (10)
My January alignment theory Nanowrimo
Dmitry Vaintrob (dmitry-vaintrob) · 2025-01-02T00:07:24.050Z · comments (2)
[link] Alignment Is Not All You Need
Adam Jones (domdomegg) · 2025-01-02T17:50:00.486Z · comments (10)
Winners of the Essay competition on the Automation of Wisdom and Philosophy
owencb · 2024-10-28T17:10:04.272Z · comments (3)
Monthly Roundup #23: October 2024
Zvi · 2024-10-16T13:50:05.869Z · comments (13)
Analysis of Global AI Governance Strategies
Sammy Martin (SDM) · 2024-12-04T10:45:25.311Z · comments (10)
Open Source Replication of Anthropic’s Crosscoder paper for model-diffing
Connor Kissane (ckkissane) · 2024-10-27T18:46:21.316Z · comments (4)
What happens next?
Logan Zoellner (logan-zoellner) · 2024-12-29T01:41:33.685Z · comments (19)
[link] FrontierMath: A Benchmark for Evaluating Advanced Mathematical Reasoning in AI
Tamay · 2024-11-14T06:13:22.042Z · comments (0)
Signaling with Small Orange Diamonds
jefftk (jkaufman) · 2024-11-07T20:20:08.026Z · comments (1)
[question] Are You More Real If You're Really Forgetful?
Thane Ruthenis · 2024-11-24T19:30:55.233Z · answers+comments (25)
0.202 Bits of Evidence In Favor of Futarchy
niplav · 2024-09-29T21:57:59.896Z · comments (0)