LessWrong 2.0 Reader

OpenAI o1, Llama 4, and AlphaZero of LLMs
Vladimir_Nesov · 2024-09-14T21:27:41.241Z · comments (25)

[link] Is Deep Learning Actually Hitting a Wall? Evaluating Ilya Sutskever's Recent Claims
garrison · 2024-11-13T17:00:01.005Z · comments (14)

A shortcoming of concrete demonstrations as AGI risk advocacy
Steven Byrnes (steve2152) · 2024-12-11T16:48:41.602Z · comments (14)

Deep Causal Transcoding: A Framework for Mechanistically Eliciting Latent Behaviors in Language Models
Andrew Mack (andrew-mack) · 2024-12-03T21:19:42.333Z · comments (5)

AI #92: Behind the Curve
Zvi · 2024-11-28T14:40:05.448Z · comments (7)

[question] What are the good rationality films?
Ben Pace (Benito) · 2024-11-20T06:04:56.757Z · answers+comments (52)

AI #83: The Mask Comes Off
Zvi · 2024-09-26T12:00:08.689Z · comments (19)

Parable of the vanilla ice cream curse (and how it would prevent a car from starting!)
Mati_Roy (MathieuRoy) · 2024-12-08T06:57:45.783Z · comments (19)

Values Are Real Like Harry Potter
johnswentworth · 2024-10-09T23:42:24.724Z · comments (17)

How to prevent collusion when using untrusted models to monitor each other
Buck · 2024-09-25T18:58:20.693Z · comments (9)

[Intuitive self-models] 2. Conscious Awareness
Steven Byrnes (steve2152) · 2024-09-25T13:29:02.820Z · comments (48)

[link] Not every accommodation is a Curb Cut Effect: The Handicapped Parking Effect, the Clapper Effect, and more
Michael Cohn (michael-cohn) · 2024-09-15T05:27:36.691Z · comments (39)

[link] Gwern Branwen interview on Dwarkesh Patel’s podcast: “How an Anonymous Researcher Predicted AI's Trajectory”
Said Achmiz (SaidAchmiz) · 2024-11-14T23:53:34.922Z · comments (0)

Scaffolding for "Noticing Metacognition"
Raemon · 2024-10-09T17:54:13.657Z · comments (4)

Graceful Degradation
Screwtape · 2024-11-05T23:57:53.362Z · comments (8)

[link] Should you be worried about H5N1?
gw · 2024-12-05T21:11:06.996Z · comments (2)

Should there be just one western AGI project?
rosehadshar · 2024-12-03T10:11:17.914Z · comments (71)

[link] Gwern: Why So Few Matt Levines?
kave · 2024-10-29T01:07:27.564Z · comments (10)

[link] Is "superhuman" AI forecasting BS? Some experiments on the "539" bot from the Centre for AI Safety
titotal (lombertini) · 2024-09-18T13:07:40.754Z · comments (3)

Rationality Quotes - Fall 2024
Screwtape · 2024-10-10T18:37:55.013Z · comments (26)

Bitter lessons about lucid dreaming
avturchin · 2024-10-16T21:27:04.725Z · comments (62)

LLM chatbots have ~half of the kinds of "consciousness" that humans believe in. Humans should avoid going crazy about that.
Andrew_Critch · 2024-11-22T03:26:11.681Z · comments (53)

The Obliqueness Thesis
jessicata (jessica.liu.taylor) · 2024-09-19T00:26:30.677Z · comments (17)

My 10-year retrospective on trying SSRIs
Kaj_Sotala · 2024-09-22T20:30:02.483Z · comments (10)

The Packaging and the Payload
Screwtape · 2024-11-12T03:07:37.209Z · comments (1)

What is malevolence? On the nature, measurement, and distribution of dark traits
David Althaus (wallowinmaya) · 2024-10-23T08:41:33.197Z · comments (15)

Dentistry, Oral Surgeons, and the Inefficiency of Small Markets
GeneSmith · 2024-11-01T17:26:06.466Z · comments (16)

[Intuitive self-models] 4. Trance
Steven Byrnes (steve2152) · 2024-10-08T13:30:41.446Z · comments (7)

Could randomly choosing people to serve as representatives lead to better government?
John Huang · 2024-10-21T17:10:20.920Z · comments (13)

Why I quit effective altruism, and why Timothy Telleen-Lawton is staying (for now)
Elizabeth (pktechgirl) · 2024-10-22T18:20:01.194Z · comments (78)

[link] Video lectures on the learning-theoretic agenda
Vanessa Kosoy (vanessa-kosoy) · 2024-10-27T12:01:32.777Z · comments (0)

The case for a negative alignment tax
Cameron Berg (cameron-berg) · 2024-09-18T18:33:18.491Z · comments (20)

Introducing Transluce — A Letter from the Founders
jsteinhardt · 2024-10-23T18:10:02.526Z · comments (2)

[link] Cost, Not Sacrifice
Joe Rogero · 2024-11-20T21:32:26.281Z · comments (13)

[question] Interest in Leetcode, but for Rationality?
Gregory (gregory-eales) · 2024-10-16T17:54:25.578Z · answers+comments (20)

Joshua Achiam Public Statement Analysis
Zvi · 2024-10-10T12:50:06.285Z · comments (14)

[link] A Narrow Path: a plan to deal with AI extinction risk
Andrea_Miotti (AndreaM) · 2024-10-07T13:02:15.229Z · comments (11)

Counting AGIs
cash (cshunter) · 2024-11-26T00:06:17.845Z · comments (19)

The Mask Comes Off: At What Price?
Zvi · 2024-10-21T23:50:05.247Z · comments (16)

[link] [Paper] A is for Absorption: Studying Feature Splitting and Absorption in Sparse Autoencoders
chanind · 2024-09-25T09:31:03.296Z · comments (16)

[link] If far-UV is so great, why isn't it everywhere?
Austin Chen (austin-chen) · 2024-10-19T18:56:58.910Z · comments (23)

Automation collapse
Geoffrey Irving · 2024-10-21T14:50:54.500Z · comments (9)

The 2023 LessWrong Review: The Basic Ask
Raemon · 2024-12-04T19:52:40.435Z · comments (20)

[Intuitive self-models] 3. The Homunculus
Steven Byrnes (steve2152) · 2024-10-02T15:20:18.394Z · comments (36)

The King and the Golem - The Animation
Writer · 2024-11-08T18:23:10.935Z · comments (0)

[link] Peak Human Capital
PeterMcCluskey · 2024-09-30T21:13:30.421Z · comments (3)

[link] Investigating an insurance-for-AI startup
L Rudolf L (LRudL) · 2024-09-21T15:29:10.083Z · comments (0)

[link] "Map of AI Futures" - An interactive flowchart
swante · 2024-11-27T21:31:40.269Z · comments (3)

[link] New o1-like model (QwQ) beats Claude 3.5 Sonnet with only 32B parameters
Jesse Hoogland (jhoogland) · 2024-11-27T22:06:12.914Z · comments (4)

Estimating Tail Risk in Neural Networks
Mark Xu (mark-xu) · 2024-09-13T20:00:06.921Z · comments (9)

Recent comments

adamzerner on adamzerner's Shortform

I would like to see people write high-effort summaries, analyses and distillations of the posts in The Sequences.

When Eliezer wrote the original posts, he was writing one blog post a day for two years. Surely you could do a better job presenting the content that he produced in one day if you, say, took four months applying principles of pedagogy and iterating on it as a side project. I get the sense that more is possible.

This seems like a particularly good project for people who want to write but don't know what to write about. I've talked with a variety of people who are in that boat.

One issue with such distillation posts is discoverability. Maybe you write the post, it receives some upvotes, some people see it, and then it disappears into the ether. Ideally when someone in the future goes to read the corresponding sequence post they would be aware that your distillation post is available as a sort of sister content to the original content. LessWrong does have the "Mentioned in" section at the bottom of posts, but that doesn't feel like it is sufficient.

yams on yams's Shortform

What text analogizing LLMs to human brains have you found most compelling?

adamzerner on adamzerner's Shortform

I recently started going through some of Rationality from AI to Zombies again. A big reason why is the fact that there are audio recordings of the posts. It's easy to listen to a post or two as I walk my dog, or a handful of posts instead of some random hour-long podcast that I would otherwise listen to.

I originally read (most of) The Sequences maybe 13 or 14 years ago when I was in college. At various times since then I've made somewhat deliberate efforts to revisit them. Other times I've re-read random posts as opposed to larger collections of posts. Anyway, the point I want to make is that it's been a while.

I've been a little surprised in my feelings as I re-read them. Some of them feel notably less good than what I remember. Others blow my mind and are incredible.

The Mysterious Answers sequence is one that I felt disappointed by. I felt like the posts weren't very clear and that there wasn't much substance. I think the main overarching point of the sequence is that an explanation can't say that all outcomes are equally probable. It has to say that some outcomes are more probable than others. But that just seems kinda obvious.

I think it's quite plausible that there are "good" reasons why I felt disappointed as I re-read this and other sequences. Maybe there are important things that are going over my head. Or maybe I actually understand things too well now after hanging around this community for so long.

One post that hit me kinda hard that I really enjoyed after re-reading it was Rationality and the English Language, and then the follow-up post, Human Evil and Muddled Thinking. The posts helped me grok how powerful language can be.

If you really want an artist’s perspective on rationality, then read Orwell; he is mandatory reading for rationalists as well as authors. Orwell was not a scientist, but a writer; his tools were not numbers, but words; his adversary was not Nature, but human evil. If you wish to imprison people for years without trial, you must think of some other way to say it than “I’m going to imprison Mr. Jennings for years without trial.” You must muddy the listener’s thinking, prevent clear images from outraging conscience. You say, “Unreliable elements were subjected to an alternative justice process.”

I'm pretty sure that I read those posts before, along with a bunch of related posts and stuff, but for whatever reason the re-read still meaningfully improved my understanding of the concept.

alex-k-chen-parrot on Which skincare products are evidence-based?

Has anyone tried OneSkin? Does it actually do what it claims? It acts on a mechanism independent from tretinoin.

atillayasar on AtillaYasar's Shortform

Editability and findability --> higher quality over time

Editability

Code being easier to find and easier to edit, for example,

if it's in the same live environment where you're working, or if it's a simple hotkey away, or an alt-tab away to a config file which updates your settings without having to restart,

makes it more likely to be edited, more subject to "evolutionary pressures", to feedback loop dynamics.

Same applies to writing, or anything where you have connected objects that influence each other, where the "influencer node" is editable and visible.

configs : program layout / behavior
informal rules about your relationship with your friend : the dynamics of the relationship
layout of your desk : the way you work
underlying philosophy of ideas : writing ideas
ideas "going viral" in social media : people discussing them (think about Luigi Mangione triggering notions of killing people you don't like (this is bad!!), or Elon x Trump's doge thing having all sorts of people discussing efficiency of organizations and bureaucracy (this is amazing) )

(not sure if the last one is a good example)

Imagine if, when writing this Quick Take™, I had a side panel that on every keystroke pulled up related paragraphs from all my existing writings!
I could see past writings, which is cool, but I could also edit them way more easily (assuming a "jump to" feature). In the long term this yields many more edits, and a more polished and readable total volume of work.

Findability

If you can easily see the contents of something and go, "wait, this is dumb", then even if it's "far away" (say, you have to find it in the browser and do mouse clicks and scrolls), you'll still do it. What in fact determined your editing it is that the threshold for loading its contents into your mind had been lowered.
When you load it, the opinion is instantly triggered.

g-1 on Post-Quantum Investing: Dump Crypto for Index Funds and Real Estate?

I extrapolate faster, because experts were wrong about AGI "after 2050" and they were wrong about the explosive growth of Bitcoin. In general they are usually too conservative, so odds are experts will be wrong about quantum supremacy as well.

davidmanheim on Refuting Searle’s wall, Putnam’s rock, and Johnson’s popcorn

Good point. The problem I have with that is that in every listed example, the mapping either requires the execution of the conscious mind and a readout of its output and process in order to build it, or it stipulates that it is well enough understood that it can be mapped to an arbitrary process, thereby implicitly also requiring that it was run elsewhere.

nathan-helm-burger on A shortcoming of concrete demonstrations as AGI risk advocacy

Sure. At this point I agree that some people will be so foolish and stubborn that no demo will concern them. Indeed, some people fail to update even on actual events.

So now we are, as Zvi likes to say, 'talking price'. What proportion of key government decision-makers would be influenced by how persuasive (and costly) a demo?

We both agree that the correct amount of effort to put towards demos is somewhere between nearly all of our AI safety effort-resources, and nearly none. I think it's a good point that we should try to estimate how effective a demo is likely to be on some particular individual or group, and aim to neither over nor under invest in it.

ape-in-the-coat on Zombies! Substance Dualist Zombies?

I think we can use the same method Eliezer applied to the regular epiphenomenalist Zombie argument to deal with this weaker one.

Whether your mind interprets a certain colour in a certain way actually has causal effects on the world. Namely, things that appear beautiful to you in our world may not appear beautiful to your qualia-inverted counterpart. This naturally affects your behaviour: whether you look at a certain object more, whether you buy a certain object, and so on.

This is even more obvious for people with selective colour blindness. Suppose your mind is unable to distinguish between qualia of blueness and redness. And suppose there are three objects: A is red, B is blue and C is green. In our world you can't distinguish between objects A and B. But in the qualia-inverted world you wouldn't be able to distinguish between objects B and C.

And if you try to switch to the substance dualist version, all the reasoning from this post still stands.

eukaryote on Is being sexy for your homies?

Didn't like the post then, still don't like it in 2024. I think there are defensible points interwoven with assumptions and stereotypes.

First: it generalizes from personal experiences that are not universal. I think a lot of people don't have this or don't struggle with this or find it worth it, and the piece assumes everyone feels the way the author feels.

Second: the thing it describes is a bias, and I don't think the essay realizes this.

Okay, part of the thing is that this doesn't make a case or acknowledge this romantic factor as being different from, like, friendship. Like, in the people-at-work case, you might also do someone a favor at work because you like them as a buddy, which is not necessarily the same as whether they're a good worker or it's a strategic thing for you to do, or whatever - you're inclined to give your friends special treatment. Even in straight same-gender groups, people will end up being friends and having outgroups.

Anyway, you have to be careful reasoning out of "what your in-built stereotypes say". This is sometimes relevant information, totally. But A) your in-built stereotypes are not everyone else's in-built stereotypes, even within your culture, and B) this is reasoning from the map, not the territory. Are they true? In some of the cases given in this piece, it matters if they're true.

Like, the thing being described here is a bias, a flaw in the lens. "Having to navigate around possible sexual dynamics with other people makes it harder to do regular communication with them" is a thing that'll make you less able to reason and less effective. (Especially if it still fires strongly in cases like "this woman is at this event about an unrelated topic, with a partner, and so is probably not available for dating.") I don't begrudge the author for having it. I think it's really common. God knows my own best judgment has failed me before in the face of very pretty people.

But I like this community for usually not giving up on matters of self-improvement and epistemics. Even if you don't prioritize it, you're at least recognizing it and not throwing it out. It's very disconcerting to read "I notice my brain does extra work when I talk with women... wouldn't it be easier if society were radically altered so that I didn't have to talk with women?" Like, what? And there's no way you or anyone else can become more rational about this? This barrier to ideal communication with 50% of people is insurmountable? It's worth giving up on this one? Hello?

I get that the author views this as sort of a series of tenuous hypotheticals and doesn't necessarily stand by these stances and was just putting it out there, which is respectable. I think it's wrong and so tenuous as to be unhelpful.

Overall: bad takes, did have a solid 20 seconds of mixed fun and horror imagining this totally-unsexist society where straight men and women are kept in polite segregated groups, and 10% of people are in fringe situations - stable lesbian gay-male duos who must rely on each other, the bisexuals and the nonbinary people wandering the earth alone, the asexuals reigning supreme; incorruptible, masters of all domains.