LessWrong 2.0 Reader

Skills from a year of Purposeful Rationality Practice
Raemon · 2024-09-18T02:05:58.726Z · comments (18)

[link] Why I’m not a Bayesian
Richard_Ngo (ricraz) · 2024-10-06T15:22:45.644Z · comments (92)

Why Would Belief-States Have A Fractal Structure, And Why Would That Matter For Interpretability? An Explainer
johnswentworth · 2024-04-18T00:27:43.451Z · comments (21)

Humming is not a free $100 bill
Elizabeth (pktechgirl) · 2024-06-06T20:10:02.457Z · comments (6)

Introducing Alignment Stress-Testing at Anthropic
evhub · 2024-01-12T23:51:25.875Z · comments (23)

Safety consultations for AI lab employees
Zach Stein-Perlman · 2024-07-27T15:00:27.276Z · comments (4)

"Humanity vs. AGI" Will Never Look Like "Humanity vs. AGI" to Humanity
Thane Ruthenis · 2023-12-16T20:08:39.375Z · comments (34)

Information vs Assurance
johnswentworth · 2024-10-20T23:16:25.762Z · comments (17)

Contra papers claiming superhuman AI forecasting
nikos (followtheargument) · 2024-09-12T18:10:50.582Z · comments (16)

[link] Toward a Broader Conception of Adverse Selection
Ricki Heicklen (bayesshammai) · 2024-03-14T22:40:57.920Z · comments (61)

Every "Every Bay Area House Party" Bay Area House Party
Richard_Ngo (ricraz) · 2024-02-16T18:53:28.567Z · comments (6)

[question] Why is o1 so deceptive?
abramdemski · 2024-09-27T17:27:35.439Z · answers+comments (24)

[link] FHI (Future of Humanity Institute) has shut down (2005–2024)
gwern · 2024-04-17T13:54:16.791Z · comments (22)

Struggling like a Shadowmoth
Raemon · 2024-09-24T00:47:05.030Z · comments (38)

WTH is Cerebrolysin, actually?
gsfitzgerald (neuroplume) · 2024-08-06T20:40:53.378Z · comments (23)

This is already your second chance
Malmesbury (Elmer of Malmesbury) · 2024-07-28T17:13:57.680Z · comments (13)

Effective Aspersions: How the Nonlinear Investigation Went Wrong
TracingWoodgrains (tracingwoodgrains) · 2023-12-19T12:00:23.529Z · comments (170)

Timaeus's First Four Months
Jesse Hoogland (jhoogland) · 2024-02-28T17:01:53.437Z · comments (6)

Critical review of Christiano's disagreements with Yudkowsky
Vanessa Kosoy (vanessa-kosoy) · 2023-12-27T16:02:50.499Z · comments (40)

'Empiricism!' as Anti-Epistemology
Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2024-03-14T02:02:59.723Z · comments (90)

Did Christopher Hitchens change his mind about waterboarding?
Isaac King (KingSupernova) · 2024-09-15T08:28:09.451Z · comments (22)

Three Subtle Examples of Data Leakage
abstractapplic · 2024-10-01T20:45:27.731Z · comments (16)

[link] Recommendation: reports on the search for missing hiker Bill Ewasko
eukaryote · 2024-07-31T22:15:03.174Z · comments (28)

My motivation and theory of change for working in AI healthtech
Andrew_Critch · 2024-10-12T00:36:30.925Z · comments (37)

Reconsider the anti-cavity bacteria if you are Asian
Lao Mein (derpherpize) · 2024-04-15T07:02:02.655Z · comments (43)

The 'Neglected Approaches' Approach: AE Studio's Alignment Agenda
Cameron Berg (cameron-berg) · 2023-12-18T20:35:01.569Z · comments (21)

Many arguments for AI x-risk are wrong
TurnTrout · 2024-03-05T02:31:00.990Z · comments (86)

Is being sexy for your homies?
Valentine · 2023-12-13T20:37:02.043Z · comments (94)

The Median Researcher Problem
johnswentworth · 2024-11-02T20:16:11.341Z · comments (69)

[link] Overcoming Bias Anthology
Arjun Panickssery (arjun-panickssery) · 2024-10-20T02:01:23.463Z · comments (14)

[link] Boycott OpenAI
PeterMcCluskey · 2024-06-18T19:52:42.854Z · comments (26)

Announcing ILIAD — Theoretical AI Alignment Conference
Nora_Ammann · 2024-06-05T09:37:39.546Z · comments (18)

[link] Sycophancy to subterfuge: Investigating reward tampering in large language models
Carson Denison (carson-denison) · 2024-06-17T18:41:31.090Z · comments (22)

You can remove GPT2’s LayerNorm by fine-tuning for an hour
StefanHex (Stefan42) · 2024-08-08T18:33:38.803Z · comments (11)

o1 is a bad idea
abramdemski · 2024-11-11T21:20:24.892Z · comments (38)

Without fundamental advances, misalignment and catastrophe are the default outcomes of training powerful AI
Jeremy Gillen (jeremy-gillen) · 2024-01-26T07:22:06.370Z · comments (60)

[link] Masterpiece
Richard_Ngo (ricraz) · 2024-02-13T23:10:35.376Z · comments (21)

[link] Connecting the Dots: LLMs can Infer & Verbalize Latent Structure from Training Data
Johannes Treutlein (Johannes_Treutlein) · 2024-06-21T15:54:41.430Z · comments (13)

And All the Shoggoths Merely Players
Zack_M_Davis · 2024-02-10T19:56:59.513Z · comments (57)

The Summoned Heroine's Prediction Markets Keep Providing Financial Services To The Demon King!
abstractapplic · 2024-10-26T12:34:51.059Z · comments (16)

[link] Succession
Richard_Ngo (ricraz) · 2023-12-20T19:25:03.185Z · comments (48)

DeepMind's "Frontier Safety Framework" is weak and unambitious
Zach Stein-Perlman · 2024-05-18T03:00:13.541Z · comments (14)

[link] Making every researcher seek grants is a broken model
jasoncrawford · 2024-01-26T16:06:26.688Z · comments (41)

Most People Don't Realize We Have No Idea How Our AIs Work
Thane Ruthenis · 2023-12-21T20:02:00.360Z · comments (42)

Language Models Model Us
eggsyntax · 2024-05-17T21:00:34.821Z · comments (55)

What’s up with LLMs representing XORs of arbitrary features?
Sam Marks (samuel-marks) · 2024-01-03T19:44:33.162Z · comments (61)

EIS XIII: Reflections on Anthropic’s SAE Research Circa May 2024
scasper · 2024-05-21T20:15:36.502Z · comments (16)

Neutrality
sarahconstantin · 2024-11-13T23:10:05.469Z · comments (27)

Deep Honesty
Aletheophile (aletheo) · 2024-05-07T20:31:48.734Z · comments (25)

Formal verification, heuristic explanations and surprise accounting
Jacob_Hilton · 2024-06-25T15:40:03.535Z · comments (11)

Recent comments

adamzerner on adamzerner's Shortform

I would like to see people write high-effort summaries, analyses and distillations of the posts in The Sequences.

When Eliezer wrote the original posts, he was writing one blog post a day for two years. Surely you could do a better job presenting the content that he produced in one day if you, say, took four months applying principles of pedagogy and iterating on it as a side project. I get the sense that more is possible.

This seems like a particularly good project for people who want to write but don't know what to write about. I've talked with a variety of people who are in that boat.

One issue with such distillation posts is discoverability. Maybe you write the post, it receives some upvotes, some people see it, and then it disappears into the ether. Ideally when someone in the future goes to read the corresponding sequence post they would be aware that your distillation post is available as a sort of sister content to the original content. LessWrong does have the "Mentioned in" section at the bottom of posts, but that doesn't feel like it is sufficient.

yams on yams's Shortform

What text analogizing LLMs to human brains have you found most compelling?

adamzerner on adamzerner's Shortform

I recently started going through some of Rationality from AI to Zombies again. A big reason why is the fact that there are audio recordings of the posts. It's easy to listen to a post or two as I walk my dog, or a handful of posts instead of some random hour-long podcast that I would otherwise listen to.

I originally read (most of) The Sequences maybe 13 or 14 years ago when I was in college. At various times since then I've made somewhat deliberate efforts to revisit them. Other times I've re-read random posts as opposed to larger collections of posts. Anyway, the point I want to make is that it's been a while.

I've been a little surprised in my feelings as I re-read them. Some of them feel notably less good than what I remember. Others blow my mind and are incredible.

The Mysterious Answers sequence is one that I felt disappointed by. I felt like the posts weren't very clear and that there wasn't much substance. I think the main overarching point of the sequence is that an explanation can't say that all outcomes are equally probable. It has to say that some outcomes are more probable than others. But that just seems kinda obvious.

I think it's quite plausible that there are "good" reasons why I felt disappointed as I re-read this and other sequences. Maybe there are important things that are going over my head. Or maybe I actually understand things too well now after hanging around this community for so long.

One post that hit me kinda hard that I really enjoyed after re-reading it was Rationality and the English Language, and then the follow-up post, Human Evil and Muddled Thinking. The posts helped me grok how powerful language can be.

If you really want an artist’s perspective on rationality, then read Orwell; he is mandatory reading for rationalists as well as authors. Orwell was not a scientist, but a writer; his tools were not numbers, but words; his adversary was not Nature, but human evil. If you wish to imprison people for years without trial, you must think of some other way to say it than “I’m going to imprison Mr. Jennings for years without trial.” You must muddy the listener’s thinking, prevent clear images from outraging conscience. You say, “Unreliable elements were subjected to an alternative justice process.”

I'm pretty sure that I read those posts before, along with a bunch of related posts and stuff, but for whatever reason the re-read still meaningfully improved my understanding of the concept.

alex-k-chen-parrot on Which skincare products are evidence-based?

Has anyone tried OneSkin? Does it actually do what it claims? It acts on a mechanism independent from tretinoin.

atillayasar on AtillaYasar's Shortform

Editability and findability --> higher quality over time

Editability

Code being easier to find and easier to edit, for example,

if it's in the same live environment where you're working, or if it's a simple hotkey away, or an alt-tab away to a config file which updates your settings without having to restart,

makes it more likely to be edited, more subject to "evolutionary pressures", to feedback loop dynamics.
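A minimal sketch of the config-file case, assuming a JSON settings file and a hypothetical `ReloadingConfig` class (neither is from the original comment): the program checks the file's modification time on each lookup and re-reads the file when it has changed, so an edit takes effect without a restart.

```python
import json
import os


class ReloadingConfig:
    """Hypothetical helper: re-reads a JSON config file when its mtime changes."""

    def __init__(self, path):
        self.path = path
        self._mtime_ns = None  # modification time of the version currently loaded
        self._settings = {}

    def get(self, key, default=None):
        """Return a setting, reloading the file first if it was edited."""
        self._refresh_if_changed()
        return self._settings.get(key, default)

    def _refresh_if_changed(self):
        # Compare the file's current modification time with the one we loaded.
        mtime_ns = os.stat(self.path).st_mtime_ns
        if mtime_ns != self._mtime_ns:
            with open(self.path, encoding="utf-8") as handle:
                self._settings = json.load(handle)
            self._mtime_ns = mtime_ns
```

With something like this, saving a change to the settings file in another window changes what `cfg.get("theme")` returns on the next call, which is the kind of short feedback loop described above.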

Same applies to writing, or anything where you have connected objects that influence each other, where the "influencer node" is editable and visible.

configs : program layout / behavior
informal rules about your relationship with your friend : the dynamics of the relationship
layout of your desk : the way you work
underlying philosophy of ideas : writing ideas
ideas "going viral" in social media : people discussing them (think about Luigi Mangione triggering notions of killing people you don't like (this is bad!!), or Elon x Trump's doge thing having all sorts of people discussing efficiency of organizations and bureaucracy (this is amazing) )

(not sure if the last one is a good example)

Imagine if, when writing this Quick Take™, I had a side panel that, on every keystroke, pulled up related paragraphs from all my existing writings!
Not only could I see past writings, which is cool, but I could also edit them way more easily (assuming a "jump to" feature). In the long term this yields many more edits, and a more polished and readable total volume of work.

Findability

If you can easily see the contents of something and go, "wait, this is dumb", then even if it's "far away" (say you have to find it in the browser and do mouse clicks and scrolls), you'll still do it. What in fact determined whether you edited it is that the threshold for loading its contents into your mind had been lowered.
When you load it, the opinion is instantly triggered.

g-1 on Post-Quantum Investing: Dump Crypto for Index Funds and Real Estate?

I extrapolate a faster timeline, because experts were wrong about AGI arriving "after 2050" and they failed to predict the explosive growth of Bitcoin. In general they are usually too conservative, so odds are experts will be wrong about quantum supremacy as well.

davidmanheim on Refuting Searle’s wall, Putnam’s rock, and Johnson’s popcorn

Good point. The problem I have with that is that in every listed example, the mapping either requires the execution of the conscious mind and a readout of its output and process in order to build it, or it stipulates that it is well enough understood that it can be mapped to an arbitrary process, thereby implicitly also requiring that it was run elsewhere.

nathan-helm-burger on A shortcoming of concrete demonstrations as AGI risk advocacy

Sure. At this point I agree that some people will be so foolish and stubborn that no demo will concern them. Indeed, some people fail to update even on actual events.

So now we are, as Zvi likes to say, 'talking price'. What proportion of key government decision-makers would be influenced by how persuasive (and costly) a demo?

We both agree that the correct amount of effort to put towards demos is somewhere between nearly all of our AI safety effort-resources, and nearly none. I think it's a good point that we should try to estimate how effective a demo is likely to be on some particular individual or group, and aim to neither over- nor under-invest in it.

ape-in-the-coat on Zombies! Substance Dualist Zombies?

I think we can use the same method Eliezer applied to the regular epiphenomenalist Zombie argument to deal with this weaker one.

Whether your mind interprets a certain colour in a certain way actually has causal effects on the world. Namely, things that appear beautiful to you in our world may not appear beautiful to your qualia-inverted counterpart. This naturally affects your behaviour: whether you look at a certain object more, whether you buy a certain object and so on.

This is even more obvious for people with selective colour blindness. Suppose your mind is unable to distinguish between qualia of blueness and redness. And suppose there are three objects: A is red, B is blue and C is green. In our world you can't distinguish between objects A and B. But in the qualia-inverted world you wouldn't be able to distinguish between objects B and C.
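As a rough sketch of the colour-blindness example, here is a small Python model. The object names A, B and C come from the comment. Everything else is made up for illustration: the dictionaries, the function name, and in particular the assumption that the inversion swaps red and green experiences while leaving blue alone.

```python
# Physical colours of the three objects, as given in the example.
OBJECTS = {"A": "red", "B": "blue", "C": "green"}

# ASSUMPTION: the inversion swaps red and green experiences; blue is unchanged.
INVERSION = {"red": "green", "green": "red", "blue": "blue"}

# In our world each colour is experienced as itself.
IDENTITY = {colour: colour for colour in INVERSION}

# The two experiences (qualia) this observer cannot tell apart.
CONFUSED_QUALIA = frozenset({"red", "blue"})


def indistinguishable_pairs(objects, colour_to_quale):
    """Return pairs of objects whose experienced colours both fall in CONFUSED_QUALIA."""
    names = sorted(objects)
    pairs = []
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            qualia = {
                colour_to_quale[objects[first]],
                colour_to_quale[objects[second]],
            }
            if qualia <= CONFUSED_QUALIA:
                pairs.append((first, second))
    return pairs


our_world = indistinguishable_pairs(OBJECTS, IDENTITY)
inverted_world = indistinguishable_pairs(OBJECTS, INVERSION)
print(our_world)       # [('A', 'B')]
print(inverted_world)  # [('B', 'C')]
```

Under that assumed inversion, the set of objects the observer confuses changes from A and B to B and C, so the two worlds differ in observable behaviour and not only in private experience.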

And if you try to switch to the substance dualist version, all the reasoning from this post still stands.

eukaryote on Is being sexy for your homies?

Didn't like the post then, still don't like it in 2024. I think there are defensible points interwoven with assumptions and stereotypes.

First: generalizes from personal experiences that are not universal. I think a lot of people don't have this or don't struggle with this or find it worth it, and the piece assumes everyone feels the way the author feels.

Second: the thing it describes is a bias, and I don't think the essay realizes this.

Okay, part of the thing is that this doesn't make a case or acknowledge this romantic factor as being different from, like, friendship. Like, in the people-at-work case, you might also do someone a favor at work because you like them as a buddy, which is not necessarily the same as whether they're a good worker or it's a strategic thing for you to do, or whatever - you're inclined to give your friends special treatment. Even in straight same-gender groups, people will end up being friends and having outgroups.

Anyway, you have to be careful reasoning out of "what your in-built stereotypes say". This is sometimes relevant information, totally. But A) your in-built stereotypes are not everyone else's in-built stereotypes, even within your culture, and B) this is reasoning from the map, not the territory. Are they true? In some of the cases given in this piece, it matters if they're true.

Like, the thing being described here is a bias, a flaw in the lens. "Having to navigate around possible sexual dynamics with other people makes it harder to do regular communication with them" is a thing that'll make you less able to reason and less effective. (Especially if it still fires strongly in cases like "this woman is at this event about an unrelated topic, with a partner, and so is probably not available for dating.") I don't begrudge the author for having it. I think it's really common. God knows my own best judgment has failed me before in the face of very pretty people.

But I like this community for usually not giving up on matters of self-improvement and epistemics. Even if you don't prioritize it, you're at least recognizing it and not throwing it out. It's very disconcerting to read "I notice my brain does extra work when I talk with women... wouldn't it be easier if society were radically altered so that I didn't have to talk with women?" Like, what? And there's no way you or anyone else can become more rational about this? This barrier to ideal communication with 50% of people is insurmountable? It's worth giving up on this one? Hello?

I get that the author views this as sort of a series of tenuous hypotheticals and doesn't necessarily stand by these stances and was just putting it out there, which is respectable. I think it's wrong and so tenuous as to be unhelpful.

Overall: bad takes, did have a solid 20 seconds of mixed fun and horror imagining this totally-unsexist society where straight men and women are kept in polite segregated groups, and 10% of people are in fringe situations - stable lesbian gay-male duos who must rely on each other, the bisexuals and the nonbinary people wandering the earth alone, the asexuals reigning supreme; incorruptible, masters of all domains.