LessWrong 2.0 Reader

Why Don't We Just... Shoggoth+Face+Paraphraser?
Daniel Kokotajlo (daniel-kokotajlo) · 2024-11-19T20:53:52.084Z · comments (57)

Passages I Highlighted in The Letters of J.R.R.Tolkien
Ivan Vendrov (ivan-vendrov) · 2024-11-25T01:47:59.071Z · comments (38)

Planning for Extreme AI Risks
joshc (joshua-clymer) · 2025-01-29T18:33:14.844Z · comments (4)

What Indicators Should We Watch to Disambiguate AGI Timelines?
snewman · 2025-01-06T19:57:43.398Z · comments (57)

[link] China Hawks are Manufacturing an AI Arms Race
garrison · 2024-11-20T18:17:51.958Z · comments (43)

My experience using financial commitments to overcome akrasia
William Howard (william-howard) · 2024-04-15T22:57:32.574Z · comments (33)

[Completed] The 2024 Petrov Day Scenario
Ben Pace (Benito) · 2024-09-26T08:08:32.495Z · comments (114)

[Fiction] [Comic] Effective Altruism and Rationality meet at a Secular Solstice afterparty
tandem · 2025-01-07T19:11:21.238Z · comments (5)

Read the Roon
Zvi · 2024-03-05T13:50:04.967Z · comments (6)

[link] A computational no-coincidence principle
Eric Neyman (UnexpectedValues) · 2025-02-14T21:39:39.277Z · comments (29)

Anomalous Tokens in DeepSeek-V3 and r1
henry (henry-bass) · 2025-01-25T22:55:41.232Z · comments (2)

An Extremely Opinionated Annotated List of My Favourite Mechanistic Interpretability Papers v2
Neel Nanda (neel-nanda-1) · 2024-07-07T17:39:35.064Z · comments (16)

The Worst Form Of Government (Except For Everything Else We've Tried)
johnswentworth · 2024-03-17T18:11:38.374Z · comments (47)

On saying "Thank you" instead of "I'm Sorry"
Michael Cohn (michael-cohn) · 2024-07-08T03:13:50.663Z · comments (16)

How it All Went Down: The Puzzle Hunt that took us way, way Less Online
A* (agendra) · 2024-06-02T08:01:40.109Z · comments (5)

Limitations on Formal Verification for AI Safety
Andrew Dickson · 2024-08-19T23:03:52.706Z · comments (60)

Hire (or Become) a Thinking Assistant
Raemon · 2024-12-23T03:58:42.061Z · comments (47)

Loving a world you don’t trust
Joe Carlsmith (joekc) · 2024-06-18T19:31:36.581Z · comments (13)

[link] Simple probes can catch sleeper agents
Monte M (montemac) · 2024-04-23T21:10:47.784Z · comments (21)

[link] "AI achieves silver-medal standard solving International Mathematical Olympiad problems"
gjm · 2024-07-25T15:58:57.638Z · comments (38)

Parasites (not a metaphor)
lemonhope (lcmgcd) · 2024-08-08T20:07:13.593Z · comments (19)

A Dozen Ways to Get More Dakka
Davidmanheim · 2024-04-08T04:45:19.427Z · comments (11)

My simple AGI investment & insurance strategy
lc · 2024-03-31T02:51:53.479Z · comments (27)

Why I don't believe in the placebo effect
transhumanist_atom_understander · 2024-06-10T02:37:07.776Z · comments (22)

Ten people on the inside
Buck · 2025-01-28T16:41:22.990Z · comments (27)

[link] Training on Documents About Reward Hacking Induces Reward Hacking
evhub · 2025-01-21T21:32:24.691Z · comments (13)

[question] Which things were you surprised to learn are not metaphors?
Eric Neyman (UnexpectedValues) · 2024-11-21T18:56:18.025Z · answers+comments (88)

[link] "Can AI Scaling Continue Through 2030?", Epoch AI (yes)
gwern · 2024-08-24T01:40:32.929Z · comments (4)

Near-mode thinking on AI
Olli Järviniemi (jarviniemi) · 2024-08-04T20:47:28.085Z · comments (8)

The Paris AI Anti-Safety Summit
Zvi · 2025-02-12T14:00:07.383Z · comments (21)

Circuits in Superposition: Compressing many small neural networks into one
Lucius Bushnaq (Lblack) · 2024-10-14T13:06:14.596Z · comments (8)

How I started believing religion might actually matter for rationality and moral philosophy
zhukeepa · 2024-08-23T17:40:47.341Z · comments (41)

Community Notes by X
NicholasKees (nick_kees) · 2024-03-18T17:13:33.195Z · comments (15)

[link] Parkinson's Law and the Ideology of Statistics
Benquo · 2025-01-04T15:49:21.247Z · comments (7)

Building AI Research Fleets
Ben Goldhaber (bgold) · 2025-01-12T18:23:09.682Z · comments (11)

Tell me about yourself: LLMs are aware of their learned behaviors
Martín Soto (martinsq) · 2025-01-22T00:47:15.023Z · comments (5)

Pantheon Interface
NicholasKees (nick_kees) · 2024-07-08T19:03:51.681Z · comments (22)

Gradual Disempowerment, Shell Games and Flinches
Jan_Kulveit · 2025-02-02T14:47:53.404Z · comments (35)

[link] The Failed Strategy of Artificial Intelligence Doomers
Ben Pace (Benito) · 2025-01-31T18:56:06.784Z · comments (73)

"The Solomonoff Prior is Malign" is a special case of a simpler argument
David Matolcsi (matolcsid) · 2024-11-17T21:32:34.711Z · comments (44)

Human takeover might be worse than AI takeover
Tom Davidson (tom-davidson-1) · 2025-01-10T16:53:27.043Z · comments (54)

[link] OpenAI's CBRN tests seem unclear
LucaRighetti (Error404Dinosaur) · 2024-11-21T17:28:30.290Z · comments (6)

Some articles in “International Security” that I enjoyed
Buck · 2025-01-31T16:23:27.061Z · comments (7)

Awakening
lsusr · 2024-05-30T07:03:00.821Z · comments (79)

[question] What do coherence arguments actually prove about agentic behavior?
[deleted] · 2024-06-01T09:37:28.451Z · answers+comments (37)

BIG-Bench Canary Contamination in GPT-4
Jozdien · 2024-10-22T15:40:48.166Z · comments (14)

[link] Investigating the Chart of the Century: Why is food so expensive?
Maxwell Tabarrok (maxwell-tabarrok) · 2024-08-16T13:21:23.596Z · comments (26)

Do you believe in hundred dollar bills lying on the ground? Consider humming
Elizabeth (pktechgirl) · 2024-05-16T00:00:05.257Z · comments (22)

[link] My Number 1 Epistemology Book Recommendation: Inventing Temperature
adamShimi · 2024-09-08T14:30:40.456Z · comments (18)

[question] Which skincare products are evidence-based?
Vanessa Kosoy (vanessa-kosoy) · 2024-05-02T15:22:12.597Z · answers+comments (48)

Recent comments

davidmanheim on Alignment can be the ‘clean energy’ of AI

The load-bearing assumption here seems to be that current methods won't produce unaligned superintelligent systems soon enough for it to matter.

This seems false, and at the very least should be argued explicitly.

mitchell_porter on Make Superintelligence Loving

asking it to think long term instead could change the fate of our species

If it's superintelligent, it has already thought more deeply about the long term than any human ever has.

knight-lee on Power Lies Trembling: a three-book review

I think society is very inconsistent about AI risk because the "Schelling point" is that people feel free to believe in a sizable probability of extinction from AI without looking crazy, but nobody dares argue for the massive sacrifices (spending or regulation or diplomacy) which actually fit those probabilities.

The best guess of basically every group of people is that there is a 2%-12% chance that AI will cause a catastrophe (killing 10% of people). At these probabilities, AI safety should be an equal priority to the military!

Yet at the same time, nobody is doing anything about it, because they all observe everyone else doing nothing about it. Each person thinks the reason that "everyone else" is doing nothing is that they figured out good reasons to ignore AI risk. But the truth is that "everyone else" is doing nothing for the same reason that they are doing nothing. Everyone else is just following everyone else.

This "everyone following everyone else" inertia is very slowly changing, as governments start giving a bit of lip-service and small amounts of funding to organizations which are half working on AI Notkilleveryoneism. But this kind of change is slow and tends to take decades. Meanwhile many AGI timelines are less than one decade.

thane-ruthenis on The Sorry State of AI X-Risk Advocacy, and Thoughts on Doing Better

It’s the backfire supposition that is baseless

There is a potential instinctive association of protests with violence, riots, vandalism, obstruction of public property, crackpots/conspiracy theorists, et cetera. I don't think it's baseless to worry whether this association is strong enough, in a median person's mind, for any protest towards an unknown cause to be instinctively associated with said negative things, with this first impression then lingering.

Anti-technology protests, in particular, might have an association with Unabomber-style terrorism, and certainly the AGI labs will be eager to reinforce this association. Protests therefore make your cause/movement uniquely vulnerable to this type of attack (via corresponding biased newspaper coverage). The marginal increase in visibility does not necessarily offset it.

It doesn't seem obvious to me whether the net effects are positive or negative. Do you have theoretical or empirical support for the effects being positive?

Small protests are the only way to get to big protests

I don't think so, and I'm not even saying small protests are bad. I'm saying small protests might be bad without the appropriate groundwork.

genesmith on How to Make Superbabies

Thanks for catching that! I hadn't heard. I will probably have to rewrite that section of the post.

What's your impression of the general finding that many autoimmune variants increase protection against ancient plagues?

mateusz-baginski on shortplav

It's totally possible that I'm seeing faces in the clouds, but there seems to be a non-trivial relationship between these two glitch tokens and what they make the model say.

Хронологија -> chronologija -> chronology, i.e. time-related, like February
"kwiet" is similar to "kwiecień" which means "April" in Polish (also "kviten'" in Ukrainian)

genesmith on How to Make Superbabies

No, the problem really is technical right now.

There may be additional societal and political problems afterwards. But none of those problems actually matter unless the technology works.

Obviously we are going to do it in animals first. We have in fact DONE gene editing in animals many times (especially mice, but also some minor stuff in cows and other livestock). But you're correct that we need to test massive multiplex editing. My hope is we can have good data on this in cows in the next 1-3 years.

genesmith on How to Make Superbabies

I don't understand your question.

crazy-philosopher on How to Make Superbabies

Another problem with the author's calculation of the potential to improve intelligence: suppose there is a problem in the human brain that reduces IQ by 10 points, and it can be solved by either Gene1 or Gene2. Suppose also that 99% of humans have neither Gene1 nor Gene2. In this case, the author's method would show that if we added both Gene1 and Gene2 to the same person, their IQ would increase by 20 points, even though the real gain would be only 10.
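
The scenario above can be checked with a small sketch. Everything here is a hypothetical toy setup rather than the post's actual method: the group counts, the `iq` function, and the per-gene "marginal effect" estimate (carriers' mean minus non-carriers' mean) are assumptions chosen only to illustrate how summing per-gene effects can overstate the gain when two genes fix the same problem.

```python
# Hypothetical toy population: (has_gene1, has_gene2, number_of_people).
# 99% carry neither gene; nobody in the sample carries both.
groups = [(0, 0, 9900), (1, 0, 50), (0, 1, 50)]


def iq(g1, g2):
    """Assumed ground truth: either gene alone fixes the 10-point problem."""
    return 100 if (g1 or g2) else 90


def mean_iq(rows):
    """Population-weighted mean IQ over a list of genotype groups."""
    total = sum(n for _, _, n in rows)
    return sum(iq(g1, g2) * n for g1, g2, n in rows) / total


def marginal_effect(index):
    """Mean IQ of carriers minus mean IQ of non-carriers for one gene."""
    carriers = [r for r in groups if r[index] == 1]
    non_carriers = [r for r in groups if r[index] == 0]
    return mean_iq(carriers) - mean_iq(non_carriers)


# Additive prediction: sum the two per-gene effects.
additive_prediction = marginal_effect(0) + marginal_effect(1)

# True gain from adding both genes to someone who has neither.
true_gain = iq(1, 1) - iq(0, 0)

print(f"additive prediction: {additive_prediction:.2f} points")
print(f"true gain:           {true_gain} points")
```

In this toy setup the additive estimate comes out at roughly twice the true gain, because the sample contains no one with both genes to reveal the redundancy.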

dalcy on [Intuitive self-models] 1. Preliminaries

Curious about the claim regarding bistable perception as the brain "settling" differently on two distinct but roughly equally plausible generative model parameters behind an observation. In standard statistical terms, should I think of it as: two parameters having similarly high Bayesian posterior probability, but the brain not explicitly representing this posterior, instead using something like local hill climbing to find a local MAP solution—bistable perception corresponding to the two different solutions this process converges to?

If correct, to what extent should I interpret the brain as finding a single solution (MLE/MAP) versus representing a superposition or distribution over multiple solutions (fully Bayesian)? Specifically, in which context should I interpret the phrase "the brain settling on two different generative models"?
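
A minimal sketch of the "local hill climbing to a local MAP solution" picture described above, under stated assumptions: the posterior is a hypothetical equal mixture of two unit-variance Gaussians centered at -2 and +2, and `climb` is plain gradient ascent on the log posterior. It makes no claim about what the brain actually does; it only shows how one bimodal posterior can yield two different point estimates depending on where the search starts.

```python
import math

# Hypothetical bimodal posterior: two equally weighted unit-variance bumps.
CENTERS = (-2.0, 2.0)


def posterior(x):
    """Unnormalized posterior density at x."""
    return sum(math.exp(-0.5 * (x - c) ** 2) for c in CENTERS)


def grad_log_posterior(x):
    """Derivative of log posterior(x) with respect to x."""
    weights = [math.exp(-0.5 * (x - c) ** 2) for c in CENTERS]
    return sum(-(x - c) * w for c, w in zip(CENTERS, weights)) / sum(weights)


def climb(x, step=0.1, iterations=2000):
    """Gradient ascent on the log posterior from starting point x."""
    for _ in range(iterations):
        x += step * grad_log_posterior(x)
    return x


# Two nearby starting points settle on two different local MAP estimates.
left_solution = climb(-0.5)
right_solution = climb(0.5)

print(f"start -0.5 -> {left_solution:.3f}")
print(f"start +0.5 -> {right_solution:.3f}")
print(f"posterior heights: {posterior(left_solution):.4f}, {posterior(right_solution):.4f}")
```

Both solutions have the same posterior height here, so neither is preferred; a fully Bayesian treatment would instead keep the whole two-peaked distribution rather than a single point.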