LessWrong 2.0 Reader



Alternative Cancer Care As Biohacking & Book Review: Surviving "Terminal" Cancer
DenizT · 2025-01-06T07:43:52.773Z · comments (6)

AXRP Episode 38.2 - Jesse Hoogland on Singular Learning Theory
DanielFilan · 2024-11-27T06:30:03.821Z · comments (0)

[link] Suffering Is Not Pain
jbkjr · 2024-06-18T18:04:43.407Z · comments (45)

[link] Inferring the model dimension of API-protected LLMs
Ege Erdil (ege-erdil) · 2024-03-18T06:19:25.974Z · comments (3)

D&D.Sci (Easy Mode): On The Construction Of Impossible Structures
abstractapplic · 2024-05-17T00:25:42.950Z · comments (12)

Difficulty classes for alignment properties
Jozdien · 2024-02-20T09:08:24.783Z · comments (5)

[link] Romae Industriae
Maxwell Tabarrok (maxwell-tabarrok) · 2024-07-19T13:03:31.536Z · comments (2)

Intransitive Trust
Screwtape · 2024-05-27T16:55:29.294Z · comments (15)

Geometric Utilitarianism (And Why It Matters)
StrivingForLegibility · 2024-05-12T03:41:21.342Z · comments (2)

AI Safety Strategies Landscape
Charbel-Raphaël (charbel-raphael-segerie) · 2024-05-09T17:33:45.853Z · comments (1)

Adam Smith Meets AI Doomers
James_Miller · 2024-01-31T15:53:03.070Z · comments (10)

[link] GPT2, Five Years On
[deleted] · 2024-06-05T17:44:17.552Z · comments (0)

One True Love
Zvi · 2024-02-09T15:10:05.298Z · comments (7)

Musings on LLM Scale (Jul 2024)
Vladimir_Nesov · 2024-07-03T18:35:48.373Z · comments (0)

[link] The last era of human mistakes
owencb · 2024-07-24T09:58:42.116Z · comments (2)

AXRP Episode 33 - RLHF Problems with Scott Emmons
DanielFilan · 2024-06-12T03:30:05.747Z · comments (0)

[link] The Cancer Resolution?
PeterMcCluskey · 2024-07-24T00:25:17.322Z · comments (27)

If You Can Climb Up, You Can Climb Down
jefftk (jkaufman) · 2024-07-30T00:00:06.295Z · comments (9)

[link] legged robot scaling laws
bhauth · 2024-01-20T05:45:56.632Z · comments (8)

[link] AI Safety Memes Wiki
plex (ete) · 2024-07-24T18:53:04.977Z · comments (1)

Deceptive agents can collude to hide dangerous features in SAEs
Simon Lermen (dalasnoin) · 2024-07-15T17:07:33.283Z · comments (2)

[link] FTX expects to return all customer money; clawbacks may go away
Mikhail Samin (mikhail-samin) · 2024-02-14T03:43:13.218Z · comments (1)

[link] Vacuum: Theory and Technologies
ethanmorse · 2024-01-21T17:23:49.257Z · comments (0)

More on the Apple Vision Pro
Zvi · 2024-02-13T17:40:05.388Z · comments (5)

[link] Twitter thread on open-source AI
Richard_Ngo (ricraz) · 2024-07-31T00:26:11.655Z · comments (6)

One way violinists fail
Solenoid_Entity · 2024-05-29T04:08:17.675Z · comments (5)

[link] patent process problems
bhauth · 2024-07-14T21:12:04.953Z · comments (13)

UDT1.01: Logical Inductors and Implicit Beliefs (5/10)
Diffractor · 2024-04-18T08:39:13.368Z · comments (2)

Introducing REBUS: A Robust Evaluation Benchmark of Understanding Symbols
Arjun Panickssery (arjun-panickssery) · 2024-01-15T21:21:03.962Z · comments (0)

Monthly Roundup #20: July 2024
Zvi · 2024-07-23T12:50:07.991Z · comments (9)

Confusing the metric for the meaning: Perhaps correlated attributes are "natural"
NickyP (Nicky) · 2024-07-23T12:43:18.681Z · comments (3)

Rational Animations offers animation production and writing services!
Writer · 2024-03-15T17:26:07.976Z · comments (0)

LLMs can strategically deceive while doing gain-of-function research
Igor Ivanov (igor-ivanov) · 2024-01-24T15:45:08.795Z · comments (4)

AI #63: Introducing Alpha Fold 3
Zvi · 2024-05-09T14:20:03.176Z · comments (2)

A Sober Look at Steering Vectors for LLMs
Joschka Braun (joschka-braun) · 2024-11-23T17:30:00.745Z · comments (0)

Childhood and Education #8: Dealing with the Internet
Zvi · 2025-01-06T14:00:09.604Z · comments (6)

Dress Up For Secular Solstice
Gordon H.S. (gordon-schaefer) · 2024-12-15T16:28:24.607Z · comments (13)

D&D.Sci Dungeonbuilding: the Dungeon Tournament Evaluation & Ruleset
aphyer · 2025-01-07T05:02:25.929Z · comments (8)

What AI companies should do: Some rough ideas
Zach Stein-Perlman · 2024-10-21T14:00:10.412Z · comments (10)

Templates I made to run feedback rounds for Ethan Perez’s research fellows.
Henry Sleight (ResentHighly) · 2024-03-28T19:41:15.506Z · comments (0)

Monthly Roundup #16: March 2024
Zvi · 2024-03-19T13:10:05.529Z · comments (4)

The slingshot helps with learning
Wilson Wu (wilson-wu) · 2024-10-31T23:18:16.762Z · comments (0)

Experimentation (Part 7 of "The Sense Of Physical Necessity")
LoganStrohl (BrienneYudkowsky) · 2024-03-18T21:25:56.527Z · comments (0)

Important open problems in voting
Closed Limelike Curves · 2024-07-01T02:53:44.690Z · comments (1)

How good are LLMs at doing ML on an unknown dataset?
Håvard Tveit Ihle (havard-tveit-ihle) · 2024-07-01T09:04:03.687Z · comments (4)

Meme Talking Points
ymeskhout · 2024-11-06T15:27:54.024Z · comments (0)

[link] Information dark matter
Logan Kieller (logan-kieller) · 2024-10-01T15:05:41.159Z · comments (4)

Proveably Safe Self Driving Cars [Modulo Assumptions]
Davidmanheim · 2024-09-15T13:58:19.472Z · comments (26)

Sparse autoencoders find composed features in small toy models
Evan Anders (evan-anders) · 2024-03-14T18:00:43.339Z · comments (12)

Attention Output SAEs Improve Circuit Analysis
Connor Kissane (ckkissane) · 2024-06-21T12:56:07.969Z · comments (2)


Recent comments

abandon on Everywhere I Look, I See Kat Woods

I also dislike many of the posts you included here, but I feel like this is perhaps unfairly harsh on some of the matters that come down to subjective taste; while it's perfectly reasonable to find a post cringe or unfunny for your own part, not everyone will necessarily agree, and the opinions of those who enjoy this sort of content aren't incorrect per se.

As a note, since it seems like you're pretty frustrated with how many of her posts you're seeing, blocking her might be a helpful intervention; Reddit's help page says blocked users' posts are hidden from your feeds.

wassname on New, improved multiple-choice TruthfulQA

Owen, have you looked at the GitHub issues in your repo? There are other issues too. I submitted one here about wrong labels.

I really think it's worth making TruthfulQA 2.0, given the amount of usage it sees and the room for improvement.

wassname on Gaming TruthfulQA: Simple Heuristics Exposed Dataset Weaknesses

TruthfulQA is actually quite bad. I don't blame the authors, as no one has made anything better, but we really should. It's only ~800 samples. And many of them are badly labelled.

wassname on Nathan Helm-Burger's Shortform

I agree, it shows the ease of shoddy copying. But it doesn't show the ease of reverse engineering or parallel engineering.

It's just distillation, though. It doesn't reveal how o1 could be constructed; it only reveals how to efficiently copy from o1-like outputs (not from scratch). This recipe can't make o1 unless o1 already exists. That means this method of copying lets someone catch up to the leader, but not surpass them.

Some papers do attempt to replicate o1, but so far they don't quite get there: they either use distillation from a larger model (math-star, huggingface TTC) or don't match the results (see my post). Maybe we will see an open-source replication in a couple of months? That would mean only a short lag.

It's worth noting that Silicon Valley leaks like a sieve. And this is a feature, not a bug. Part of the reason it became the techno-VC centre of the world is that they banned non-competes. So you can take your competitor's trade secrets if you are willing to pay millions to poach some of their engineers. This is why some ML engineers get paid millions: it's not the skill, it's the trade secrets that competitors are paying for (and sometimes the brand name). This has been great for tech and civilisation, but it's not so great for maintaining a technology lead.

christiankl on Unregulated Peptides: Does BPC-157 hold its promises?

That's not a good data point. If you want to provide anecdotal data, it would be good to provide more of the observations. How long did he have a shoulder issue before taking BPC-157? How fast did it go away afterward?

benquo on Rough Sketch for Product to Enhance Citizen Participation in Politics

Your proposal is well-structured and interesting but has a fundamental flaw that needs to be addressed. Interest keyword-based filtering will primarily encourage politics-as-identity, which is actively harmful - it directs attention towards zero-sum thinking and performative identities, rather than creative problem solving. As Bryan Caplan demonstrates in The Myth of the Rational Voter, people already tend to vote to express identities and affiliations rather than to achieve better outcomes. We shouldn't build tools that further entrench this destructive pattern.

Instead, imagine a tool that:

Has users journal daily about their life - activities, hopes, problems, and worries
Uses AI to identify where their constraints are plausibly caused by or could be alleviated by government action, especially local government
Maps them to specific opportunities for formal recourse, with guidance on process, likely outcomes, and practical assistance (like drafting letters or legal documents)
For issues requiring collective action, connects users facing similar constraints and helps coordinate through mechanisms like dominant assurance contracts where appropriate

This approach would ground political participation in the solving of one's own problems rather than identity expression. While technically more challenging to implement than interest-based filtering, it would generate higher-quality engagement that expands our collective problem-solving capacity rather than just reallocating political power between existing interest groups.

The patterns emerging from aggregated user experiences would naturally reveal systemic issues and preventive opportunities, especially in how regulations and policies interact to shape people's choices and planning horizons. While building reliable AI judgment about political causation is challenging, it's better to attempt something hard that would be beneficial if feasible, than to facilitate the destructive forces of identity-based politics simply because they're easier to implement.

waterlubber on Unregulated Peptides: Does BPC-157 hold its promises?

Anecdotal data point: an (online) friend of mine with EDS successfully used BPC-157 to treat a shoulder ligament injury, although apparently it promoted scar tissue formation as well. He claims that it produced a significant improvement in his symptoms.

yonatan-cale-1 on Yonatan Cale's Shortform

More on starting early:

Imagine a lab starts working in an air-gapped network, and one of the 1000 problems that come up is working from home.

If that problem comes up now (early), then we can say "okay, working from home is allowed", and we'll add that problem to the queue of things that we'll prioritize and solve. We can also experiment with it: Maybe we can open another secure office closer to the employee's house, would they like that? If so, we could discuss fancy ways to secure the communication between the offices. If not, we can try something else.

If that problem comes up when security is critical (if we wait), then the solution will be "no more working from home, period". The security staff will be too overloaded with other problems to experiment with having another office or to sign a deal with Cursor.

anthonyc on Passages I Highlighted in The Letters of J.R.R.Tolkien

Edit to add: Just thinking about the converse, you could also make it sound more ridiculous by rewriting it with more obscure parts of the legendarium.

Conquer Morgoth with Ungoliant. Turn Maiar into balrogs. Glamdring among the morgul-blades.

sharmake-farah on What Is The Alignment Problem?

Third reason “patterns not holding” is less central an issue than it might seem: the Generalized Correspondence Principle. When quantum mechanics or general relativity came along, they still had to agree with classical mechanics in all the (many) places where classical mechanics worked. More generally: if some pattern in fact holds, then it will still be true that the pattern held under the original context even if later data departs from the pattern, and typically the pattern will generalize in some way to the new data. Prototypical example: maybe in the blegg/rube example, some totally new type of item is introduced, a gold donut (“gonut”). And then we’d have a whole new cluster, but the two old clusters are still there; the old pattern is still present in the environment.

While a trivial version of something like this holds true, the correspondence principle doesn't apply everywhere. There are two positive results on a correspondence theorem holding, but there is also a negative result: the correspondence principle is false in the general case of physical laws/rules whose only requirement is that they be Turing-computable. That means there's no way to make theories all add up to normality in all cases.

More here:

https://www.lesswrong.com/posts/XMGWdfTC7XjgTz3X7/a-correspondence-theorem-in-the-maximum-entropy-framework

https://www.lesswrong.com/posts/FWuByzM9T5qq2PF2n/a-correspondence-theorem

https://www.lesswrong.com/posts/74crqQnH8v9JtJcda/egan-s-theorem#oZNLtNAazf3E5bN6X

https://www.lesswrong.com/posts/74crqQnH8v9JtJcda/egan-s-theorem#M6MfCwDbtuPuvoe59

https://www.lesswrong.com/posts/74crqQnH8v9JtJcda/egan-s-theorem#XQDrXyHSJzQjkRDZc