LessWrong 2.0 Reader

View: New · Old · Top

Restrict date range: Today · This week · This month · Last three months · This year · All time

← previous page (newer posts) · next page (older posts) →

The Limitations of GPT-4
p.b. · 2023-11-24T15:30:30.933Z · comments (12)

Smartphone Etiquette: Suggestions for Social Interactions
Declan Molony (declan-molony) · 2024-06-04T06:01:03.336Z · comments (4)

Losing Metaphors: Zip and Paste
jefftk (jkaufman) · 2023-11-29T20:31:07.464Z · comments (6)

Agent membranes/boundaries and formalizing “safety”
Chipmonk · 2024-01-03T17:55:21.018Z · comments (46)

The Sequences on YouTube
Neil (neil-warren) · 2024-01-07T01:44:39.663Z · comments (9)

What is the best argument that LLMs are shoggoths?
JoshuaFox · 2024-03-17T11:36:23.636Z · comments (22)

Facebook is Paying Me to Post
jefftk (jkaufman) · 2023-11-14T19:10:07.303Z · comments (5)

D&D.Sci Hypersphere Analysis Part 3: Beat it with Linear Algebra
aphyer · 2024-01-16T22:44:52.424Z · comments (1)

Ideas for Next-Generation Writing Platforms, using LLMs
ozziegooen · 2024-06-04T18:40:24.636Z · comments (4)

Clipboard Filtering
jefftk (jkaufman) · 2024-04-14T20:50:02.256Z · comments (1)

Control Symmetry: why we might want to start investigating asymmetric alignment interventions
domenicrosati · 2023-11-11T17:27:10.636Z · comments (1)

[question] What ML gears do you like?
Ulisse Mini (ulisse-mini) · 2023-11-11T19:10:11.964Z · answers+comments (4)

[question] Impressions from base-GPT-4?
mishka · 2023-11-08T05:43:23.001Z · answers+comments (25)

Beta Tester Request: Rallypoint Bounties
lukemarks (marc/er) · 2024-05-25T09:11:11.446Z · comments (4)

To Boldly Code
StrivingForLegibility · 2024-01-26T18:25:59.525Z · comments (4)

D&D.Sci Hypersphere Analysis Part 4: Fine-tuning and Wrapup
aphyer · 2024-01-18T03:06:39.344Z · comments (5)

AXRP Episode 30 - AI Security with Jeffrey Ladish
DanielFilan · 2024-05-01T02:50:04.621Z · comments (0)

$250K in Prizes: SafeBench Competition Announcement
ozhang (oliver-zhang) · 2024-04-03T22:07:41.171Z · comments (0)

[link] Let's Design A School, Part 2.2 School as Education - The Curriculum (General)
Sable · 2024-05-07T19:22:21.730Z · comments (3)

[link] Executive Dysfunction 101
DaystarEld · 2024-05-23T12:43:13.785Z · comments (1)

Fact Finding: Simplifying the Circuit (Post 2)
Senthooran Rajamanoharan (SenR) · 2023-12-23T02:45:49.675Z · comments (3)

Testing for consequence-blindness in LLMs using the HI-ADS unit test.
David Scott Krueger (formerly: capybaralet) (capybaralet) · 2023-11-24T23:35:29.560Z · comments (2)

Virtually Rational - VRChat Meetup
Tomás B. (Bjartur Tómas) · 2024-01-28T05:52:36.934Z · comments (3)

If a little is good, is more better?
DanielFilan · 2023-11-04T07:10:05.943Z · comments (16)

[link] OpenAI Superalignment: Weak-to-strong generalization
Dalmert · 2023-12-14T19:47:24.347Z · comments (3)

Decent plan prize announcement (1 paragraph, $1k)
lukehmiles (lcmgcd) · 2024-01-12T06:27:44.495Z · comments (19)

[link] **In defence of Helen Toner, Adam D'Angelo, and Tasha McCauley**
mrtreasure · 2023-12-06T02:02:32.004Z · comments (3)

Fun With The Tabula Muris (Senis)
sarahconstantin · 2024-09-20T18:20:01.901Z · comments (0)

[link] Introduction to Super Powers (for kids!)
Shoshannah Tekofsky (DarkSym) · 2024-09-20T17:17:27.070Z · comments (0)

Trying to be rational for the wrong reasons
Viliam · 2024-08-20T16:18:06.385Z · comments (8)

Standard SAEs Might Be Incoherent: A Choosing Problem & A “Concise” Solution
Kola Ayonrinde (kola-ayonrinde) · 2024-10-30T22:50:45.642Z · comments (0)

[question] When can I be numerate?
FinalFormal2 · 2024-09-12T04:05:27.710Z · answers+comments (3)

[link] Beware the science fiction bias in predictions of the future
Nikita Sokolsky (nikita-sokolsky) · 2024-08-19T05:32:47.372Z · comments (20)

[link] Conventional footnotes considered harmful
dkl9 · 2024-10-01T14:54:01.732Z · comments (16)

[link] SB 1047 gets vetoed
ryan_b · 2024-09-30T15:49:38.609Z · comments (1)

[link] Death notes - 7 thoughts on death
Nathan Young · 2024-10-28T15:01:13.532Z · comments (1)

The case for more Alignment Target Analysis (ATA)
Chi Nguyen · 2024-09-20T01:14:41.411Z · comments (13)

A Triple Decker for Elfland
jefftk (jkaufman) · 2024-10-11T01:50:02.332Z · comments (0)

You're Playing a Rough Game
jefftk (jkaufman) · 2024-10-17T19:20:06.251Z · comments (2)

the Daydication technique
chaosmage · 2024-10-18T21:47:46.448Z · comments (0)

[question] When engaging with a large amount of resources during a literature review, how do you prevent yourself from becoming overwhelmed?
corruptedCatapillar · 2024-11-01T07:29:49.262Z · answers+comments (2)

[link] UK AISI: Early lessons from evaluating frontier AI systems
Zach Stein-Perlman · 2024-10-25T19:00:21.689Z · comments (0)

[link] A primer on the next generation of antibodies
Abhishaike Mahajan (abhishaike-mahajan) · 2024-09-01T22:37:59.207Z · comments (0)

Useful starting code for interpretability
eggsyntax · 2024-02-13T23:13:47.940Z · comments (2)

The Wisdom of Living for 200 Years
Martin Sustrik (sustrik) · 2024-06-28T04:44:10.609Z · comments (3)

[question] How to Model the Future of Open-Source LLMs?
Joel Burget (joel-burget) · 2024-04-19T14:28:00.175Z · answers+comments (9)

An experiment on hidden cognition
Olli Järviniemi (jarviniemi) · 2024-07-22T03:26:05.564Z · comments (2)

[link] MIRI's July 2024 newsletter
Harlan · 2024-07-15T21:28:17.343Z · comments (2)

[link] The Best Essay (Paul Graham)
Chris_Leong · 2024-03-11T19:25:42.176Z · comments (2)

Housing Roundup #9: Restricting Supply
Zvi · 2024-07-17T12:50:05.321Z · comments (8)

← previous page (newer posts) · next page (older posts) →

Archive

Recent comments

deepthoughtlife on Should CA, TX, OK, and LA merge into a giant swing state, just for elections?

It does of course raise the difficulty level for the political maneuvering, but would make things far more credible which means that people could actually rely on it. It really is quite difficult to precommit to things you might not like, so structures that make it work seem interesting to me.

sarahconstantin on Eli's shortform feed

Shreeda Segan is working on building it, as a cashflow business. they need $10K to get to the MVP. https://manifund.org/projects/hire-a-dev-to-finish-and-launch-our-dating-site

elityre on Eli's shortform feed

The fact that there's a sex recession is pretty suggestive that tinder and the endless stream of tinder clones doesn't serve people very well.

Even if you don't assess potential romantic partners by reading their essays, like I do, OkC's match percentage meant that you could easily filter out 95% of the pool to people who are more likely to be compatible with you, along whatever metrics of compatibility you care about.

genesmith on An alternative approach to superbabies

Yeah I pretty much agree with this assessment. I think you could probably get to 80% with 100 million and ten years and maybe 50% with 30 million and 7 years. Perhaps I'm optimistic, but right now the entire field is bottlenecked by the need for $4 million to do primate testing.

david-hornbein on Eli's shortform feed

OKcupid is certainly a better product for hundreds of thousands, or possibly millions, of unusually literate people, including ~all potential developers and most people in their social circles. It's not a small niche.

sharmake-farah on Thomas Kwa's Shortform

I kind of wished you both gave some reasoning as to why you believe that the agentic AI overhang/algorithmic overhang is likely, and I also wish that Nathan Helm Burger and Vladimir Nesov discussed this topic in a dialogue post.

bogdan-ionut-cirstea on Bogdan Ionut Cirstea's Shortform

Sam Altman says AGI is coming in 2025 (and he is also expecting a child next year) https://x.com/tsarnick/status/1854988648745517297

kyle-o-brien on SAEs are highly dataset dependent: a case study on the refusal direction

Great works, folks. This further highlights a challenge that wasn't obvious to me when I first began to study SAEs — which features are learned is just super contingent on the SAE size, sparsity, and training data. Ablations like this one are important.

redbird on the case for CoT unfaithfulness is overstated

"bottleneck" of the CoT tokens. Whatever needs to be passed along from one sequential calculation step to the next must go through this bottleneck.

Does it, though? The keys and values from previous forward passes are still accessible, even if the generated token is not.

So the CoT tokens are not absolute information bottlenecks. But yes, replacing the token by a dot reduces the number of serial steps the model can perform (from mn to m+n, if there are m forward passes and n layers).

alexander-gietelink-oldenziel on Thomas Kwa's Shortform

Fwiw I basically think you are right about the agentic AI overhang and obviously so. I do think it shapes how one thinks about what's most valuable in AI alignment.