LessWrong 2.0 Reader

[link] The Best Bits From Build, Baby, Build
Maxwell Tabarrok (maxwell-tabarrok) · 2024-07-11T14:09:10.131Z · comments (0)

Inducing human-like biases in moral reasoning LMs
Artyom Karpov (artkpv) · 2024-02-20T16:28:11.424Z · comments (3)

AXRP Episode 34 - AI Evaluations with Beth Barnes
DanielFilan · 2024-07-28T03:30:07.192Z · comments (0)

The Garden of Eden
Alexander Turok · 2024-07-22T16:07:42.509Z · comments (2)

2024 Unofficial LW Community Census, Request for Comments
Screwtape · 2024-11-01T16:34:14.758Z · comments (32)

[link] Letter from an Alien Mind
Shoshannah Tekofsky (DarkSym) · 2024-12-27T13:20:49.277Z · comments (7)

Don’t Legalize Drugs
Declan Molony (declan-molony) · 2025-01-14T06:51:14.005Z · comments (7)

[link] NAO Updates, January 2025
jefftk (jkaufman) · 2025-01-10T03:37:36.698Z · comments (0)

Complete Feedback
abramdemski · 2024-11-01T16:58:50.183Z · comments (7)

Evolution's selection target depends on your weighting
tailcalled · 2024-11-19T18:24:53.117Z · comments (22)

[link] Human-AI Complementarity: A Goal for Amplified Oversight
rishubjain · 2024-12-24T09:57:55.111Z · comments (3)

The current state of RSPs
Zach Stein-Perlman · 2024-11-04T16:00:42.630Z · comments (2)

Improving Our Safety Cases Using Upper and Lower Bounds
Yonatan Cale (yonatan-cale-1) · 2025-01-16T00:01:49.043Z · comments (0)

From the outside, American schooling is weird
Jacob G-W (g-w1) · 2024-03-28T22:45:30.485Z · comments (4)

[link] Public computers can make addictive tools safe
dkl9 · 2024-12-11T19:55:22.818Z · comments (0)

[link] A Defense of Peer Review
Niko_McCarty (niko-2) · 2024-10-22T16:16:49.982Z · comments (1)

[link] [EA xpost] The Rationale-Shaped Hole At The Heart Of Forecasting
dschwarz · 2024-04-02T17:40:44.278Z · comments (2)

Less Anti-Dakka
Mateusz Bagiński (mateusz-baginski) · 2024-05-31T09:07:10.450Z · comments (5)

[link] Foundations - Why Britain has stagnated [crosspost]
Nathan Young · 2024-09-23T10:43:20.411Z · comments (1)

Would you benefit from, or object to, a page with LW users' reacts?
Raemon · 2024-08-20T16:35:47.568Z · comments (6)

Launching Adjacent News
Lucas Kohorst (lucas-kohorst) · 2024-10-16T17:58:10.289Z · comments (0)

[link] Increasing IQ by 10 Points is Possible
George3d6 · 2024-03-19T20:48:41.277Z · comments (51)

Rashomon - A newsbetting site
ideasthete · 2024-10-15T18:15:02.476Z · comments (8)

Apply to the Cooperative AI PhD Fellowship by October 14th!
Lewis Hammond (lewis-hammond-1) · 2024-10-05T12:41:24.093Z · comments (0)

[question] Money Pump Arguments assume Memoryless Agents. Isn't this Unrealistic?
Dalcy (Darcy) · 2024-08-16T04:16:23.159Z · answers+comments (6)

Disentangling Competence and Intelligence
Robert Kralisch (nonmali-1) · 2024-04-29T00:12:50.779Z · comments (7)

[link] The unreasonable effectiveness of plasmid sequencing as a service
Abhishaike Mahajan (abhishaike-mahajan) · 2024-10-08T02:02:55.352Z · comments (2)

Deception and Jailbreak Sequence: 1. Iterative Refinement Stages of Deception in LLMs
Winnie Yang (winnie-yang) · 2024-08-22T07:32:07.600Z · comments (1)

[link] The Offense-Defense Balance of Gene Drives
Maxwell Tabarrok (maxwell-tabarrok) · 2024-09-27T16:47:25.976Z · comments (1)

[link] [Talk transcript] What “structure” is and why it matters
Alex_Altair · 2024-07-25T15:49:00.844Z · comments (0)

[link] Should I Finish My Bachelor's Degree?
Zack_M_Davis · 2024-05-11T05:17:40.067Z · comments (13)

LessWrong audio: help us choose the new voice
PeterH · 2024-12-11T02:24:37.026Z · comments (0)

[link] Being Present is Not a Skill
Chipmonk · 2024-12-18T01:11:04.715Z · comments (8)

The Second Gemini
Zvi · 2024-12-17T15:50:06.373Z · comments (0)

Partitioned Book Club
jenn (pixx) · 2024-05-12T18:38:53.315Z · comments (6)

Gizmo Watch Review
jefftk (jkaufman) · 2024-06-18T20:00:02.247Z · comments (3)

New paper on aligning AI with human values
ryan.lowe · 2024-03-30T23:39:20.288Z · comments (3)

Offering service as a sensayer for simulationist-adjacent beliefs.
mako yass (MakoYass) · 2024-05-22T18:52:05.576Z · comments (0)

[link] social lemon markets
bhauth · 2024-04-25T02:18:04.480Z · comments (6)

Geoffrey Hinton on the Past, Present, and Future of AI
Stephen McAleese (stephen-mcaleese) · 2024-10-12T16:41:56.796Z · comments (5)

[link] Miles Brundage: Finding Ways to Credibly Signal the Benignness of AI Development and Deployment is an Urgent Priority
Zach Stein-Perlman · 2024-10-28T17:00:18.660Z · comments (4)

3a. Towards Formal Corrigibility
Max Harms (max-harms) · 2024-06-09T16:53:45.386Z · comments (2)

AI Safety Evaluations: A Regulatory Review
Elliot Mckernon (elliot) · 2024-03-19T15:05:23.769Z · comments (1)

[link] How to choose what to work on
jasoncrawford · 2024-09-18T20:39:12.316Z · comments (6)

Interpretability: Integrated Gradients is a decent attribution method
Lucius Bushnaq (Lblack) · 2024-05-20T17:55:22.893Z · comments (7)

[question] How was Less Online for you?
Gordon Seidoh Worley (gworley) · 2024-06-03T17:10:33.766Z · answers+comments (4)

Why Isn't Tesla Level 3?
jefftk (jkaufman) · 2024-12-11T14:50:01.159Z · comments (7)

"The Singularity Is Nearer" by Ray Kurzweil - Review
Lavender (Kevin92) · 2024-07-08T21:32:27.307Z · comments (0)

Why I'm bearish on mechanistic interpretability: the shards are not in the network
tailcalled · 2024-09-13T17:09:25.407Z · comments (40)

[link] Day Zero Antivirals for Future Pandemics
Niko_McCarty (niko-2) · 2024-08-26T15:18:33.858Z · comments (2)

Recent comments

habryka4 on Habryka's Shortform Feed

Yep, when the fundraising post went live, i.e. November 29th.

t3t on Everywhere I Look, I See Kat Woods

I was thinking the same thing. This post badly, badly clashes with the vibe of Less Wrong. I think you should delete it, and repost to a site in which catty takedowns are part of the vibe. Less Wrong is not the place for it.

I think this is a misread of LessWrong's "vibes" and would discourage other people from thinking of LessWrong as a place where such discussions should be avoided by default.

With the exception of the title, I think the post does a decent job at avoiding making it personal.

embee on Open Thread Winter 2024/2025

Hi! I'm Embee but you can call me Max.

I'm a graduate student in mathematics for quantum physics, considering redirecting my focus toward AI alignment research. My background includes:
- Graduate-level mathematics
- Focus on quantum physics
- Programming experience with Python
- Interest in type theory and formal systems

I'm particularly drawn to MIRI-style approaches and interested in:
- Formal verification methods
- Decision theory implementation
- Logical induction
- Mathematical bounds on AI systems

My current program feels too theoretical and disconnected from urgent needs. I'm looking to:
- Connect with alignment researchers
- Find concrete projects to contribute to
- Apply mathematical rigor to safety problems
- Work on practical implementations

Regarding timelines: I have significant concerns about rapid capability advances, particularly given recent developments (o3). I'm prioritizing work that could contribute meaningfully in a compressed timeframe.

Looking for guidance on:
- Most neglected mathematical approaches to alignment
- Collaboration opportunities
- Where to start contributing effectively
- Balance between theory and implementation

huera on Unregulated Peptides: Does BPC-157 hold its promises?

A blogger who goes by Troof created a huge questionnaire to get people to report their experiences with various nootropics including peptides. He writes:
Selank, Semax, Cerebrolysin, BPC-157 are all peptides, and they are all in the green “uncommon-but-great” rectangle above. Their mean ratings are excellent, but their probabilities of changing your life are especially impressive: between 5 and 20% for Cerebrolysin (which matches anecdotal reports), between 2 and 13% for BPC-157, and between 3 and 7% for Semax.

This article pretty much convinced me that Cerebrolysin doesn't work (as a nootropic), which made me quite sceptical of all popular peptides, since it's also the highest-rated one in Troof's survey.

embee on Welcome & FAQ!

The best pathway towards becoming a member is to produce lots of great AI Alignment content, and to post it to LessWrong and participate in discussions there. The LessWrong/Alignment Forum admins monitor activity on both sites, and if someone consistently contributes to Alignment discussions on LessWrong that get promoted to the Alignment Forum, then it’s quite possible full membership will be offered.

Got it. Thanks.

quetzal_rainbow on How do fictional stories illustrate AI misalignment?

I think, collusion between AIs?

zy on Habryka's Shortform Feed

Out of curiosity - what was the time span of the raise that achieved this goal, and when did it first start? Was it 2 months ago?

faul_sname on LLMs for language learning

Adapting spaced repetition to interruptions in usage: Even without parsing the user’s responses (which would make this robust to difficult audio conditions), if the reader rewinds or pauses on some answers, the app should be able to infer that the user is having some difficulty with the relevant material, and dynamically generate new content that repeats those words or grammatical forms sooner than the default.

Likewise, if the user takes a break for a few days, weeks, or months, the ratio of old to new material should automatically adjust accordingly, as forgetting is more likely, especially of relatively new material. (And of course with text to speech, an interactive app that interpreted responses from the user could and should be able to replicate LanguageZen’s ability to specifically identify (and explain) which part of a user’s response was incorrect, and why, and use this information to adjust the schedule on which material is reviewed or introduced.)

Seems like this one is mostly a matter of schlep rather than capability. The abilities you would need to make this happen are

1. Have a highly granular curriculum for what vocabulary and what skills are required to learn the language and a plan for what order to teach them in / what spaced repetition schedule to aim for
2. Have a granular and continuously updated model of the user's current knowledge of vocabulary, rules of grammar and acceptability, idioms, and any phonemes or phoneme sequences they have trouble with
3. Given specific highly granular learning goals (e.g. "understanding when to use preterite vs imperfect when conjugating saber" in Spanish) within the curriculum and the model of the user's knowledge and abilities, produce exercises which teach / evaluate those specific skills.
4. Determine whether the user had trouble with the exercise, and if so what the trouble was
5. Based on the type of trouble the user had, describe what updates should be made to the model of the user's knowledge and vocabulary
6. Correctly apply the updates from (6)
7. Adapt to deviations from the spaced repetition plan (tbh this seems like the sort of thing you would want to do with normal code)

I expect that the hardest things here will be 1, 2, and 6, and I expect them to be hard because of the volume of required work rather than the technical difficulty. But I also expect the LanguageZen folks have already tried this and could give you a more detailed view about what the hard bits are here.
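
To illustrate the last item in the list above (adapting to deviations from the spaced repetition plan with ordinary code), here is a minimal Python sketch. The `Card` structure, the `adjust_after_break` function, and the "subtract the days overdue from the interval" heuristic are all assumptions made up for illustration; none of this comes from LanguageZen or any real spaced repetition library.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Card:
    """One reviewable item (hypothetical structure, for illustration only)."""
    prompt: str
    interval_days: int  # current gap between reviews
    due: date           # next scheduled review


def adjust_after_break(card: Card, today: date) -> Card:
    """Shorten the interval of an overdue card by the number of days it is late.

    Assumed heuristic: the interval never drops below 1 day, and a card that
    is not overdue is returned unchanged. Short-interval (newer) cards hit the
    1-day floor quickly, reflecting that new material is forgotten faster.
    """
    overdue_days = (today - card.due).days
    if overdue_days <= 0:
        return card
    new_interval = max(1, card.interval_days - overdue_days)
    # An overdue card becomes reviewable immediately.
    return Card(card.prompt, new_interval, today)
```

With this heuristic, a ten-day break barely changes a well-established card (a 100-day interval becomes 90 days) but resets a recently introduced one (a 2-day interval becomes 1 day), which shifts the mix toward reviewing newer material after a pause.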

Automatic customization of content through passive listening

This sounds like either a privacy nightmare or a massive battery drain. The good language models are quite compute intensive, so running them on a battery-powered phone will drain the battery very fast. Especially since this would need to hook into the "granular model of what the user knows" piece.

metawrong on Shortform

How does this explain the decoy effect? [1]

[1] I am not sure how real and how well researched the 'decoy effect' is.

shankar-sivarajan on Why abandon “probability is in the mind” when it comes to quantum dynamics?

You might also like this short summary from MinutePhysics: