LessWrong 2.0 Reader

Automating Mechanistic Interpretability via Program Synthesis
Edy Nastase (edy-nastase) · 2025-04-17T10:58:46.748Z · comments (1)
Memory Decoding Journal Club
Devin Ward (Carboncopies Foundation) · 2025-04-17T16:19:25.992Z · comments (0)
AI, Alignment & the Art of Relationship Design
Priyanka Bharadwaj (priyanka-bharadwaj) · 2025-04-19T00:47:02.591Z · comments (0)
Machines of Stolen Grace
Riley Tavassoli (riley-tavassoli) · 2025-03-27T18:15:23.736Z · comments (0)
I’m headed to DC this week. any tips?
Wes R · 2025-04-19T02:33:18.584Z · comments (0)
Could LLMs Learn to Detect Bias Autonomously, Like Tesla’s Self-Driving Cars?
Omnipheasant · 2025-04-18T18:45:36.242Z · comments (0)
Hierarchical Cognitive Anchoring: A Sketch Toward Scalable Structural Alignment
sparckix · 2025-04-18T19:03:51.115Z · comments (0)
Alignment Does Not Need to Be Opaque! An Introduction to Feature Steering with Reinforcement Learning
Jeremias Ferrao (jeremias-ferrao) · 2025-04-18T19:34:49.357Z · comments (0)
[question] How familiar is the Lesswrong community as a whole with the concept of Reward-modelling?
Oxidize · 2025-04-09T23:33:18.044Z · answers+comments (8)
Routine Novelty
BazingaBoy (martin-nenov) · 2025-03-31T15:47:05.217Z · comments (0)
A Fraction of Global Market Capitalization as the Best Currency
Greenless Mirror (mikhail-2) · 2025-03-31T13:30:03.970Z · comments (25)
Does the universe's recognition of measurement provide stronger evidence for being in a simulation than universal fine-tuning?
amelia (314159) · 2025-04-09T08:20:10.561Z · comments (0)
AI Needs Us? Information Theory and Humans as data
tomdekan (tomd@hey.com) · 2025-03-29T15:51:16.070Z · comments (6)
LLM-based Fact Checking for Popular Posts?
azergante · 2025-04-18T21:26:25.230Z · comments (0)
[link] Six reasons why objective morality is nonsense
Zero Contradictions · 2025-04-11T02:11:04.775Z · comments (10)
[question] How many times faster can the AGI advance the science than humans do?
StanislavKrym · 2025-03-28T15:16:52.320Z · answers+comments (0)
[link] Rethinking Friction: Equity and Motivation Across Domains
eltimbalino · 2025-04-08T03:58:02.839Z · comments (0)
Do we want too much from a potentially godlike AGI?
StanislavKrym · 2025-04-11T23:33:06.710Z · comments (0)
[link] find_purpose.exe
heatdeathandtaxes · 2025-04-12T19:31:38.951Z · comments (0)
Alignment through atomic agents
micseydel · 2025-03-27T18:43:14.569Z · comments (0)
On Downvotes, Cultural Fit, and Why I Won’t Be Posting Again
funnyfranco · 2025-03-31T19:26:27.090Z · comments (32)
Would this solve the (outer) alignment problem, or at least help?
Wes R · 2025-04-06T18:49:14.145Z · comments (1)
An argument for asexuality
filthy_hedonist (sid-kolichala) · 2025-03-27T18:08:48.624Z · comments (10)
What If Galaxies Are Alive and Atoms Have Minds? A Thought Experiment on Life Across Scales
Saif Khan (saif-khan) · 2025-04-18T10:01:18.783Z · comments (4)
A Solution to Sandbagging and other Self-Provable Misalignment: Constitutional AI Detectives
Knight Lee (Max Lee) · 2025-04-14T10:27:24.903Z · comments (2)
[link] The Cynic Wasps in the Beehive
mempko · 2025-04-12T19:30:44.227Z · comments (0)
Why Does It Feel Like Something? An Evolutionary Path to Subjectivity
gmax (maxim-gurevich) · 2025-04-15T08:38:50.637Z · comments (11)
[question] Is the ethics of interaction with primitive peoples already solved?
StanislavKrym · 2025-04-11T14:56:21.306Z · answers+comments (0)
Will the AGIs be able to run the civilisation?
StanislavKrym · 2025-03-28T04:50:07.568Z · comments (2)
A New Challenge to all Bayesians!
milanrosko · 2025-04-02T02:38:35.562Z · comments (0)
8 PRIME SKILLS An analisis
P. João (gabriel-brito) · 2025-04-17T11:36:54.678Z · comments (0)
Reframing AI Safety Through the Lens of Identity Maintenance Framework
Hiroshi Yamakawa (hiroshi-yamakawa) · 2025-04-01T06:16:45.228Z · comments (1)
How to defeat superintelligence, the Sta-Hi way
kilgoar (william-walshe) · 2025-04-09T13:58:59.541Z · comments (0)
Karel Čapek’s 'War with the Newts' 1936 review
Petr 'Margot' Andreev (petr-andreev) · 2025-04-04T23:12:39.572Z · comments (1)
Ai Cone of Probabilties - what aren't we talking about?
Marzipan · 2025-04-05T05:51:27.859Z · comments (5)
Null Rationalism
kilgoar (william-walshe) · 2025-04-05T03:26:06.034Z · comments (0)
Insect Suffering Is The Biggest Issue: What To Do About It
omnizoid · 2025-04-01T12:51:08.115Z · comments (9)
Not The End of All Value
Ben Ihrig (eternal/ephemera) · 2025-04-10T20:53:36.671Z · comments (0)
[question] How far are Western welfare states from coddling the population into becoming useless?
StanislavKrym · 2025-04-13T17:08:01.834Z · answers+comments (5)
An Unbiased Evaluation of My Debate with Thane Ruthenis - Run It Yourself
funnyfranco · 2025-04-07T18:56:47.831Z · comments (14)