LessWrong 2.0 Reader

You can just wear a suit
lsusr · 2025-02-26T14:57:57.260Z · comments (15)
Fuzzing LLMs sometimes makes them reveal their secrets
Fabien Roger (Fabien) · 2025-02-26T16:48:48.878Z · comments (5)
Time to Welcome Claude 3.7
Zvi · 2025-02-26T13:00:06.489Z · comments (0)
Osaka
lsusr · 2025-02-26T13:50:24.102Z · comments (2)
[PAPER] Jacobian Sparse Autoencoders: Sparsify Computations, Not Just Activations
Lucy Farnik (lucy.fa) · 2025-02-26T12:50:04.204Z · comments (1)
Why Can't We Hypothesize After the Fact?
David Udell · 2025-02-26T22:41:39.819Z · comments (2)
The non-tribal tribes
PatrickDFarley · 2025-02-26T17:22:59.949Z · comments (2)
Representation Engineering has Its Problems, but None Seem Unsolvable
Lukasz G Bartoszcze (lukasz-g-bartoszcze) · 2025-02-26T19:53:32.095Z · comments (0)
Optimizing Feedback to Learn Faster
Towards_Keeperhood (Simon Skade) · 2025-02-26T14:24:26.835Z · comments (0)
Levels of analysis for thinking about agency
Cole Wyeth (Amyr) · 2025-02-26T04:24:24.583Z · comments (0)
Kingfisher Tour February 2025
jefftk (jkaufman) · 2025-02-27T02:20:04.988Z · comments (0)
[link] AI models can be dangerous before public deployment
UnofficialLinkpostBot (LinkpostBot) · 2025-02-26T20:19:08.640Z · comments (0)
[link] Matthew Yglesias - Misinformation Mostly Confuses Your Own Side
Siebe · 2025-02-26T14:55:55.627Z · comments (1)
[question] Name for Standard AI Caveat?
yrimon (yehuda-rimon) · 2025-02-26T07:07:16.523Z · answers+comments (5)
Why technology usually improves exponentially
lemonhope (lcmgcd) · 2025-02-27T03:55:25.888Z · comments (0)
You should use Consumer Reports
KvmanThinking (avery-liu) · 2025-02-27T01:52:17.235Z · comments (1)
Minor interpretability exploration #1: Grokking of modular addition, subtraction, multiplication, for different activation functions
Rareș Baron · 2025-02-26T11:35:56.610Z · comments (3)
[link] AI Rapidly Gets Smarter, And Makes Some of Us Dumber
Evan_Gaensbauer · 2025-02-26T22:33:43.688Z · comments (3)
SAE Dataset Sensitivity in Feature Matching and a Hypothesis on Position Features
Seonglae Cho (seonglae) · 2025-02-26T17:05:18.265Z · comments (0)
outlining is a historically recent underutilized gift to family
daijin · 2025-02-26T13:58:17.623Z · comments (0)
[link] Recursive alignment with the principle of alignment
hive · 2025-02-27T02:34:37.940Z · comments (0)
Short & long term tradeoffs of strategic voting
kaleb (geomaturge) · 2025-02-27T04:25:04.304Z · comments (0)
Thoughts that prompt good forecasts: A survey
Daniel_Friedrich (Hominid Dan) · 2025-02-26T18:36:02.847Z · comments (0)
Universal AI Maximizes Variational Empowerment: New Insights into AGI Safety
Yusuke Hayashi (hayashiyus) · 2025-02-27T00:46:46.989Z · comments (0)
Proposing Human Survival Strategy based on the NAIA Vision: Toward the Co-evolution of Diverse Intelligences
Hiroshi Yamakawa (hiroshi-yamakawa) · 2025-02-27T05:18:05.369Z · comments (0)