LessWrong 2.0 Reader


[New Jersey] HPMOR 10 Year Anniversary Party 🎉
🟠UnlimitedOranges🟠 (mr-mar) · 2025-02-27T22:30:26.009Z · comments (0)
[link] OpenAI releases GPT-4.5
Seth Herd · 2025-02-27T21:40:45.010Z · comments (12)
AEPF_OpenSource is Live – A New Open Standard for Ethical AI
ethoshift · 2025-02-27T20:40:18.997Z · comments (0)
The Elicitation Game: Evaluating capability elicitation techniques
Teun van der Weij (teun-van-der-weij) · 2025-02-27T20:33:24.861Z · comments (0)
For the Sake of Pleasure Alone
Greenless Mirror (mikhail-2) · 2025-02-27T20:07:54.852Z · comments (14)
[link] Keeping AI Subordinate to Human Thought: A Proposal for Public AI Conversations
syh · 2025-02-27T20:00:26.150Z · comments (0)
[link] How to Corner Liars: A Miasma-Clearing Protocol
ymeskhout · 2025-02-27T17:18:36.028Z · comments (23)
Economic Topology, ASI, and the Separation Equilibrium
mkualquiera · 2025-02-27T16:36:48.098Z · comments (11)
The Illusion of Iterative Improvement: Why AI (and Humans) Fail to Track Their Own Epistemic Drift
Andy E Williams (andy-e-williams) · 2025-02-27T16:26:52.718Z · comments (2)
AI #105: Hey There Alexa
Zvi · 2025-02-27T14:30:08.038Z · comments (3)
Space-Faring Civilization density estimates and models - Review
Maxime Riché (maxime-riche) · 2025-02-27T11:44:21.101Z · comments (0)
[link] Market Capitalization is Semantically Invalid
Zero Contradictions · 2025-02-27T11:27:47.765Z · comments (14)
Proposing Human Survival Strategy based on the NAIA Vision: Toward the Co-evolution of Diverse Intelligences
Hiroshi Yamakawa (hiroshi-yamakawa) · 2025-02-27T05:18:05.369Z · comments (0)
Short & long term tradeoffs of strategic voting
kaleb (geomaturge) · 2025-02-27T04:25:04.304Z · comments (0)
[link] Recursive alignment with the principle of alignment
hive · 2025-02-27T02:34:37.940Z · comments (1)
Kingfisher Tour February 2025
jefftk (jkaufman) · 2025-02-27T02:20:04.988Z · comments (0)
You should use Consumer Reports
KvmanThinking (avery-liu) · 2025-02-27T01:52:17.235Z · comments (5)
Universal AI Maximizes Variational Empowerment: New Insights into AGI Safety
Yusuke Hayashi (hayashiyus) · 2025-02-27T00:46:46.989Z · comments (0)
Why Can't We Hypothesize After the Fact?
David Udell · 2025-02-26T22:41:39.819Z · comments (3)
[link] "AI Rapidly Gets Smarter, And Makes Some of Us Dumber," from Sabine Hossenfelder
Evan_Gaensbauer · 2025-02-26T22:33:43.688Z · comments (9)
[link] METR: AI models can be dangerous before public deployment
UnofficialLinkpostBot (LinkpostBot) · 2025-02-26T20:19:08.640Z · comments (0)
Representation Engineering has Its Problems, but None Seem Unsolvable
Lukasz G Bartoszcze (lukasz-g-bartoszcze) · 2025-02-26T19:53:32.095Z · comments (1)
Thoughts that prompt good forecasts: A survey
Daniel_Friedrich (Hominid Dan) · 2025-02-26T18:36:02.847Z · comments (0)
The non-tribal tribes
PatrickDFarley · 2025-02-26T17:22:59.949Z · comments (4)
SAE Training Dataset Influence in Feature Matching and a Hypothesis on Position Features
Seonglae Cho (seonglae) · 2025-02-26T17:05:18.265Z · comments (3)
Fuzzing LLMs sometimes makes them reveal their secrets
Fabien Roger (Fabien) · 2025-02-26T16:48:48.878Z · comments (13)
You can just wear a suit
lsusr · 2025-02-26T14:57:57.260Z · comments (48)
[link] Matthew Yglesias - Misinformation Mostly Confuses Your Own Side
Siebe · 2025-02-26T14:55:55.627Z · comments (1)
Optimizing Feedback to Learn Faster
Towards_Keeperhood (Simon Skade) · 2025-02-26T14:24:26.835Z · comments (0)
outlining is a historically recent underutilized gift to family
daijin · 2025-02-26T13:58:17.623Z · comments (2)
Osaka
lsusr · 2025-02-26T13:50:24.102Z · comments (11)
Time to Welcome Claude 3.7
Zvi · 2025-02-26T13:00:06.489Z · comments (2)
[PAPER] Jacobian Sparse Autoencoders: Sparsify Computations, Not Just Activations
Lucy Farnik (lucy.fa) · 2025-02-26T12:50:04.204Z · comments (8)
Minor interpretability exploration #1: Grokking of modular addition, subtraction, multiplication, for different activation functions
Rareș Baron · 2025-02-26T11:35:56.610Z · comments (13)
[question] Name for Standard AI Caveat?
yrimon (yehuda-rimon) · 2025-02-26T07:07:16.523Z · answers+comments (5)
Levels of analysis for thinking about agency
Cole Wyeth (Amyr) · 2025-02-26T04:24:24.583Z · comments (0)
[link] The Stag Hunt—cultivating cooperation to reap rewards
James Stephen Brown (james-brown) · 2025-02-25T23:45:07.472Z · comments (0)
Three Levels for Large Language Model Cognition
Eleni Angelou (ea-1) · 2025-02-25T23:14:00.306Z · comments (0)
[link] [Crosspost] Strategic wealth accumulation under transformative AI expectations
arden446 · 2025-02-25T21:50:11.458Z · comments (0)
Emergent Misalignment: Narrow finetuning can produce broadly misaligned LLMs
Jan Betley (jan-betley) · 2025-02-25T17:39:31.059Z · comments (91)
[link] We Can Build Compassionate AI
Gordon Seidoh Worley (gworley) · 2025-02-25T16:37:06.160Z · comments (5)
[question] Intellectual lifehacks repo
Antoine de Scorraille (Etoile de Scauchy) · 2025-02-25T16:32:09.814Z · answers+comments (15)
Economics Roundup #5
Zvi · 2025-02-25T13:40:07.086Z · comments (10)
Making alignment a law of the universe
juggins · 2025-02-25T10:44:11.632Z · comments (3)
Revisiting Conway's Law
annebrandes (annebrandes1@gmail.com) · 2025-02-25T08:33:52.421Z · comments (4)
Demystifying the Pinocchio Paradox
Novak Zukowski (Zantarus) · 2025-02-25T06:16:57.219Z · comments (0)
Technical comparison of Deepseek, Novasky, S1, Helix, P0
Juliezhanggg · 2025-02-25T04:20:40.413Z · comments (0)
[link] Upcoming Protest for AI Safety
Matt Vincent (matthew-milone) · 2025-02-25T03:04:03.153Z · comments (0)
[link] what an efficient market feels from inside
DMMF · 2025-02-25T02:38:40.129Z · comments (9)
Metacompilation
Donald Hobson (donald-hobson) · 2025-02-24T22:58:00.085Z · comments (1)