LessWrong 2.0 Reader

The real political spectrum
Hzn · 2025-01-22T08:55:39.328Z · comments (0)
Evolution and the Low Road to Nash
Aydin Mohseni (aydin-mohseni) · 2025-01-22T07:06:32.305Z · comments (0)
The Human Alignment Problem for AIs
rife (edgar-muniz) · 2025-01-22T04:06:10.872Z · comments (2)
[link] When does capability elicitation bound risk?
joshc (joshua-clymer) · 2025-01-22T03:42:36.289Z · comments (0)
[question] Popular materials about environmental goals/agent foundations? People wanting to discuss such topics?
Q Home · 2025-01-22T03:30:38.066Z · answers+comments (0)
Kitchen Air Purifier Comparison
jefftk (jkaufman) · 2025-01-22T03:20:03.224Z · comments (1)
November-December 2024 Progress in Guaranteed Safe AI
Quinn (quinn-dougherty) · 2025-01-22T01:20:00.868Z · comments (0)
[link] Quotes from the Stargate press conference
Nikola Jurkovic (nikolaisalreadytaken) · 2025-01-22T00:50:14.793Z · comments (1)
[link] Tell me about yourself: LLMs are aware of their implicit behaviors
Martín Soto (martinsq) · 2025-01-22T00:47:15.023Z · comments (0)
Using the probabilistic method to bound the performance of toy transformers
Alex Gibson · 2025-01-21T23:01:38.067Z · comments (0)
[link] Training on Documents About Reward Hacking Induces Reward Hacking
evhub · 2025-01-21T21:32:24.691Z · comments (7)
Veo-2 Can Produce Realistic Ads
Logan Riggs (elriggs) · 2025-01-21T19:13:32.884Z · comments (0)
Computational Limits on Efficiency
vibhumeh · 2025-01-21T18:29:36.997Z · comments (0)
Democratizing AI Governance: Balancing Expertise and Public Participation
Lucile Ter-Minassian (lucile-ter-minassian) · 2025-01-21T18:29:06.160Z · comments (0)
Hitler was not a monster
halgir · 2025-01-21T18:21:55.777Z · comments (0)
Natural Intelligence is Overhyped
Collisteru · 2025-01-21T18:09:16.167Z · comments (0)
14+ AI Safety Advisors You Can Speak to – New AISafety.com Resource
Bryce Robertson (bryceerobertson) · 2025-01-21T17:34:02.170Z · comments (0)
[Linkpost] Why AI Safety Camp struggles with fundraising (FBB #2)
gergogaspar (gergo-gaspar) · 2025-01-21T17:27:51.965Z · comments (0)
[link] The Manhattan Trap: Why a Race to Artificial Superintelligence is Self-Defeating
Corin Katzke (corin-katzke) · 2025-01-21T16:57:00.998Z · comments (5)
[link] Links and short notes, 2025-01-20
jasoncrawford · 2025-01-21T16:10:51.813Z · comments (0)
The Case Against AI Control Research
johnswentworth · 2025-01-21T16:03:10.143Z · comments (28)
Will AI Resilience protect Developing Nations?
ejk64 · 2025-01-21T15:31:32.378Z · comments (0)
Sleep, Diet, Exercise and GLP-1 Drugs
Zvi · 2025-01-21T12:20:06.018Z · comments (1)
[link] We don't want to post again "This might be the last AI Safety Camp"
Remmelt (remmelt-ellen) · 2025-01-21T12:03:33.171Z · comments (4)
On Responsibility
silentbob · 2025-01-21T10:47:37.562Z · comments (1)
The ‘anti-woke’ are positioned to win, but can they capitalize?
Hzn · 2025-01-21T09:52:50.673Z · comments (0)
Almost all growth is exponential growth
lemonhope (lcmgcd) · 2025-01-21T07:16:24.686Z · comments (2)
Arbitrage Drains Worse Markets to Feed Better Ones
Cedar (xida-ren) · 2025-01-21T03:44:46.111Z · comments (1)
On Contact, Part 1
james.lucassen · 2025-01-21T03:10:54.429Z · comments (0)
Retrospective: 12 [sic] Months Since MIRI
james.lucassen · 2025-01-21T02:52:06.271Z · comments (0)
Easily Evaluate SAE-Steered Models with EleutherAI Evaluation Harness
Matthew Khoriaty (matthew-khoriaty) · 2025-01-21T02:02:35.177Z · comments (0)
Why We Need More Shovel-Ready AI Notkilleveryoneism Megaproject Proposals
Peter Berggren (peter-berggren) · 2025-01-20T22:38:26.593Z · comments (1)
Tips and Code for Empirical Research Workflows
John Hughes (john-hughes) · 2025-01-20T22:31:51.498Z · comments (3)
Lecture Series on Tiling Agents #2
abramdemski · 2025-01-20T21:02:25.479Z · comments (0)
Announcement: Learning Theory Online Course
Yegreg · 2025-01-20T19:55:57.598Z · comments (8)
The Hidden Status Game in Hospital Slacking
EpistemicExplorer · 2025-01-20T18:35:54.086Z · comments (2)
Monthly Roundup #26: January 2025
Zvi · 2025-01-20T15:30:08.680Z · comments (13)
Things I have been using LLMs for
Kaj_Sotala · 2025-01-20T14:20:02.600Z · comments (5)
Contextual attention heads in the first layer of GPT-2
Alex Gibson · 2025-01-20T13:24:31.803Z · comments (0)
[question] What are the chances that Superhuman Agents are already being tested on the internet?
artemium · 2025-01-20T11:09:33.835Z · answers+comments (1)
Detroit Lions -- overconfidence is overrated?
Hzn · 2025-01-20T10:53:48.574Z · comments (0)
Logits, log-odds, and loss for parallel circuits
Dmitry Vaintrob (dmitry-vaintrob) · 2025-01-20T09:56:26.031Z · comments (0)
Worries about latent reasoning in LLMs
CBiddulph (caleb-biddulph) · 2025-01-20T09:09:02.335Z · comments (3)
It is (probably) time for a Butlerian Jihad
waterlubber · 2025-01-20T05:55:17.156Z · comments (13)
SIGMI Certification Criteria
a littoral wizard · 2025-01-20T02:41:17.210Z · comments (0)
AXRP Episode 38.5 - Adrià Garriga-Alonso on Detecting AI Scheming
DanielFilan · 2025-01-20T00:40:07.077Z · comments (0)
The Monster in Our Heads
testingthewaters · 2025-01-19T23:58:11.251Z · comments (2)
[link] AI: How We Got Here—A Neuroscience Perspective
Mordechai Rorvig (mordechai-rorvig) · 2025-01-19T23:51:47.822Z · comments (0)
Agent Foundations 2025 at CMU
Alexander Gietelink Oldenziel (alexander-gietelink-oldenziel) · 2025-01-19T23:48:22.569Z · comments (10)
Who is marketing AI alignment?
MrThink (ViktorThink) · 2025-01-19T21:37:30.477Z · comments (3)