LessWrong 2.0 Reader

Significantly Enhancing Adult Intelligence With Gene Editing May Be Possible
GeneSmith · 2023-12-12T18:14:51.438Z · comments (162)
Speaking to Congressional staffers about AI risk
Akash (akash-wasil) · 2023-12-04T23:08:52.055Z · comments (23)
Constellations are Younger than Continents
Jeffrey Heninger (jeffrey-heninger) · 2023-12-19T06:12:40.667Z · comments (22)
AI Control: Improving Safety Despite Intentional Subversion
Buck · 2023-12-13T15:51:35.982Z · comments (7)
Thoughts on “AI is easy to control” by Pope & Belrose
Steven Byrnes (steve2152) · 2023-12-01T17:30:52.720Z · comments (55)
re: Yudkowsky on biological materials
bhauth · 2023-12-11T13:28:10.639Z · comments (30)
Critical review of Christiano's disagreements with Yudkowsky
Vanessa Kosoy (vanessa-kosoy) · 2023-12-27T16:02:50.499Z · comments (40)
"Humanity vs. AGI" Will Never Look Like "Humanity vs. AGI" to Humanity
Thane Ruthenis · 2023-12-16T20:08:39.375Z · comments (23)
2023 Unofficial LessWrong Census/Survey
Screwtape · 2023-12-02T04:41:51.418Z · comments (81)
Effective Aspersions: How the Nonlinear Investigation Went Wrong
TracingWoodgrains (tracingwoodgrains) · 2023-12-19T12:00:23.529Z · comments (170)
The 'Neglected Approaches' Approach: AE Studio's Alignment Agenda
Cameron Berg (cameron-berg) · 2023-12-18T20:35:01.569Z · comments (20)
The likely first longevity drug is based on sketchy science. This is bad for science and bad for longevity.
BobBurgers · 2023-12-12T02:42:18.559Z · comments (34)
[link] Succession
Richard_Ngo (ricraz) · 2023-12-20T19:25:03.185Z · comments (48)
How useful is mechanistic interpretability?
ryan_greenblatt · 2023-12-01T02:54:53.488Z · comments (53)
Most People Don't Realize We Have No Idea How Our AIs Work
Thane Ruthenis · 2023-12-21T20:02:00.360Z · comments (42)
Discussion: Challenges with Unsupervised LLM Knowledge Discovery
Seb Farquhar · 2023-12-18T11:58:39.379Z · comments (21)
Is being sexy for your homies?
Valentine · 2023-12-13T20:37:02.043Z · comments (92)
The Plan - 2023 Version
johnswentworth · 2023-12-29T23:34:19.651Z · comments (39)
AI Views Snapshots
Rob Bensinger (RobbBB) · 2023-12-13T00:45:50.016Z · comments (61)
The Dark Arts
lsusr · 2023-12-19T04:41:13.356Z · comments (49)
[link] Bayesian Injustice
Kevin Dorst · 2023-12-14T15:44:08.664Z · comments (10)
The LessWrong 2022 Review
habryka (habryka4) · 2023-12-05T04:00:00.000Z · comments (43)
Mapping the semantic void: Strange goings-on in GPT embedding spaces
mwatkins · 2023-12-14T13:10:22.691Z · comments (31)
Natural Latents: The Math
johnswentworth · 2023-12-27T19:03:01.923Z · comments (31)
Current AIs Provide Nearly No Data Relevant to AGI Alignment
Thane Ruthenis · 2023-12-15T20:16:09.723Z · comments (152)
What I Would Do If I Were Working On AI Governance
johnswentworth · 2023-12-08T06:43:42.565Z · comments (32)
Deep Forgetting & Unlearning for Safely-Scoped LLMs
scasper · 2023-12-05T16:48:18.177Z · comments (29)
Fact Finding: Attempting to Reverse-Engineer Factual Recall on the Neuron Level (Post 1)
Neel Nanda (neel-nanda-1) · 2023-12-23T02:44:24.270Z · comments (6)
On the future of language models
owencb · 2023-12-20T16:58:28.433Z · comments (17)
Nonlinear’s Evidence: Debunking False and Misleading Claims
KatWoods (ea247) · 2023-12-12T13:16:12.008Z · comments (171)
[link] The Witness
Richard_Ngo (ricraz) · 2023-12-03T22:27:16.248Z · comments (4)
"AI Alignment" is a Dangerously Overloaded Term
Roko · 2023-12-15T14:34:29.850Z · comments (98)
[question] How do you feel about LessWrong these days? [Open feedback thread]
jacobjacob · 2023-12-05T20:54:42.317Z · answers+comments (272)
Prediction Markets aren't Magic
SimonM · 2023-12-21T12:54:07.754Z · comments (29)
Meaning & Agency
abramdemski · 2023-12-19T22:27:32.123Z · comments (17)
Based Beff Jezos and the Accelerationists
Zvi · 2023-12-06T16:00:08.380Z · comments (29)
[Valence series] 1. Introduction
Steven Byrnes (steve2152) · 2023-12-04T15:40:21.274Z · comments (14)
A Crisper Explanation of Simulacrum Levels
Thane Ruthenis · 2023-12-23T22:13:52.286Z · comments (13)
[link] A Universal Emergent Decomposition of Retrieval Tasks in Language Models
Alexandre Variengien (alexandre-variengien) · 2023-12-19T11:52:27.354Z · comments (3)
Refusal mechanisms: initial experiments with Llama-2-7b-chat
Andy Arditi (andy-arditi) · 2023-12-08T17:08:01.250Z · comments (7)
OpenAI: Leaks Confirm the Story
Zvi · 2023-12-12T14:00:04.812Z · comments (9)
Send us example gnarly bugs
Beth Barnes (beth-barnes) · 2023-12-10T05:23:00.773Z · comments (10)
MATS Summer 2023 Retrospective
Rocket (utilistrutil) · 2023-12-01T23:29:47.958Z · comments (34)
EU policymakers reach an agreement on the AI Act
tlevin (trevor) · 2023-12-15T06:02:44.668Z · comments (7)
Studying The Alien Mind
Quentin FEUILLADE--MONTIXI (quentin-feuillade-montixi) · 2023-12-05T17:27:28.049Z · comments (10)
[link] The problems with the concept of an infohazard as used by the LW community [Linkpost]
Noosphere89 (sharmake-farah) · 2023-12-22T16:13:54.822Z · comments (43)
[link] The Offense-Defense Balance Rarely Changes
Maxwell Tabarrok (maxwell-tabarrok) · 2023-12-09T15:21:23.340Z · comments (23)
Neural uncertainty estimation review article (for alignment)
Charlie Steiner · 2023-12-05T08:01:32.723Z · comments (3)
Scalable Oversight and Weak-to-Strong Generalization: Compatible approaches to the same problem
Ansh Radhakrishnan (anshuman-radhakrishnan-1) · 2023-12-16T05:49:23.672Z · comments (3)
[link] Nietzsche's Morality in Plain English
Arjun Panickssery (arjun-panickssery) · 2023-12-04T00:57:42.839Z · comments (13)