LessWrong 2.0 Reader

Focus on existential risk is a distraction from the real issues. A false fallacy
Nik Samoylov (nik-samoylov) · 2023-10-30T23:42:02.066Z · comments (11)
[link] Will releasing the weights of large language models grant widespread access to pandemic agents?
jefftk (jkaufman) · 2023-10-30T18:22:59.677Z · comments (25)
[link] [Linkpost] Two major announcements in AI governance today
[deleted] · 2023-10-30T17:28:16.482Z · comments (1)
[link] Grokking Beyond Neural Networks
Jack Miller (jack-miller) · 2023-10-30T17:28:04.626Z · comments (0)
[link] Response to “Coordinated pausing: An evaluation-based coordination scheme for frontier AI developers”
Matthew Wearden (matthew-wearden) · 2023-10-30T17:27:58.166Z · comments (2)
Jailbreak and Guard Aligned Language Models with Only Few In-Context Demonstrations
Zeming Wei · 2023-10-30T17:22:31.780Z · comments (1)
5 Reasons Why Governments/Militaries Already Want AI for Information Warfare
trevor (TrevorWiesinger) · 2023-10-30T16:30:38.020Z · comments (0)
[Linkpost] Biden-Harris Executive Order on AI
beren · 2023-10-30T15:20:22.582Z · comments (0)
[link] AI Alignment [progress] this Week (10/29/2023)
Logan Zoellner (logan-zoellner) · 2023-10-30T15:02:26.265Z · comments (4)
Improving the Welfare of AIs: A Nearcasted Proposal
ryan_greenblatt · 2023-10-30T14:51:35.901Z · comments (5)
[link] President Biden Issues Executive Order on Safe, Secure, and Trustworthy Artificial Intelligence
Tristan Williams (tristan-williams) · 2023-10-30T11:15:38.422Z · comments (39)
GPT-2 XL's capacity for coherence and ontology clustering
MiguelDev (whitehatStoic) · 2023-10-30T09:24:13.202Z · comments (2)
Charbel-Raphaël and Lucius discuss Interpretability
Mateusz Bagiński (mateusz-baginski) · 2023-10-30T05:50:34.589Z · comments (7)
Multi-Winner 3-2-1 Voting
Yoav Ravid · 2023-10-30T03:31:25.776Z · comments (5)
[link] math terminology as convolution
bhauth · 2023-10-30T01:05:11.823Z · comments (1)
Grokking, memorization, and generalization — a discussion
Kaarel (kh) · 2023-10-29T23:17:30.098Z · comments (10)
[link] Comp Sci in 2027 (Short story by Eliezer Yudkowsky)
sudo · 2023-10-29T23:09:56.730Z · comments (22)
Mathematically-Defined Optimization Captures A Lot of Useful Information
J Bostock (Jemist) · 2023-10-29T17:17:03.211Z · comments (0)
Clarifying the free energy principle (with quotes)
Ryo (Flewrint Ophiuni) · 2023-10-29T16:03:31.958Z · comments (0)
[link] A new intro to Quantum Physics, with the math fixed
titotal (lombertini) · 2023-10-29T15:11:27.168Z · comments (22)
My idea of sacredness, divinity, and religion
Kaj_Sotala · 2023-10-29T12:50:07.980Z · comments (10)
The AI Boom Mainly Benefits Big Firms, but long-term, markets will concentrate
Hauke Hillebrandt (hauke-hillebrandt) · 2023-10-29T08:38:23.327Z · comments (0)
What's up with "Responsible Scaling Policies"?
habryka (habryka4) · 2023-10-29T04:17:07.839Z · comments (8)
Experiments as a Third Alternative
Adam Zerner (adamzerner) · 2023-10-29T00:39:31.399Z · comments (21)
Comparing representation vectors between llama 2 base and chat
Nina Rimsky (NinaR) · 2023-10-28T22:54:37.059Z · comments (5)
Vaniver's thoughts on Anthropic's RSP
Vaniver · 2023-10-28T21:06:07.323Z · comments (4)
Book Review: Orality and Literacy: The Technologizing of the Word
Fergus Fettes (fergus-fettes) · 2023-10-28T20:12:07.743Z · comments (0)
Regrant up to $600,000 to AI safety projects with GiveWiki
Dawn Drescher (Telofy) · 2023-10-28T19:56:06.676Z · comments (1)
[link] Shane Legg interview on alignment
Seth Herd · 2023-10-28T19:28:52.223Z · comments (20)
AI Existential Safety Fellowships
mmfli · 2023-10-28T18:07:19.773Z · comments (0)
[link] AI Safety Hub Serbia Official Opening
DusanDNesic · 2023-10-28T17:03:34.607Z · comments (0)
[link] Managing AI Risks in an Era of Rapid Progress
Algon · 2023-10-28T15:48:25.029Z · comments (3)
[question] ELI5 Why isn't alignment *easier* as models get stronger?
Logan Zoellner (logan-zoellner) · 2023-10-28T14:34:37.588Z · answers+comments (9)
Truthseeking, EA, Simulacra levels, and other stuff
Elizabeth (pktechgirl) · 2023-10-27T23:56:49.198Z · comments (12)
[question] Do you believe "E=mc^2" is a correct and/or useful equation, and, whether yes or no, precisely what are your reasons for holding this belief (with such a degree of confidence)?
l8c · 2023-10-27T22:46:51.020Z · answers+comments (14)
Value systematization: how values become coherent (and misaligned)
Richard_Ngo (ricraz) · 2023-10-27T19:06:26.928Z · comments (47)
[link] Techno-humanism is techno-optimism for the 21st century
Richard_Ngo (ricraz) · 2023-10-27T18:37:39.776Z · comments (5)
Sanctuary for Humans
nikola (nikolaisalreadytaken) · 2023-10-27T18:08:22.389Z · comments (9)
Wireheading and misalignment by composition on NetHack
pierlucadoro · 2023-10-27T17:43:41.727Z · comments (4)
We're Not Ready: thoughts on "pausing" and responsible scaling policies
HoldenKarnofsky · 2023-10-27T15:19:33.757Z · comments (33)
Aspiration-based Q-Learning
Clément Dumas (butanium) · 2023-10-27T14:42:03.292Z · comments (5)
[link] Linkpost: Rishi Sunak's Speech on AI (26th October)
bideup · 2023-10-27T11:57:46.575Z · comments (8)
ASPR & WARP: Rationality Camps for Teens in Taiwan and Oxford
Anna Gajdova (anna-gajdova) · 2023-10-27T08:40:35.436Z · comments (0)
[question] To what extent is the UK Government's recent AI Safety push entirely due to Rishi Sunak?
Stephen Fowler (LosPolloFowler) · 2023-10-27T03:29:28.465Z · answers+comments (4)
Bayesian Punishment
Rob Lucas · 2023-10-27T03:24:53.930Z · comments (1)
Online Dialogues Party — Sunday 5th November
Ben Pace (Benito) · 2023-10-27T02:41:00.506Z · comments (1)
OpenAI’s new Preparedness team is hiring
leopold · 2023-10-26T20:42:35.966Z · comments (2)
[link] Fake Deeply
Zack_M_Davis · 2023-10-26T19:55:22.340Z · comments (7)
Symbol/Referent Confusions in Language Model Alignment Experiments
johnswentworth · 2023-10-26T19:49:00.718Z · comments (44)
[link] Unsupervised Methods for Concept Discovery in AlphaZero
aogara (Aidan O'Gara) · 2023-10-26T19:05:57.897Z · comments (0)