LessWrong 2.0 Reader


Spaciousness In Partner Dance: A Naturalism Demo
LoganStrohl (BrienneYudkowsky) · 2023-11-19T07:00:19.555Z · comments (5)
Reactions to the Executive Order
Zvi · 2023-11-01T20:40:02.438Z · comments (4)
Lying Alignment Chart
Zack_M_Davis · 2023-11-29T16:15:28.102Z · comments (17)
[link] Are language models good at making predictions?
dynomight · 2023-11-06T13:10:36.379Z · comments (14)
Anthropic Fall 2023 Debate Progress Update
Ansh Radhakrishnan (anshuman-radhakrishnan-1) · 2023-11-28T05:37:30.070Z · comments (9)
On the UK Summit
Zvi · 2023-11-07T13:10:04.895Z · comments (6)
Dialogue on the Claim: "OpenAI's Firing of Sam Altman (And Shortly-Subsequent Events) On Net Reduced Existential Risk From AGI"
johnswentworth · 2023-11-21T17:39:17.828Z · comments (84)
Testbed evals: evaluating AI safety even when it can’t be directly measured
joshc (joshua-clymer) · 2023-11-15T19:00:41.908Z · comments (2)
Some Rules for an Algebra of Bayes Nets
johnswentworth · 2023-11-16T23:53:11.650Z · comments (30)
[link] A framing for interpretability
Nina Rimsky (NinaR) · 2023-11-14T16:14:15.713Z · comments (5)
Alignment can improve generalisation through more robustly doing what a human wants - CoinRun example
Stuart_Armstrong · 2023-11-21T11:41:34.798Z · comments (9)
AI #39: The Week of OpenAI
Zvi · 2023-11-23T15:10:04.865Z · comments (8)
Intro to Superposition & Sparse Autoencoders (Colab exercises)
CallumMcDougall (TheMcDouglas) · 2023-11-29T12:56:21.608Z · comments (8)
[link] Why not electric trains and excavators?
bhauth · 2023-11-21T00:07:17.967Z · comments (39)
Reinforcement Via Giving People Cookies
Screwtape · 2023-11-15T04:34:21.119Z · comments (9)
AI Safety Research Organization Incubation Program - Expression of Interest
Alexandra Bos (AlexandraB) · 2023-11-21T10:23:14.204Z · comments (6)
[link] So you want to save the world? An account in paladinhood
Tamsin Leake (carado-1) · 2023-11-22T17:40:33.048Z · comments (19)
A to Z of things
KatjaGrace · 2023-11-17T05:20:03.134Z · comments (6)
Announcing New Beginner-friendly Book on AI Safety and Risk
Darren McKee · 2023-11-25T15:57:08.078Z · comments (2)
How to Control an LLM's Behavior (why my P(DOOM) went down)
RogerDearnaley (roger-d-1) · 2023-11-28T19:56:49.679Z · comments (30)
[link] A free to enter, 240 character, open-source iterated prisoner's dilemma tournament
Isaac King (KingSupernova) · 2023-11-09T08:24:43.277Z · comments (19)
Never Drop A Ball
Screwtape · 2023-11-23T04:15:35.834Z · comments (1)
Vote on worthwhile OpenAI topics to discuss
Ben Pace (Benito) · 2023-11-21T00:03:03.898Z · comments (55)
Raemon's Deliberate (“Purposeful?”) Practice Club
Raemon · 2023-11-14T18:24:19.335Z · comments (11)
Black Box Biology
GeneSmith · 2023-11-29T02:27:29.794Z · comments (30)
On OpenAI Dev Day
Zvi · 2023-11-09T16:10:06.646Z · comments (0)
"Epistemic range of motion" and LessWrong moderation
habryka (habryka4) · 2023-11-27T21:58:40.834Z · comments (3)
New paper shows truthfulness & instruction-following don't generalize by default
joshc (joshua-clymer) · 2023-11-19T19:27:30.735Z · comments (0)
[link] Sam Altman, Greg Brockman and others from OpenAI join Microsoft
Ozyrus · 2023-11-20T08:23:00.791Z · comments (15)
Paper out now on creatine and cognitive performance
Fabienne · 2023-11-26T10:58:29.745Z · comments (2)
AI Alignment Research Engineer Accelerator (ARENA): call for applicants
CallumMcDougall (TheMcDouglas) · 2023-11-07T09:43:41.606Z · comments (0)
Genetic fitness is a measure of selection strength, not the selection target
Kaj_Sotala · 2023-11-04T19:02:13.783Z · comments (43)
It's OK to be biased towards humans
dr_s · 2023-11-11T11:59:16.568Z · comments (69)
Thoughts on open source AI
Sam Marks (samuel-marks) · 2023-11-03T15:35:42.067Z · comments (17)
AMA: Earning to Give
jefftk (jkaufman) · 2023-11-07T16:20:10.972Z · comments (8)
[link] Theories of Change for AI Auditing
Lee Sharkey (Lee_Sharkey) · 2023-11-13T19:33:43.928Z · comments (0)
AI #37: Moving Too Fast
Zvi · 2023-11-09T17:50:04.324Z · comments (5)
Game Theory without Argmax [Part 1]
Cleo Nardo (strawberry calm) · 2023-11-11T15:59:47.486Z · comments (16)
[link] Open Phil releases RFPs on LLM Benchmarks and Forecasting
LawrenceC (LawChan) · 2023-11-11T03:01:09.526Z · comments (0)
[link] OpenAI Staff (including Sutskever) Threaten to Quit Unless Board Resigns
Seth Herd · 2023-11-20T14:20:33.539Z · comments (28)
The Assumed Intent Bias
silentbob · 2023-11-05T16:28:03.282Z · comments (13)
The Stochastic Parrot Hypothesis is debatable for the last generation of LLMs
Quentin FEUILLADE--MONTIXI (quentin-feuillade-montixi) · 2023-11-07T16:12:20.031Z · comments (20)
GPT-2030 and Catastrophic Drives: Four Vignettes
jsteinhardt · 2023-11-10T07:30:06.480Z · comments (5)
Altman firing retaliation incoming?
trevor (TrevorWiesinger) · 2023-11-19T00:10:15.645Z · comments (23)
On Overhangs and Technological Change
Roko · 2023-11-05T22:58:51.306Z · comments (19)
They are made of repeating patterns
quetzal_rainbow · 2023-11-13T18:17:43.189Z · comments (4)
Public Weights?
jefftk (jkaufman) · 2023-11-02T02:50:18.095Z · comments (19)
Job listing: Communications Generalist / Project Manager
Gretta Duleba (gretta-duleba) · 2023-11-06T20:21:03.721Z · comments (7)
Tall Tales at Different Scales: Evaluating Scaling Trends For Deception In Language Models
Felix Hofstätter · 2023-11-08T11:37:43.997Z · comments (0)
[question] why did OpenAI employees sign
bhauth · 2023-11-27T05:21:28.612Z · answers+comments (23)