LessWrong 2.0 Reader

Caution when interpreting Deepmind's In-context RL paper
Sam Marks (samuel-marks) · 2022-11-01T02:42:06.766Z · comments (6)
EA & LW Forums Weekly Summary (24 - 30th Oct 22')
Zoe Williams (GreyArea) · 2022-11-01T02:58:09.914Z · comments (1)
ML Safety Scholars Summer 2022 Retrospective
ThomasW (ThomasWoodside) · 2022-11-01T03:09:10.305Z · comments (0)
Conversations on Alcohol Consumption
Annapurna (jorge-velez) · 2022-11-01T05:09:34.374Z · comments (6)
[link] Remember to translate your thoughts back again
brook · 2022-11-01T08:49:12.812Z · comments (11)
Auditing games for high-level interpretability
Paul Colognese (paul-colognese) · 2022-11-01T10:44:07.630Z · comments (1)
Clarifying AI X-risk
zac_kenton (zkenton) · 2022-11-01T11:03:01.144Z · comments (24)
Threat Model Literature Review
zac_kenton (zkenton) · 2022-11-01T11:03:22.610Z · comments (4)
[link] a casual intro to AI doom and alignment
Tamsin Leake (carado-1) · 2022-11-01T16:38:31.230Z · comments (0)
On the correspondence between AI-misalignment and cognitive dissonance using a behavioral economics model
Stijn Bruers · 2022-11-01T17:39:10.433Z · comments (0)
[link] Progress links and tweets, 2022-11-01
jasoncrawford · 2022-11-01T17:48:45.562Z · comments (4)
Mildly Against Donor Lotteries
jefftk (jkaufman) · 2022-11-01T18:10:06.458Z · comments (9)
Open & Welcome Thread - November 2022
MondSemmel · 2022-11-01T18:47:40.682Z · comments (46)
AI as a Civilizational Risk Part 4/6: Bioweapons and Philosophy of Modification
PashaKamyshev · 2022-11-01T20:50:54.078Z · comments (1)
[question] Which Issues in Conceptual Alignment have been Formalised or Observed (or not)?
ojorgensen · 2022-11-01T22:32:25.243Z · answers+comments (0)
All AGI Safety questions welcome (especially basic ones) [~monthly thread]
Robert Miles (robert-miles) · 2022-11-01T23:23:04.146Z · comments (105)
[link] Real-Time Research Recording: Can a Transformer Re-Derive Positional Info?
Neel Nanda (neel-nanda-1) · 2022-11-01T23:56:06.215Z · comments (16)
Sequence Reread: Fake Beliefs [plus sequence spotlight meta]
Raemon · 2022-11-02T00:09:11.755Z · comments (3)
Information Markets
eva_ · 2022-11-02T01:24:11.639Z · comments (6)
Why is fiber good for you?
braces · 2022-11-02T02:04:35.579Z · comments (2)
AI Safety Needs Great Product Builders
goodgravy · 2022-11-02T11:33:59.283Z · comments (2)
Mind is uncountable
Filip Sondej · 2022-11-02T11:51:52.050Z · comments (22)
Housing and Transit Thoughts #1
Zvi · 2022-11-02T12:10:00.575Z · comments (5)
[link] Far-UVC Light Update: No, LEDs are not around the corner (tweetstorm)
Davidmanheim · 2022-11-02T12:57:23.445Z · comments (27)
Humans do acausal coordination all the time
Adam Jermyn (adam-jermyn) · 2022-11-02T14:40:39.730Z · comments (35)
"Are Experiments Possible?" Seeds of Science call for reviewers
rogersbacon · 2022-11-02T20:05:17.334Z · comments (0)
[question] Is there a good way to award a fixed prize in a prediction contest?
jchan · 2022-11-02T21:37:45.111Z · answers+comments (5)
AI as a Civilizational Risk Part 5/6: Relationship between C-risk and X-risk
PashaKamyshev · 2022-11-03T02:19:46.847Z · comments (0)
Lazy Python Argument Parsing
jefftk (jkaufman) · 2022-11-03T02:20:05.466Z · comments (3)
Open Letter Against Reckless Nuclear Escalation and Use
Max Tegmark (MaxTegmark) · 2022-11-03T05:34:44.529Z · comments (23)
The Mirror Chamber: A short story exploring the anthropic measure function and why it can matter
mako yass (MakoYass) · 2022-11-03T06:47:56.376Z · comments (13)
The Rational Utilitarian Love Movement (A Historical Retrospective)
CBiddulph (caleb-biddulph) · 2022-11-03T07:11:28.679Z · comments (0)
Information Markets 2: Optimally Shaped Reward Bets
eva_ · 2022-11-03T11:08:49.126Z · comments (0)
K-types vs T-types — what priors do you have?
Cleo Nardo (strawberry calm) · 2022-11-03T11:29:00.809Z · comments (25)
[link] Adversarial Policies Beat Professional-Level Go AIs
sanxiyn · 2022-11-03T13:27:00.059Z · comments (35)
Covid 11/3/22: Asking Forgiveness
Zvi · 2022-11-03T13:50:00.448Z · comments (3)
Multiple Deploy-Key Repos
jefftk (jkaufman) · 2022-11-03T15:10:03.820Z · comments (0)
Why do we post our AI safety plans on the Internet?
Peter S. Park · 2022-11-03T16:02:21.428Z · comments (4)
A Mystery About High Dimensional Concept Encoding
Fabien Roger (Fabien) · 2022-11-03T17:05:56.034Z · comments (13)
AI as a Civilizational Risk Part 6/6: What can be done
PashaKamyshev · 2022-11-03T19:48:52.376Z · comments (4)
Further considerations on the Evidentialist's Wager
Martín Soto (martinsq) · 2022-11-03T20:06:31.997Z · comments (9)
[Video] How having Fast Fourier Transforms sooner could have helped with Nuclear Disarmament - Veritaserum
mako yass (MakoYass) · 2022-11-03T21:04:35.839Z · comments (1)
[question] Could a Supreme Court suit work to solve NEPA problems?
ChristianKl · 2022-11-03T21:10:48.344Z · answers+comments (0)
Mechanistic Interpretability as Reverse Engineering (follow-up to "cars and elephants")
David Scott Krueger (formerly: capybaralet) (capybaralet) · 2022-11-03T23:19:20.458Z · comments (3)
[question] Don't you think RLHF solves outer alignment?
Charbel-Raphaël (charbel-raphael-segerie) · 2022-11-04T00:36:36.527Z · answers+comments (23)
[question] Are alignment researchers devoting enough time to improving their research capacity?
Carson Jones · 2022-11-04T00:58:21.349Z · answers+comments (3)
A newcomer’s guide to the technical AI safety field
zeshen · 2022-11-04T14:29:46.873Z · comments (3)
A new place to discuss cognitive science, ethics and human alignment
Daniel_Friedrich (Hominid Dan) · 2022-11-04T14:34:15.632Z · comments (4)
Weekly Roundup #4
Zvi · 2022-11-04T15:00:01.096Z · comments (1)
Monthly Shorts 10/22
Celer · 2022-11-04T16:30:07.616Z · comments (0)
next page (older posts) →