LessWrong 2.0 Reader

View: New · Old · Top

Restrict date range: Today · This week · This month · Last three months · This year · All time

← previous page (newer posts) · next page (older posts) →

[question] Is AlphaGo actually a consequentialist utility maximizer?
faul_sname · 2023-12-07T12:41:05.132Z · answers+comments (8)

[link] Genocide isn't Decolonization
robotelvis · 2023-10-20T04:14:07.716Z · comments (19)

Disentangling four motivations for acting in accordance with UDT
Julian Stastny · 2023-11-05T21:26:22.514Z · comments (3)

Helpful examples to get a sense of modern automated manipulation
trevor (TrevorWiesinger) · 2023-11-12T20:49:57.422Z · comments (3)

We have promising alignment plans with low taxes
Seth Herd · 2023-11-10T18:51:38.604Z · comments (9)

[link] The Hidden Complexity of Wishes - The Animation
Writer · 2023-09-27T17:59:37.188Z · comments (0)

Regrant up to $600,000 to AI safety projects with GiveWiki
Dawn Drescher (Telofy) · 2023-10-28T19:56:06.676Z · comments (1)

5. Moral Value for Sentient Animals? Alas, Not Yet
RogerDearnaley (roger-d-1) · 2023-12-27T06:42:09.130Z · comments (41)

[question] Feedback request: what am I missing?
Nathan Helm-Burger (nathan-helm-burger) · 2024-11-02T17:38:39.625Z · answers+comments (5)

[link] Twitter thread on open-source AI
Richard_Ngo (ricraz) · 2024-07-31T00:26:11.655Z · comments (6)

How good are LLMs at doing ML on an unknown dataset?
Håvard Tveit Ihle (havard-tveit-ihle) · 2024-07-01T09:04:03.687Z · comments (4)

Mech Interp Lacks Good Paradigms
Daniel Tan (dtch1997) · 2024-07-16T15:47:32.171Z · comments (0)

Monthly Roundup #16: March 2024
Zvi · 2024-03-19T13:10:05.529Z · comments (4)

Experimentation (Part 7 of "The Sense Of Physical Necessity")
LoganStrohl (BrienneYudkowsky) · 2024-03-18T21:25:56.527Z · comments (0)

More on the Apple Vision Pro
Zvi · 2024-02-13T17:40:05.388Z · comments (5)

LLMs can strategically deceive while doing gain-of-function research
Igor Ivanov (igor-ivanov) · 2024-01-24T15:45:08.795Z · comments (4)

Effectively Handling Disagreements - Introducing a New Workshop
Camille Berger (Camille Berger) · 2024-04-15T16:33:50.339Z · comments (2)

Introducing REBUS: A Robust Evaluation Benchmark of Understanding Symbols
Arjun Panickssery (arjun-panickssery) · 2024-01-15T21:21:03.962Z · comments (0)

Proveably Safe Self Driving Cars [Modulo Assumptions]
Davidmanheim · 2024-09-15T13:58:19.472Z · comments (26)

Basics of Handling Disagreements with People
Camille Berger (Camille Berger) · 2024-11-12T17:55:08.143Z · comments (4)

DIY LessWrong Jewelry
Fluffnutt (Pear) · 2024-08-25T21:33:56.173Z · comments (0)

UDT1.01: Logical Inductors and Implicit Beliefs (5/10)
Diffractor · 2024-04-18T08:39:13.368Z · comments (2)

My disagreements with "AGI ruin: A List of Lethalities"
Noosphere89 (sharmake-farah) · 2024-09-15T17:22:18.367Z · comments (46)

ChatGPT 4 solved all the gotcha problems I posed that tripped ChatGPT 3.5
VipulNaik · 2023-11-29T18:11:53.252Z · comments (16)

[link] Information dark matter
Logan Kieller (logan-kieller) · 2024-10-01T15:05:41.159Z · comments (4)

Important open problems in voting
Closed Limelike Curves · 2024-07-01T02:53:44.690Z · comments (1)

Investigating the Ability of LLMs to Recognize Their Own Writing
Christopher Ackerman (christopher-ackerman) · 2024-07-30T15:41:44.017Z · comments (0)

[link] End Single Family Zoning by Overturning Euclid V Ambler
Maxwell Tabarrok (maxwell-tabarrok) · 2024-07-26T14:08:45.046Z · comments (1)

A path to human autonomy
Nathan Helm-Burger (nathan-helm-burger) · 2024-10-29T03:02:42.475Z · comments (12)

Learning Math in Time for Alignment
Nicholas / Heather Kross (NicholasKross) · 2024-01-09T01:02:37.446Z · comments (3)

[link] A computational complexity argument for many worlds
jessicata (jessica.liu.taylor) · 2024-08-13T19:35:10.116Z · comments (15)

Housing Roundup #10
Zvi · 2024-10-29T13:50:09.416Z · comments (2)

In Defense of Lawyers Playing Their Part
Isaac King (KingSupernova) · 2024-07-01T01:32:58.695Z · comments (9)

[link] NAO Updates, Fall 2024
jefftk (jkaufman) · 2024-10-18T00:00:04.142Z · comments (2)

[link] Concrete benefits of making predictions
Jonny Spicer (jonnyspicer) · 2024-10-17T14:23:17.613Z · comments (5)

DunCon @Lighthaven
Duncan Sabien (Deactivated) (Duncan_Sabien) · 2024-09-29T04:56:27.205Z · comments (0)

An argument that consequentialism is incomplete
cousin_it · 2024-10-07T09:45:12.754Z · comments (27)

Is suffering like shit?
KatjaGrace · 2024-05-31T01:20:03.855Z · comments (5)

Comparing Quantized Performance in Llama Models
NickyP (Nicky) · 2024-07-15T16:01:24.960Z · comments (2)

Being against involuntary death and being open to change are compatible
Andy_McKenzie · 2024-05-27T06:37:27.644Z · comments (5)

Monthly Roundup #13: December 2023
Zvi · 2023-12-19T15:10:08.293Z · comments (5)

A quick experiment on LMs’ inductive biases in performing search
Alex Mallen (alex-mallen) · 2024-04-14T03:41:08.671Z · comments (2)

5 ways to improve CoT faithfulness
CBiddulph (caleb-biddulph) · 2024-10-05T20:17:12.637Z · comments (29)

[question] How unusual is the fact that there is no AI monopoly?
Viliam · 2024-08-16T20:21:51.012Z · answers+comments (15)

[link] the subreddit size threshold
bhauth · 2024-01-23T00:38:13.747Z · comments (3)

How I build and run behavioral interviews
benkuhn · 2024-02-26T05:50:05.328Z · comments (6)

Video and transcript of presentation on Scheming AIs
Joe Carlsmith (joekc) · 2024-03-22T15:52:03.311Z · comments (1)

RLHF is the worst possible thing done when facing the alignment problem
tailcalled · 2024-09-19T18:56:27.676Z · comments (10)

Intent alignment as a stepping-stone to value alignment
Seth Herd · 2024-11-05T20:43:24.950Z · comments (4)

If you are also the worst at politics
lukehmiles (lcmgcd) · 2024-05-26T20:07:49.201Z · comments (8)

← previous page (newer posts) · next page (older posts) →

The standard for the reader should still be "don't be sure the references support this unless you check them", which actually means that when the reader becomes a writer, that reader/writer should actually not only have checked their own references, but also checked the references of their references, before publishing anything. ↩︎
Perhaps excusable since nobody actually knows how to make the LLM get it right reliably. ↩︎

LessWrong 2.0 Reader

Archive

Recent comments