LessWrong 2.0 Reader

What is it to solve the alignment problem?
Joe Carlsmith (joekc) · 2024-08-24T21:19:34.280Z · comments (17)

What is "True Love"?
johnswentworth · 2024-08-18T16:05:47.358Z · comments (9)

Interdictor Ship
lsusr · 2024-08-19T04:59:18.487Z · comments (9)

Base LLMs refuse too
Connor Kissane (ckkissane) · 2024-09-29T16:04:21.343Z · comments (20)

MATS Alumni Impact Analysis
utilistrutil · 2024-09-30T02:35:57.273Z · comments (6)

[Intuitive self-models] 4. Trance
Steven Byrnes (steve2152) · 2024-10-08T13:30:41.446Z · comments (6)

Pollsters Should Publish Question Translations
jefftk (jkaufman) · 2024-09-08T22:10:04.932Z · comments (3)

Self-explaining SAE features
Dmitrii Kharlapenko (dmitrii-kharlapenko) · 2024-08-05T22:20:36.041Z · comments (13)

AI #81: Alpha Proteo
Zvi · 2024-09-12T13:00:07.958Z · comments (3)

[link] on bacteria, on teeth
bhauth · 2024-09-30T15:56:56.830Z · comments (9)

Showing SAE Latents Are Not Atomic Using Meta-SAEs
Bart Bussmann (Stuckwork) · 2024-08-24T00:56:46.048Z · comments (9)

[link] AI, centralization, and the One Ring
owencb · 2024-09-13T14:00:16.126Z · comments (11)

[link] Announcing the $200k EA Community Choice
Austin Chen (austin-chen) · 2024-08-14T00:39:37.350Z · comments (8)

Mira Murati leaves OpenAI/ OpenAI to remove non-profit control
Sodium · 2024-09-25T21:15:17.315Z · comments (4)

How you can help pass important AI legislation with 10 minutes of effort
ThomasW · 2024-09-14T22:10:50.386Z · comments (2)

John Schulman leaves OpenAI for Anthropic
Sodium · 2024-08-06T01:23:15.427Z · comments (0)

Referendum Mechanics in a Marketplace of Ideas
Martin Sustrik (sustrik) · 2024-08-25T08:30:01.901Z · comments (2)

The Bitter Lesson for AI Safety Research
adamk · 2024-08-02T18:39:36.884Z · comments (5)

On the UBI Paper
Zvi · 2024-09-03T14:50:08.647Z · comments (6)

[link] Dario Amodei — Machines of Loving Grace
Matrice Jacobine · 2024-10-11T21:43:31.448Z · comments (20)

Rationalists are missing a core piece for agent-like structure (energy vs information overload)
tailcalled · 2024-08-17T09:57:19.370Z · comments (9)

[link] Congressional Insider Trading
Maxwell Tabarrok (maxwell-tabarrok) · 2024-08-30T13:32:57.264Z · comments (6)

AI Alignment via Slow Substrates: Early Empirical Results With StarCraft II
Lester Leong (lester-leong) · 2024-10-14T04:05:05.096Z · comments (9)

... Wait, our models of semantics should inform fluid mechanics?!?
johnswentworth · 2024-08-26T16:38:53.924Z · comments (18)

Evidence against Learned Search in a Chess-Playing Neural Network
p.b. · 2024-09-13T11:59:55.634Z · comments (3)

Some Unorthodox Ways To Achieve High GDP Growth
johnswentworth · 2024-08-08T18:58:56.046Z · comments (6)

[link] Pay-on-results personal growth: first success
Chipmonk · 2024-09-14T03:39:12.975Z · comments (2)

AI #84: Better Than a Podcast
Zvi · 2024-10-03T15:00:07.128Z · comments (7)

Measuring Structure Development in Algorithmic Transformers
Micurie (micurie) · 2024-08-22T08:38:02.140Z · comments (4)

Secret Collusion: Will We Know When to Unplug AI?
schroederdewitt · 2024-09-16T16:07:01.119Z · comments (7)

[link] Demis Hassabis — Google DeepMind: The Podcast
Zach Stein-Perlman · 2024-08-16T00:00:04.712Z · comments (8)

[link] Making Eggs Without Ovaries
Niko_McCarty (niko-2) · 2024-09-22T17:44:46.733Z · comments (3)

How the AI safety technical landscape has changed in the last year, according to some practitioners
tlevin (trevor) · 2024-07-26T19:06:47.126Z · comments (6)

[link] How much I'm paying for AI productivity software (and the future of AI use)
jacquesthibs (jacques-thibodeau) · 2024-10-11T17:11:27.025Z · comments (15)

A Path out of Insufficient Views
Unreal · 2024-09-24T20:00:27.332Z · comments (46)

Owain Evans on Situational Awareness and Out-of-Context Reasoning in LLMs
Michaël Trazzi (mtrazzi) · 2024-08-24T04:30:11.807Z · comments (0)

Thiel on AI & Racing with China
Ben Pace (Benito) · 2024-08-20T03:19:18.966Z · comments (10)

[link] Unlocking Solutions—By Understanding Coordination Problems
James Stephen Brown (james-brown) · 2024-07-27T04:52:13.435Z · comments (4)

[link] On the Role of Proto-Languages
adamShimi · 2024-09-22T16:50:34.720Z · comments (1)

Against empathy-by-default
Steven Byrnes (steve2152) · 2024-10-16T16:38:49.926Z · comments (12)

Safe Predictive Agents with Joint Scoring Rules
Rubi J. Hudson (Rubi) · 2024-10-09T16:38:16.535Z · comments (10)

AI #76: Six Shorts Stories About OpenAI
Zvi · 2024-08-08T13:50:04.659Z · comments (10)

Calendar feature geometry in GPT-2 layer 8 residual stream SAEs
Patrick Leask (patrickleask) · 2024-08-17T01:16:53.764Z · comments (0)

The Geometry of Feelings and Nonsense in Large Language Models
7vik (satvik-golechha) · 2024-09-27T17:49:27.420Z · comments (10)

Parental Writing Selection Bias
jefftk (jkaufman) · 2024-10-13T14:00:03.225Z · comments (3)

Model evals for dangerous capabilities
Zach Stein-Perlman · 2024-09-23T11:00:00.866Z · comments (9)

Provably Safe AI: Worldview and Projects
bgold · 2024-08-09T23:21:02.763Z · comments (43)

Reformative Hypocrisy, and Paying Close Enough Attention to Selectively Reward It.
Andrew_Critch · 2024-09-11T04:41:24.872Z · comments (7)

How to Give in to Threats (without incentivizing them)
Mikhail Samin (mikhail-samin) · 2024-09-12T15:55:50.384Z · comments (25)

Rewilding the Gut VS the Autoimmune Epidemic
GGD · 2024-08-16T18:00:46.239Z · comments (0)

Recent comments

christiankl on Open Thread Fall 2024

If you search for "Less Wrong Census" you will find the existing surveys of the LessWrong readership.

purplehermann on "Slow" takeoff is a terrible term for "maybe even faster takeoff, actually"

Something like iterative/cliff, with fast and slow expressing time scales

nathan-helm-burger on The Hopium Wars: the AGI Entente Delusion

Dear Max, if you would like more confirmation of the immediacy and likely trajectory of the biorisk from AI, please have a private chat with Kevin Esvelt, who is also at MIT. I speak with such concern about biorisk from AI because I've been helping his new AI Biorisk Eval team at SecureBio for the past year. Things are seeming pretty scary on that front.

adam-b on Concrete benefits of making predictions

I think it's still very useful to be able to predict your own behaviour (including in the case where you know you've made a prediction about it).

Things can get weird if you care more about the outcome of the prediction than the outcome of the event in itself, but this should rarely be the case - and is worth avoiding, I think.

purplehermann on "Slow" takeoff is a terrible term for "maybe even faster takeoff, actually"

Can you sort the poll options by popularity?

purplehermann on "Slow" takeoff is a terrible term for "maybe even faster takeoff, actually"

Iterative/Sudden

purplehermann on Overview of strong human intelligence amplification methods

I can only describe the product, not the tech. The idea would be to plug in a bigger working memory in the area of the brain currently holding working memory. This is the piece I think matters most.

On reflection, something like Wolfram Alpha should be enough for calculations, and a well-indexed reservoir of knowledge with an LLM pulling up relevant links with summaries should be good enough for the rest.

purplehermann on Species as Canonical Referents of Super-Organisms

Inside the super organism you are correct, but the genome is influenced by outside forces as a whole over the ages - and any place where this breaks down for long enough, you eventually get two species instead of one.

Therefore outside groups can treat the species as a super organism in general, but individual members must be dealt with individually when there is a previous loyalty to a member of the other species.

For example, an Englishman and his dog vs an Eskimo and his dog. The two humans may be against each other, the dogs may be against each other, but the opposite human/dog interactions would be standard if they weren't already attached to other in-species members.

purplehermann on Species as Canonical Referents of Super-Organisms

This gives the bones of a proper theoretical foundation on the moral duties between members of different species.

For example, this would back the intuition of eating dog being worse than eating a bear or octopus, regardless of intelligence, and of killing rats out of hand.

purplehermann on Isaac King's Shortform

They're not identical. First, they have a different status, much the same as citizens and aliens have different rights. Second, different species of animals have different relationships with humanity: Dogs are bred to be symbiotic companions. Cats are parasites if allowed, pest control if tolerated. Rats are disease vector scavengers. Chickens are livestock - they lay infertile eggs for human consumption!