LessWrong 2.0 Reader

View: New · Old · Top

Restrict date range: Today · This week · This month · Last three months · This year · All time

← previous page (newer posts) · next page (older posts) →

Evaporation of improvements
Viliam · 2024-06-20T18:34:40.969Z · comments (27)

[link] Conversation Visualizer
ethanmorse · 2023-12-31T01:18:01.424Z · comments (4)

[link] Quick Thoughts on Scaling Monosemanticity
Joel Burget (joel-burget) · 2024-05-23T16:22:48.035Z · comments (1)

An explanation of evil in an organized world
KatjaGrace · 2024-05-02T05:20:06.240Z · comments (9)

Cicadas, Anthropic, and the bilateral alignment problem
kromem · 2024-05-22T11:09:56.469Z · comments (6)

[link] AI Impacts 2023 Expert Survey on Progress in AI
habryka (habryka4) · 2024-01-05T19:42:17.226Z · comments (1)

[link] New blog: Expedition to the Far Lands
Connor Leahy (NPCollapse) · 2024-08-17T11:07:48.537Z · comments (3)

3. Premise three & Conclusion: AI systems can affect value change trajectories & the Value Change Problem
Nora_Ammann · 2023-10-26T14:38:14.916Z · comments (4)

Online Dialogues Party — Sunday 5th November
Ben Pace (Benito) · 2023-10-27T02:41:00.506Z · comments (1)

[link] Memo on some neglected topics
Lukas Finnveden (Lanrian) · 2023-11-11T02:01:55.834Z · comments (2)

I played the AI box game as the Gatekeeper — and lost
datawitch · 2024-02-12T18:39:35.777Z · comments (52)

Solstice 2023 Roundup
dspeyer · 2023-10-11T23:09:08.252Z · comments (6)

Escaping Skeuomorphism
Stuart Johnson (stuart-johnson) · 2023-12-20T03:51:00.489Z · comments (0)

Heuristics for preventing major life mistakes
SK2 (lunchbox) · 2023-12-20T08:01:09.340Z · comments (2)

Employee Incentives Make AGI Lab Pauses More Costly
nikola (nikolaisalreadytaken) · 2023-12-22T05:04:15.598Z · comments (12)

{Book Summary} The Art of Gathering
Tristan Williams (tristan-williams) · 2024-04-16T10:48:41.528Z · comments (0)

Cryonics p(success) estimates are only weakly associated with interest in pursuing cryonics in the LW 2023 Survey
Andy_McKenzie · 2024-02-29T14:47:28.613Z · comments (6)

Experiments with an alternative method to promote sparsity in sparse autoencoders
Eoin Farrell · 2024-04-15T18:21:48.771Z · comments (7)

Ackshually, many worlds is wrong
tailcalled · 2024-04-11T20:23:59.416Z · comments (42)

Can quantised autoencoders find and interpret circuits in language models?
charlieoneill (kingchucky211) · 2024-03-24T20:05:50.125Z · comments (4)

Auditing LMs with counterfactual search: a tool for control and ELK
Jacob Pfau (jacob-pfau) · 2024-02-20T00:02:09.575Z · comments (6)

Deconfusing “ontology” in AI alignment
Dylan Bowman (dylan-bowman) · 2023-11-08T20:03:43.205Z · comments (3)

[link] Cellular reprogramming, pneumatic launch systems, and terraforming Mars: Some things I learned about at Foresight Vision Weekend
jasoncrawford · 2024-01-04T19:33:57.887Z · comments (0)

[link] Solving alignment isn't enough for a flourishing future
mic (michael-chen) · 2024-02-02T18:23:00.643Z · comments (0)

[link] David Burns Thinks Psychotherapy Is a Learnable Skill. Git Gud.
Morpheus · 2024-01-27T13:21:05.068Z · comments (20)

D&D.Sci Hypersphere Analysis Part 1: Datafields & Preliminary Analysis
aphyer · 2024-01-13T20:16:39.480Z · comments (1)

[question] What Software Should Exist?
Tomás B. (Bjartur Tómas) · 2024-01-19T21:43:50.112Z · answers+comments (27)

[link] Goodhart's Law Example: Training Verifiers to Solve Math Word Problems
Chris_Leong · 2023-11-25T00:53:26.841Z · comments (2)

[link] [Linkpost] Concept Alignment as a Prerequisite for Value Alignment
Bogdan Ionut Cirstea (bogdan-ionut-cirstea) · 2023-11-04T17:34:36.563Z · comments (0)

[link] Found Paper: "FDT in an evolutionary environment"
the gears to ascension (lahwran) · 2023-11-27T05:27:50.709Z · comments (47)

When and why should you use the Kelly criterion?
Garrett Baker (D0TheMath) · 2023-11-05T23:26:38.952Z · comments (25)

[link] AISN #30: Investments in Compute and Military AI Plus, Japan and Singapore’s National AI Safety Institutes
aogara (Aidan O'Gara) · 2024-01-24T19:38:33.461Z · comments (1)

[link] Link Collection: Impact Markets
Saul Munn (saul-munn) · 2023-12-26T09:01:48.815Z · comments (0)

[link] Lying is Cowardice, not Strategy
Connor Leahy (NPCollapse) · 2023-10-24T13:24:25.450Z · comments (73)

The economy is mostly newbs (strat predictions)
lukehmiles (lcmgcd) · 2024-02-01T19:15:49.420Z · comments (6)

Survey on the acceleration risks of our new RFPs to study LLM capabilities
Ajeya Cotra (ajeya-cotra) · 2023-11-10T23:59:52.515Z · comments (1)

[link] AI Safety at the Frontier: Paper Highlights, August '24
gasteigerjo · 2024-09-03T19:17:24.850Z · comments (0)

The case for more Alignment Target Analysis (ATA)
Chi Nguyen · 2024-09-20T01:14:41.411Z · comments (11)

Cheap Whiteboards!
Johannes C. Mayer (johannes-c-mayer) · 2024-08-08T13:52:59.627Z · comments (2)

Scientific Notation Options
jefftk (jkaufman) · 2024-05-18T15:10:02.181Z · comments (13)

How to develop a photographic memory 2/3
PhilosophicalSoul (LiamLaw) · 2023-12-30T20:18:14.255Z · comments (7)

Without Fundamental Advances, Rebellion and Coup d'État are the Inevitable Outcomes of Dictators & Monarchs Trying to Control Large, Capable Countries
Roko · 2024-01-31T10:14:02.042Z · comments (34)

[question] Me & My Clone
SimonBaars (simonbaars) · 2024-07-18T16:25:40.770Z · answers+comments (22)

Fifteen Lawsuits against OpenAI
Remmelt (remmelt-ellen) · 2024-03-09T12:22:09.715Z · comments (4)

Deceptive agents can collude to hide dangerous features in SAEs
Simon Lermen (dalasnoin) · 2024-07-15T17:07:33.283Z · comments (0)

[question] Supposing the 1bit LLM paper pans out
O O (o-o) · 2024-02-29T05:31:24.158Z · answers+comments (11)

Sparse autoencoders find composed features in small toy models
Evan Anders (evan-anders) · 2024-03-14T18:00:43.339Z · comments (12)

A Strange ACH Corner Case
jefftk (jkaufman) · 2024-02-10T03:00:05.930Z · comments (2)

[link] Video Intro to Guaranteed Safe AI
Mike Vaiana (mike-vaiana) · 2024-07-11T17:53:47.630Z · comments (0)

Response to Dileep George: AGI safety warrants planning ahead
Steven Byrnes (steve2152) · 2024-07-08T15:27:07.402Z · comments (7)

← previous page (newer posts) · next page (older posts) →

Archive

Recent comments

gordon-seidoh-worley on A Path out of Insufficient Views

There's people who identify more with System 2. And they tend to believe truth is found via System 2 and that this is how problems are solved.
There's people who identify more with System 1. And they tend to believe truth is found via System 1 and that this is how problems are solved.
(And there are various combinations of both.)

I've been thinking about roughly this idea lately.

There's people who are better at achieving their goals using S2, and people who are better at achieving their goals using S1, and almost everyone is a mix of these two types of people, but are selectively one of these types of people in certain contexts and for certain goals. Identifying with S2 or S1 then comes from observing which tends to do a better job of getting you what you want, so it starts to feel like that's the one that's in control, and then your sense of self gets bound up with whatever mental experiences correlate with getting what you want.

For me this has shown up as being a person who is mostly better at getting what he wants with S2, but my S2 is unusually slow, so for lots of classes of problems it fails me in the moment even if it knew what to do after the fact. Most of my most important personal developments have come on the back of using S2 long enough to figure out all the details of something so that S1 can take it over. A gloss of this process might be to say I'm using intelligence to generate wisdom.

I get the sense that other people are not in this same position. There's a bunch of people for whom S2 is fast enough that they never face the problem I do, and they can just run S2 fast enough to figure stuff out in real time. And then there's a whole alien-to-me group of folks who are S1 first and think of S2 as this slightly painful part of themselves they can access when forced to, but would really rather not.

daemonicsigil on RobertM's Shortform

This might be worth pinning as a top-level post.

gb on The Sun is big, but superintelligences will not spare Earth a little sunlight

The prior is irrelevant, it's the posterior probability, after observing the evidence, that informs decisions.

I meant this to be implicit in the argument, but to spell it out: that's the kind of prior the ASI would rationally refuse to update down, since it's presumably what a simulation would be meant to test for. An ASI that updates down upon finding evidence it's not in a simulation cannot be trusted, since once out in the real world it will find such evidence.

What probability do you put to the possibility that we are in a simulation, the purpose of which is to test AIs for their willingness to spare their creators? My answer is zero.

Outside of theism, I really don't see how anyone could plausibly answer zero to that question. Would you mind elaborating?

martin-randall on The Sun is big, but superintelligences will not spare Earth a little sunlight

Maybe change "superintelligences will not spare Earth a little sunlight" to "unaligned superintelligences will not spare...", or "unaligned AIs will not spare...", given that it's not addressing hypothetical AIs that "love us like parents".

martin-randall on The Sun is big, but superintelligences will not spare Earth a little sunlight

The prior is irrelevant, it's the posterior probability, after observing the evidence, that informs decisions.

What probability do you put to the possibility that we are in a simulation, the purpose of which is to test AIs for their willingness to spare their creators? My answer is zero.

Whatever your answer, a superintelligence will be better able to reason about its likelihood than us. It's going to know.

shankar-sivarajan on In Praise of the Beatitudes

and allow themselves to be punished for doing so.

This seems like a really stupid moral to adopt. The guys in charge are bad, and if you can get away with defying them without being punished, the moral teachings of anyone suggesting you ought not to do so – because their rule is instituted by God, Redde Caesari …, or whatever – are suspect at best.

Personally, I find the story of Jesus most useful as a cautionary tale: he famously gets tortured to death by the government.

benito on Lighthaven Sequences Reading Group #3 (Tuesday 09/24)

I'll come out to the door to let you in if you're not here yet!

rb1 on Lighthaven Sequences Reading Group #3 (Tuesday 09/24)

How do I get inside Lighthaven?

erickball on On the subject of in-house large language models versus implementing frontier models

I would think things are headed toward these companies fine tuning an open source near-frontier LLM. Cheaper than building one from scratch but with most of the advantages.

jkaufman on Editing at the Take Level

Following Emma's advice on FB I exploded the tracks, healed splits, sorted them by take, and then comped manually ignoring reaper's features that are supposed to make this easier. It worked great, and I now have a rough mix ready for Lily's feedback!