LessWrong 2.0 Reader

View: New · Old · Top

Restrict date range: Today · This week · This month · Last three months · This year · All time

← previous page (newer posts) · next page (older posts) →

[link] Chinese scientists acknowledge xrisk & call for international regulatory body [Linkpost]
Akash (akash-wasil) · 2023-11-01T13:28:43.723Z · comments (4)

ProLU: A Nonlinearity for Sparse Autoencoders
Glen Taggart · 2024-04-23T14:09:21.592Z · comments (4)

Commonsense Good, Creative Good
jefftk (jkaufman) · 2023-09-27T19:50:07.486Z · comments (11)

Userscript to always show LW comments in context vs at the top
Vlad Sitalo (harcisis) · 2023-11-21T17:53:30.418Z · comments (8)

When Does Altruism Strengthen Altruism?
jefftk (jkaufman) · 2024-01-21T18:50:05.424Z · comments (2)

Auditing failures vs concentrated failures
ryan_greenblatt · 2023-12-11T02:47:35.703Z · comments (0)

AI #38: Let’s Make a Deal
Zvi · 2023-11-16T19:50:05.442Z · comments (2)

Truthseeking, EA, Simulacra levels, and other stuff
Elizabeth (pktechgirl) · 2023-10-27T23:56:49.198Z · comments (12)

Apply to LASR Labs: a London-based technical AI safety research programme
Erin Robertson · 2024-04-09T17:34:06.847Z · comments (1)

Back to Basics: Truth is Unitary
lsusr · 2024-03-29T21:10:33.399Z · comments (13)

On the Contrary, Steelmanning Is Normal; ITT-Passing Is Niche
Zack_M_Davis · 2024-01-09T23:12:20.349Z · comments (31)

On Trust
johnswentworth · 2023-12-06T19:19:07.680Z · comments (25)

What does davidad want from «boundaries»?
Chipmonk · 2024-02-06T17:45:42.348Z · comments (1)

[link] LLM Evaluators Recognize and Favor Their Own Generations
Arjun Panickssery (arjun-panickssery) · 2024-04-17T21:09:12.007Z · comments (1)

An Introduction to AI Sandbagging
Teun van der Weij (teun-van-der-weij) · 2024-04-26T13:40:00.126Z · comments (8)

Childhood and Education Roundup #4
Zvi · 2024-01-30T13:50:06.033Z · comments (10)

2023 LessWrong Community Census, Request for Comments
Screwtape · 2023-11-01T16:32:19.102Z · comments (37)

AXRP Episode 25 - Cooperative AI with Caspar Oesterheld
DanielFilan · 2023-10-03T21:50:07.552Z · comments (0)

[link] EPUBs of MIRI Blog Archives and selected LW Sequences
mesaoptimizer · 2023-10-26T14:17:11.538Z · comments (6)

The Case for Predictive Models
Rubi J. Hudson (Rubi) · 2024-04-03T18:22:20.243Z · comments (7)

My intellectual journey to (dis)solve the hard problem of consciousness
Charbel-Raphaël (charbel-raphael-segerie) · 2024-04-06T09:32:41.612Z · comments (41)

Job Listing: Managing Editor / Writer
Gretta Duleba (gretta-duleba) · 2024-02-21T23:41:26.818Z · comments (2)

Why does generalization work?
Martín Soto (martinsq) · 2024-02-20T17:51:10.424Z · comments (16)

The Next ChatGPT Moment: AI Avatars
kolmplex (luke-man) · 2024-01-05T20:14:10.074Z · comments (10)

[link] Non-alignment project ideas for making transformative AI go well
Lukas Finnveden (Lanrian) · 2024-01-04T07:23:13.658Z · comments (1)

[link] An EPUB of Arbital's AI Alignment section
mesaoptimizer · 2023-10-16T19:36:29.109Z · comments (1)

Incidental polysemanticity
Victor Lecomte (victor-lecomte) · 2023-11-15T04:00:00.000Z · comments (7)

Which LessWrong/Alignment topics would you like to be tutored in? [Poll]
Ruby · 2024-09-19T01:35:02.999Z · comments (11)

[question] "Deception Genre" What Books are like Project Lawful?
Double · 2024-08-28T17:19:52.172Z · answers+comments (20)

Interoperable High Level Structures: Early Thoughts on Adjectives
johnswentworth · 2024-08-22T21:12:38.223Z · comments (1)

Understanding Positional Features in Layer 0 SAEs
bilalchughtai (beelal) · 2024-07-29T09:36:40.701Z · comments (0)

New Executive Team & Board — PIBBSS
Nora_Ammann · 2024-07-01T19:30:45.261Z · comments (1)

Locating My Eyes (Part 3 of "The Sense of Physical Necessity")
LoganStrohl (BrienneYudkowsky) · 2024-02-29T03:09:25.810Z · comments (4)

[question] Does reducing the amount of RL for a given capability level make AI safer?
Chris_Leong · 2024-05-05T17:04:01.799Z · answers+comments (22)

[link] rapid growth
Chipmonk · 2024-06-05T00:43:51.501Z · comments (0)

Sci-Fi books micro-reviews
Yair Halberstadt (yair-halberstadt) · 2024-06-24T09:49:28.523Z · comments (27)

[link] Why Georgism Lost Its Popularity
Zero Contradictions · 2024-07-20T15:08:41.469Z · comments (50)

[link] How bad is chlorinated water?
bhauth · 2023-12-13T18:00:12.640Z · comments (18)

The need for multi-agent experiments
Martín Soto (martinsq) · 2024-08-01T17:14:16.590Z · comments (3)

[Valence series] 4. Valence & Liking / Admiring
Steven Byrnes (steve2152) · 2024-06-10T14:19:51.194Z · comments (12)

[question] Where is the Town Square?
Gretta Duleba (gretta-duleba) · 2024-02-13T03:53:18.205Z · answers+comments (8)

Humanity isn't remotely longtermist, so arguments for AGI x-risk should focus on the near term
Seth Herd · 2024-08-12T18:10:56.543Z · comments (10)

[link] Project ideas: Epistemics
Lukas Finnveden (Lanrian) · 2024-01-05T23:41:23.721Z · comments (4)

Trust as a bottleneck to growing teams quickly
benkuhn · 2024-07-13T18:00:04.579Z · comments (3)

Taking responsibility and partial derivatives
Ruby · 2023-12-31T04:33:51.419Z · comments (1)

D&D.Sci Long War: Defender of Data-mocracy Evaluation & Ruleset
aphyer · 2024-05-14T03:35:10.586Z · comments (3)

When fine-tuning fails to elicit GPT-3.5's chess abilities
Theodore Chapman · 2024-06-14T18:50:52.855Z · comments (3)

[link] cold aluminum for medicine
bhauth · 2023-12-16T14:38:03.260Z · comments (4)

[link] Soviet comedy film recommendations
Nina Panickssery (NinaR) · 2024-06-09T23:40:58.536Z · comments (11)

[link] Rowing vs steering
Saul Munn (saul-munn) · 2024-08-10T07:00:17.594Z · comments (2)

← previous page (newer posts) · next page (older posts) →

Archive

Recent comments

cata on shminux's Shortform

Superficially, human minds look like they are way too diverse for that to cause human extinction by accident. If new ideas toast some specific human subgroup, other subgroups will not be equally affected.

wei-dai on Why Does Power Corrupt?

Silicon Valley seems to allocate power to a systematically different group of people than would acquire it through democratic elections or the internal politics of the PRC.

This may have jinxed SV and caused the rise of Sam Altman. (Shades of Mao Zedong and his Cultural Revolution.)

stephen-fowler on Why is o1 so deceptive?

"But since it is is at least somewhat intelligent/predictive, it can make the move of "acausal collusion" with its own tendency to hallucinate, in generating its "chain"-of-"thought"."

I am not understanding what this sentence is trying to say. I understand what an acausal trade is. Could you phrase it more directly?

I cannot see why you require the step that the model needs to be reasoning acausally for it to develop a strategy of deceptively hallucinating citations.

What concrete predictions does the model in which this is an example of "acausal collusion" make?

ruby on Ruby's Quick Takes

Added!

ruby on Ruby's Quick Takes

Added!

ruby on Ruby's Quick Takes

Money helps. I could probably buy a lot of dignity points for a billion dollars. With a trillion variance definitely goes up because you could try crazy stuff and could backfire. (I mean true for a billion too). But EV of such a world is better.

I don't think there's anything that's as simple as writing a check though.

US Congress gives money to specific things. I do not have a specific plan for a trillion dollars.

I'd bet against Terrance Tao being some kind of amazing breakthrough researcher who changes the playing field.

raemon on Ruby's Quick Takes

My answer (and I think Ruby's) answer to most of these questions is "no", but What Money Cannot Buy [LW · GW] reasons, as well as "geniuses don't often actually generalize and are hard to motivate with money."

faul_sname on RussellThor's Shortform

It uses whatever means to eliminate other countries nuclear arsenals

I think the part where other nations just roll with this is underexplained.

shankar-sivarajan on "Slow" takeoff is a terrible term for "maybe even faster takeoff, actually"

Gradual/Abrupt

lorec on Thoughts on Evo-Bio Math and Mesa-Optimization: Maybe We Need To Think Harder About "Relative" Fitness?

Which is?