LessWrong 2.0 Reader

Are we dropping the ball on Recommendation AIs?
Charbel-Raphaël (charbel-raphael-segerie) · 2024-10-23T17:48:00.000Z · comments (17)

How to use bright light to improve your life.
Nat Martin (nat-martin) · 2024-11-18T19:32:10.667Z · comments (10)

[question] Implications of China's recession on AGI development?
Eric Neyman (UnexpectedValues) · 2024-09-28T01:12:36.443Z · answers+comments (3)

instruction tuning and autoregressive distribution shift
nostalgebraist · 2024-09-05T16:53:41.497Z · comments (5)

Winners of the Essay competition on the Automation of Wisdom and Philosophy
owencb · 2024-10-28T17:10:04.272Z · comments (3)

ARENA 4.0 Impact Report
Chloe Li (chloe-li-1) · 2024-11-27T20:51:54.844Z · comments (2)

AXRP Episode 39 - Evan Hubinger on Model Organisms of Misalignment
DanielFilan · 2024-12-01T06:00:06.345Z · comments (0)

[question] Are You More Real If You're Really Forgetful?
Thane Ruthenis · 2024-11-24T19:30:55.233Z · answers+comments (24)

Anthropic rewrote its RSP
Zach Stein-Perlman · 2024-10-15T14:25:12.518Z · comments (19)

[link] FrontierMath: A Benchmark for Evaluating Advanced Mathematical Reasoning in AI
Tamay · 2024-11-14T06:13:22.042Z · comments (0)

Monthly Roundup #23: October 2024
Zvi · 2024-10-16T13:50:05.869Z · comments (13)

[link] College technical AI safety hackathon retrospective - Georgia Tech
yix (Yixiong Hao) · 2024-11-15T00:22:53.159Z · comments (2)

Signaling with Small Orange Diamonds
jefftk (jkaufman) · 2024-11-07T20:20:08.026Z · comments (1)

[link] Characterizing stable regions in the residual stream of LLMs
Jett Janiak (jett) · 2024-09-26T13:44:58.792Z · comments (4)

Book Review: On the Edge: The Business
Zvi · 2024-09-25T12:20:06.230Z · comments (0)

[link] An X-Ray is Worth 15 Features: Sparse Autoencoders for Interpretable Radiology Report Generation
hugofry · 2024-10-07T08:53:14.658Z · comments (0)

Compelling Villains and Coherent Values
Cole Wyeth (Amyr) · 2024-10-06T19:53:47.891Z · comments (4)

Glitch Token Catalog - (Almost) a Full Clear
Lao Mein (derpherpize) · 2024-09-21T12:22:16.403Z · comments (3)

0.202 Bits of Evidence In Favor of Futarchy
niplav · 2024-09-29T21:57:59.896Z · comments (0)

AI Safety Camp 10
Robert Kralisch (nonmali-1) · 2024-10-26T11:08:09.887Z · comments (9)

[link] AISafety.info: What is the "natural abstractions hypothesis"?
Algon · 2024-10-05T12:31:14.195Z · comments (2)

Open Source Replication of Anthropic’s Crosscoder paper for model-diffing
Connor Kissane (ckkissane) · 2024-10-27T18:46:21.316Z · comments (4)

[link] Generative ML in chemistry is bottlenecked by synthesis
Abhishaike Mahajan (abhishaike-mahajan) · 2024-09-16T16:31:34.801Z · comments (2)

Drug development costs can range over two orders of magnitude
rossry · 2024-11-03T23:13:17.685Z · comments (0)

I'm creating a deep dive podcast episode about the original Leverage Research - would you like to take part?
spencerg · 2024-09-22T14:03:22.164Z · comments (2)

A New Class of Glitch Tokens - BPE Subtoken Artifacts (BSA)
Lao Mein (derpherpize) · 2024-09-20T13:13:26.181Z · comments (7)

LASR Labs Spring 2025 applications are open!
Erin Robertson · 2024-10-04T13:44:20.524Z · comments (0)

The murderous shortcut: a toy model of instrumental convergence
Thomas Kwa (thomas-kwa) · 2024-10-02T06:48:06.787Z · comments (0)

Exploring SAE features in LLMs with definition trees and token lists
mwatkins · 2024-10-04T22:15:28.108Z · comments (5)

Distinguish worst-case analysis from instrumental training-gaming
Olli Järviniemi (jarviniemi) · 2024-09-05T19:13:34.443Z · comments (0)

OODA your OODA Loop
Raemon · 2024-10-11T00:50:48.119Z · comments (3)

Book a Time to Chat about Interp Research
Logan Riggs (elriggs) · 2024-12-03T17:27:46.808Z · comments (2)

COT Scaling implies slower takeoff speeds
Logan Zoellner (logan-zoellner) · 2024-09-28T16:20:00.320Z · comments (56)

Eye contact is effortless when you’re no longer emotionally blocked on it
Chipmonk · 2024-09-27T21:47:01.970Z · comments (24)

[link] A Percentage Model of a Person
Sable · 2024-10-12T17:55:07.560Z · comments (3)

My disagreements with "AGI ruin: A List of Lethalities"
Noosphere89 (sharmake-farah) · 2024-09-15T17:22:18.367Z · comments (46)

Is the Power Grid Sustainable?
jefftk (jkaufman) · 2024-10-26T02:30:06.612Z · comments (38)

[link] Big tech transitions are slow (with implications for AI)
jasoncrawford · 2024-10-24T14:25:06.873Z · comments (16)

Doing Research Part-Time is Great
casualphysicsenjoyer (hatta_afiq) · 2024-11-22T19:01:15.542Z · comments (7)

[question] Feedback request: what am I missing?
Nathan Helm-Burger (nathan-helm-burger) · 2024-11-02T17:38:39.625Z · answers+comments (5)

Book Review: On the Edge: The Gamblers
Zvi · 2024-09-24T11:50:06.065Z · comments (1)

Searching for phenomenal consciousness in LLMs: Perceptual reality monitoring and introspective confidence
EuanMcLean (euanmclean) · 2024-10-29T12:16:18.448Z · comments (8)

[link] On Fables and Nuanced Charts
Niko_McCarty (niko-2) · 2024-09-08T17:09:07.503Z · comments (2)

Intent alignment as a stepping-stone to value alignment
Seth Herd · 2024-11-05T20:43:24.950Z · comments (4)

Cross-context abduction: LLMs make inferences about procedural training data leveraging declarative facts in earlier training data
Sohaib Imran (sohaib-imran) · 2024-11-16T23:22:21.857Z · comments (11)

Video and transcript of presentation on Otherness and control in the age of AGI
Joe Carlsmith (joekc) · 2024-10-08T22:30:38.054Z · comments (1)

Cognitive Biases Contributing to AI X-risk — a deleted excerpt from my 2018 ARCHES draft
Andrew_Critch · 2024-12-03T09:29:49.745Z · comments (2)

[link] Locally optimal psychology
Chipmonk · 2024-11-25T18:35:11.985Z · comments (7)

Monthly Roundup #22: September 2024
Zvi · 2024-09-17T12:20:08.297Z · comments (10)

(Maybe) A Bag of Heuristics is All There Is & A Bag of Heuristics is All You Need
Sodium · 2024-10-03T19:11:58.032Z · comments (17)

Recent comments

yair-halberstadt on Drexler's Nanotech Software

I would have concerns about suitably generic, flexible and sensitive humanoid robots, yes.

towards_keeperhood on Orca communication project - seeking feedback (and collaborators)

Currently we basically don't have any datasets where it's labelled which orca says what. When I listen to recordings, I cannot distinguish voices, though idk, it's possible that people who have listened a lot more can. I think just unsupervised voice clustering would probably not work very accurately. I'd guess it's probably possible to get data on who said what by using an array of hydrophones to infer the location of the sound, but we need very accurate position inference because different orcas are often just 1-10 m from each other. For this we might need to get or infer decent estimates of how water temperature varies by depth, and so far there have been no attempts to get high precision with this method. (It's definitely harder in water than in air.)

Yeah, basically I initially also had rough thoughts in this direction, but I think the create-and-teach language way is probably a lot faster.

I think the Earth Species Project is trying to use AI to decode animal communication, though they don't focus on orcas in particular but on many species, including e.g. beluga whales. I didn't look into it a lot, but it seems possible I could do something like this in a smarter and more promising way, though it would probably still take a long time.

romeostevensit on Which Biases are most important to Overcome?

Is-ought confabulation, means-ends confabulation, scope sensitivity, fundamental attribution error, attribute substitution, ambiguity aversion, reasoning from consequences

viliam on Linkpost: Rat Traps by Sheon Han in Asterisk Mag

Yeah, I am not even sure what the point of the article was. What is the thing we are supposed to update about? Writing in a different style, or changing our opinions (about what exactly?), or finding completely new topics to talk about so that we are not boring the article author, or...?

daniele-de-nuntiis on A Meritocracy of Taste

In my experience they're getting pretty good. It depends on the social network, but IG Reels or YT can keep me entertained with nothing-content for hours.

simon on Do simulacra dream of digital sheep?

The argument presented by Aaronson is that, since it would take as much computation to convert the rock/waterfall computation into a usable computation as it would to just do the usable computation directly, the rock/waterfall isn't really doing the computation.

I find this argument rather convincing, as we are talking about a possible internal property here, and not about the external relation with the rest of the world (which we already agree is rather useless).

self on [deleted]

(I've since found https://www.lesswrong.com/rationality, which does the job.)

davidmanheim on Do simulacra dream of digital sheep?

As with OP, I strongly recommend Aaronson, who explains why waterfalls aren't doing computation in ways that refute the rock example you discuss: https://www.scottaaronson.com/papers/philos.pdf

japancolorado on You are not too "irrational" to know your preferences.

I agree with that. I think that the general ick that I get from the dialogue is the presumption and general tone of Bryce. Thanks for clarifying!

davidmanheim on Do simulacra dream of digital sheep?

You seem to fundamentally misunderstand computation, in ways similar to Searle. I can't engage deeply, but recommend Scott Aaronson's primer on computational complexity: https://www.scottaaronson.com/papers/philos.pdf