LessWrong 2.0 Reader

View: New · Old · Top

Restrict date range: Today · This week · This month · Last three months · This year · All time

← previous page (newer posts) · next page (older posts) →

Please do not use AI to write for you
Richard_Kennaway · 2024-08-21T09:53:34.425Z · comments (34)

The Geometry of Feelings and Nonsense in Large Language Models
7vik (satvik-golechha) · 2024-09-27T17:49:27.420Z · comments (10)

[link] Announcing the $200k EA Community Choice
Austin Chen (austin-chen) · 2024-08-14T00:39:37.350Z · comments (8)

On the UBI Paper
Zvi · 2024-09-03T14:50:08.647Z · comments (6)

[link] Congressional Insider Trading
Maxwell Tabarrok (maxwell-tabarrok) · 2024-08-30T13:32:57.264Z · comments (6)

[link] The Alignment Trap: AI Safety as Path to Power
crispweed · 2024-10-29T15:21:26.545Z · comments (17)

Referendum Mechanics in a Marketplace of Ideas
Martin Sustrik (sustrik) · 2024-08-25T08:30:01.901Z · comments (2)

AI #84: Better Than a Podcast
Zvi · 2024-10-03T15:00:07.128Z · comments (7)

Measuring Structure Development in Algorithmic Transformers
Micurie (micurie) · 2024-08-22T08:38:02.140Z · comments (4)

[link] Making Eggs Without Ovaries
Niko_McCarty (niko-2) · 2024-09-22T17:44:46.733Z · comments (3)

Evidence against Learned Search in a Chess-Playing Neural Network
p.b. · 2024-09-13T11:59:55.634Z · comments (3)

[link] How much I'm paying for AI productivity software (and the future of AI use)
jacquesthibs (jacques-thibodeau) · 2024-10-11T17:11:27.025Z · comments (16)

... Wait, our models of semantics should inform fluid mechanics?!?
johnswentworth · 2024-08-26T16:38:53.924Z · comments (18)

Owain Evans on Situational Awareness and Out-of-Context Reasoning in LLMs
Michaël Trazzi (mtrazzi) · 2024-08-24T04:30:11.807Z · comments (0)

Secret Collusion: Will We Know When to Unplug AI?
schroederdewitt · 2024-09-16T16:07:01.119Z · comments (7)

A Path out of Insufficient Views
Unreal · 2024-09-24T20:00:27.332Z · comments (46)

Seeking Collaborators
abramdemski · 2024-11-01T17:13:36.162Z · comments (14)

[link] Demis Hassabis — Google DeepMind: The Podcast
Zach Stein-Perlman · 2024-08-16T00:00:04.712Z · comments (8)

Safe Predictive Agents with Joint Scoring Rules
Rubi J. Hudson (Rubi) · 2024-10-09T16:38:16.535Z · comments (10)

[question] Could orcas be (trained to be) smarter than humans? 
Towards_Keeperhood (Simon Skade) · 2024-11-04T23:29:26.677Z · answers+comments (11)

[link] How Likely Are Various Precursors of Existential Risk?
NunoSempere (Radamantis) · 2024-10-28T13:27:31.620Z · comments (4)

[Intuitive self-models] 5. Dissociative Identity (Multiple Personality) Disorder
Steven Byrnes (steve2152) · 2024-10-15T13:31:46.157Z · comments (7)

Thiel on AI & Racing with China
Ben Pace (Benito) · 2024-08-20T03:19:18.966Z · comments (10)

[link] On the Role of Proto-Languages
adamShimi · 2024-09-22T16:50:34.720Z · comments (1)

AI #87: Staying in Character
Zvi · 2024-10-29T07:10:08.212Z · comments (3)

Calendar feature geometry in GPT-2 layer 8 residual stream SAEs
Patrick Leask (patrickleask) · 2024-08-17T01:16:53.764Z · comments (0)

[link] The Mysterious Trump Buyers on Polymarket
Annapurna (jorge-velez) · 2024-10-18T13:26:25.565Z · comments (9)

Parental Writing Selection Bias
jefftk (jkaufman) · 2024-10-13T14:00:03.225Z · comments (3)

[link] Prices are Bounties
Maxwell Tabarrok (maxwell-tabarrok) · 2024-10-12T14:51:40.689Z · comments (13)

Rewilding the Gut VS the Autoimmune Epidemic
GGD · 2024-08-16T18:00:46.239Z · comments (0)

Model evals for dangerous capabilities
Zach Stein-Perlman · 2024-09-23T11:00:00.866Z · comments (9)

How to Give in to Threats (without incentivizing them)
Mikhail Samin (mikhail-samin) · 2024-09-12T15:55:50.384Z · comments (26)

Reformative Hypocrisy, and Paying Close Enough Attention to Selectively Reward It.
Andrew_Critch · 2024-09-11T04:41:24.872Z · comments (7)

Claude Sonnet 3.5.1 and Haiku 3.5
Zvi · 2024-10-24T14:50:06.286Z · comments (9)

How might we solve the alignment problem? (Part 1: Intro, summary, ontology)
Joe Carlsmith (joekc) · 2024-10-28T21:57:12.063Z · comments (5)

[link] Anthropic's updated Responsible Scaling Policy
Zac Hatfield-Dodds (zac-hatfield-dodds) · 2024-10-15T16:46:48.727Z · comments (3)

[link] Can AI Outpredict Humans? Results From Metaculus's Q3 AI Forecasting Benchmark
ChristianWilliams · 2024-10-10T18:58:46.041Z · comments (2)

AI #82: The Governor Ponders
Zvi · 2024-09-19T13:30:04.863Z · comments (8)

Applications of Chaos: Saying No (with Hastings Greer)
Elizabeth (pktechgirl) · 2024-09-21T16:30:07.415Z · comments (16)

The Fragility of Life Hypothesis and the Evolution of Cooperation
KristianRonn · 2024-09-04T21:04:49.878Z · comments (6)

Low Probability Estimation in Language Models
Gabriel Wu (gabriel-wu) · 2024-10-18T15:50:05.947Z · comments (0)

Interoperable High Level Structures: Early Thoughts on Adjectives
johnswentworth · 2024-08-22T21:12:38.223Z · comments (1)

[link] The Evals Gap
Marius Hobbhahn (marius-hobbhahn) · 2024-11-11T16:42:46.287Z · comments (7)

AI and the Technological Richter Scale
Zvi · 2024-09-04T14:00:08.625Z · comments (8)

Interested in Cognitive Bootcamp?
Raemon · 2024-09-19T22:12:13.348Z · comments (0)

[link] Book review: Xenosystems
jessicata (jessica.liu.taylor) · 2024-09-16T20:17:56.670Z · comments (18)

[question] If I wanted to spend WAY more on AI, what would I spend it on?
Logan Zoellner (logan-zoellner) · 2024-09-15T21:24:46.742Z · answers+comments (16)

Evaluating the truth of statements in a world of ambiguous language.
Hastings (hastings-greer) · 2024-10-07T18:08:09.920Z · comments (19)

An alternative approach to superbabies
Towards_Keeperhood (Simon Skade) · 2024-11-05T22:56:15.740Z · comments (19)

[link] Active Recall and Spaced Repetition are Different Things
Saul Munn (saul-munn) · 2024-11-08T20:14:56.092Z · comments (2)

← previous page (newer posts) · next page (older posts) →

^{^}

"Curated", a term which here means "This just got emailed to 30,000 people, of whom typically half open the email, and it gets shown at the top of the frontpage to anyone who hasn't read it for ~1 week."

LessWrong 2.0 Reader

Archive

Recent comments