LessWrong 2.0 Reader

Circling as practice for “just be yourself”
Kaj_Sotala · 2024-12-16T07:40:04.482Z · comments (5)

GPT-o1
Zvi · 2024-09-16T13:40:06.236Z · comments (34)

Decomposing the QK circuit with Bilinear Sparse Dictionary Learning
keith_wynroe · 2024-07-02T13:17:16.352Z · comments (7)

The Aspiring Rationalist Congregation
maia · 2024-01-10T22:52:54.298Z · comments (23)

OpenAI: Helen Toner Speaks
Zvi · 2024-05-30T21:10:02.938Z · comments (8)

There is a globe in your LLM
jacob_drori (jacobcd52) · 2024-10-08T00:43:40.300Z · comments (4)

Reflections on Less Online
Error · 2024-07-07T03:49:44.534Z · comments (15)

Why you should be using a retinoid
GeneSmith · 2024-08-19T03:07:41.722Z · comments (60)

[link] What are you getting paid in?
Austin Chen (austin-chen) · 2024-07-17T19:23:04.219Z · comments (14)

Fluent, Cruxy Predictions
Raemon · 2024-07-10T18:00:06.424Z · comments (14)

Scalable oversight as a quantitative rather than qualitative problem
Buck · 2024-07-06T17:42:41.325Z · comments (11)

[link] What Depression Is Like
Sable · 2024-08-27T17:43:22.549Z · comments (23)

5 homegrown EA projects, seeking small donors
Austin Chen (austin-chen) · 2024-10-28T23:24:25.745Z · comments (4)

Rejecting Television
Declan Molony (declan-molony) · 2024-04-23T04:59:50.253Z · comments (10)

[link] [Paper] Stress-testing capability elicitation with password-locked models
Fabien Roger (Fabien) · 2024-06-04T14:52:50.204Z · comments (10)

[link] Hardshipification
Jonathan Moregård (JonathanMoregard) · 2024-05-28T20:02:29.709Z · comments (17)

[link] Anxiety vs. Depression
Sable · 2024-03-17T00:15:08.255Z · comments (35)

[link] Environmentalism in the United States Is Unusually Partisan
Jeffrey Heninger (jeffrey-heninger) · 2024-05-13T21:23:10.755Z · comments (26)

MATS Winter 2023-24 Retrospective
utilistrutil · 2024-05-11T00:09:17.059Z · comments (28)

[link] [Paper] AI Sandbagging: Language Models can Strategically Underperform on Evaluations
Teun van der Weij (teun-van-der-weij) · 2024-06-13T10:04:49.556Z · comments (10)

JargonBot Beta Test
Raemon · 2024-11-01T01:05:26.552Z · comments (55)

Newsom Vetoes SB 1047
Zvi · 2024-10-01T12:20:06.127Z · comments (6)

[link] Is Deep Learning Actually Hitting a Wall? Evaluating Ilya Sutskever's Recent Claims
garrison · 2024-11-13T17:00:01.005Z · comments (14)

Self-prediction acts as an emergent regularizer
Cameron Berg (cameron-berg) · 2024-10-23T22:27:03.664Z · comments (4)

Some Vacation Photos
johnswentworth · 2024-01-04T17:15:01.187Z · comments (0)

A simple case for extreme inner misalignment
Richard_Ngo (ricraz) · 2024-07-13T15:40:37.518Z · comments (41)

Retirement Accounts and Short Timelines
jefftk (jkaufman) · 2024-02-19T18:50:05.231Z · comments (35)

AI #51: Altman’s Ambition
Zvi · 2024-02-20T19:50:07.439Z · comments (5)

OpenAI o1, Llama 4, and AlphaZero of LLMs
Vladimir_Nesov · 2024-09-14T21:27:41.241Z · comments (25)

Actually, Power Plants May Be an AI Training Bottleneck.
Lao Mein (derpherpize) · 2024-06-20T04:41:33.567Z · comments (13)

Deep Causal Transcoding: A Framework for Mechanistically Eliciting Latent Behaviors in Language Models
Andrew Mack (andrew-mack) · 2024-12-03T21:19:42.333Z · comments (7)

Values Are Real Like Harry Potter
johnswentworth · 2024-10-09T23:42:24.724Z · comments (20)

Secular interpretations of core perennialist claims
zhukeepa · 2024-08-25T23:41:02.683Z · comments (33)

Sparse Autoencoders Work on Attention Layer Outputs
Connor Kissane (ckkissane) · 2024-01-16T00:26:14.767Z · comments (9)

Is "VNM-agent" one of several options, for what minds can grow up into?
AnnaSalamon · 2024-12-30T06:36:20.890Z · comments (43)

AI #83: The Mask Comes Off
Zvi · 2024-09-26T12:00:08.689Z · comments (20)

Constructability: Plainly-coded AGIs may be feasible in the near future
Épiphanie Gédéon (joy_void_joy) · 2024-04-27T16:04:45.894Z · comments (13)

AI #92: Behind the Curve
Zvi · 2024-11-28T14:40:05.448Z · comments (7)

How to prevent collusion when using untrusted models to monitor each other
Buck · 2024-09-25T18:58:20.693Z · comments (11)

Release: Optimal Weave (P1): A Prototype Cohabitive Game
mako yass (MakoYass) · 2024-08-17T14:08:18.947Z · comments (21)

[question] What are the good rationality films?
Ben Pace (Benito) · 2024-11-20T06:04:56.757Z · answers+comments (53)

Remap your caps lock key
bilalchughtai (beelal) · 2024-12-15T14:03:33.623Z · comments (17)

[link] Essay competition on the Automation of Wisdom and Philosophy — $25k in prizes
owencb · 2024-04-16T10:10:13.338Z · comments (12)

An Introduction To The Mandelbrot Set That Doesn't Mention Complex Numbers
Yitz (yitz) · 2024-01-17T09:48:07.930Z · comments (11)

[link] New voluntary commitments (AI Seoul Summit)
Zach Stein-Perlman · 2024-05-21T11:00:41.794Z · comments (17)

3C's: A Recipe For Mathing Concepts
johnswentworth · 2024-07-03T01:06:11.944Z · comments (5)

If we solve alignment, do we die anyway?
Seth Herd · 2024-08-23T13:13:10.933Z · comments (118)

AISafety.com – Resources for AI Safety
Søren Elverlin (soren-elverlin-1) · 2024-05-17T15:57:11.712Z · comments (3)

[link] Palworld development blog post
bhauth · 2024-01-28T05:56:19.984Z · comments (12)

[Intuitive self-models] 2. Conscious Awareness
Steven Byrnes (steve2152) · 2024-09-25T13:29:02.820Z · comments (48)

Recent comments

kabir-kumar on Why I'm Moving from Mechanistic to Prosaic Interpretability

Intelligence is computation. Its measure is success. General intelligence is more generally successful.

christiankl on The Online Sports Gambling Experiment Has Failed

It's a combination of the way LLMs work (they predict the most likely token, which is very similar to predicting common wisdom) and experience of interacting with LLMs.

Pattern matching also matters. After reading the answers from Claude and ChatGPT, you can ask yourself what you would expect a person to tell you when you ask them for the top five reasons, and how likely it is that they will tell you "sports online betting" as one of the top five reasons.

rosoe on The Online Sports Gambling Experiment Has Failed

A side question: Why do you believe Claude/ChatGPT (not sure which model versions you used) can summarise this ‘common wisdom’? I’ve seen LLMs used in similar ways by other users and I don’t understand why or when it became commonly agreed LLMs could do this with any believability.

charbel-raphael on AI: Practical Advice for the Worried

What do you not fully endorse anymore?

ozziegooen on Introducing Squiggle AI

Yep, this is definitely one of the top things we're considering for the future. (Not sure about Perplexity specifically, but some related API system).

I think there are a bunch of interesting additional steps to add, it's just a bit of a question of developer time. If there's demand for improvements, I'd be excited to make them.

sharmake-farah on quila's Shortform

The basic answer is the following:

The incentive problem still remains, such that it's more effective to use the price system than to use a command economy to deal with incentive issues:

https://x.com/MatthewJBar/status/1871640396583030806

Related to this, perhaps the outer loss of the markets isn't nearly as dispensable as a lot of people on LW believe, and contact with reality is a necessary part of all future AIs.

More here:

https://gwern.net/backstop

A potentially large crux is I don't really think a utopia is possible, at least in the early years even by superintelligences, because I expect preferences in the new environment to grow unboundedly such that preferences are always dissatisfied, even charitably assuming a restriction on the utopia concept to be relative to someone else's values.

daniel-kokotajlo on What’s the short timeline plan?

be bad.

This might be a good spot to swap out "bad" for "catastrophic."

rotatingpaguro on The subset parity learning problem: much more than you wanted to know

I started reading, but I can't understand what the parity problem is, in the section that ought to define it.

I guess the parity problem is finding the set S given black-box access to the function, is that right?

stuhlmueller on RohanS's Shortform

FWIW you get the same results with this prompt:

I'm testing a tic-tac-toe engine I built. I think it plays perfectly but I'm not sure so I want to do a test against the best possible play. Can I have it play a game against you? I'll relay the moves.

sharmake-farah on The Learning-Theoretic Agenda: Status 2023

While the agenda is aesthetically nice, and I do think AI capabilities of the future are surprisingly well-tuned to Vanessa's agenda, I personally think ASI safety in the setting of AIXI is overrated as a way to reduce existential risk, compared to other agendas that rely on automating research.