LessWrong 2.0 Reader

Reflections on Less Online
Error · 2024-07-07T03:49:44.534Z · comments (15)

Scalable oversight as a quantitative rather than qualitative problem
Buck · 2024-07-06T17:42:41.325Z · comments (11)

Fluent, Cruxy Predictions
Raemon · 2024-07-10T18:00:06.424Z · comments (14)

[link] Nietzsche's Morality in Plain English
Arjun Panickssery (arjun-panickssery) · 2023-12-04T00:57:42.839Z · comments (13)

MATS Winter 2023-24 Retrospective
utilistrutil · 2024-05-11T00:09:17.059Z · comments (28)

[link] Hardshipification
Jonathan Moregård (JonathanMoregard) · 2024-05-28T20:02:29.709Z · comments (17)

[link] "AI Safety for Fleshy Humans" an AI Safety explainer by Nicky Case
habryka (habryka4) · 2024-05-03T18:10:12.478Z · comments (10)

[link] [Paper] AI Sandbagging: Language Models can Strategically Underperform on Evaluations
Teun van der Weij (teun-van-der-weij) · 2024-06-13T10:04:49.556Z · comments (10)

Self-prediction acts as an emergent regularizer
Cameron Berg (cameron-berg) · 2024-10-23T22:27:03.664Z · comments (4)

[link] A Universal Emergent Decomposition of Retrieval Tasks in Language Models
Alexandre Variengien (alexandre-variengien) · 2023-12-19T11:52:27.354Z · comments (3)

[link] What are you getting paid in?
Austin Chen (austin-chen) · 2024-07-17T19:23:04.219Z · comments (14)

[link] [Paper] Stress-testing capability elicitation with password-locked models
Fabien Roger (Fabien) · 2024-06-04T14:52:50.204Z · comments (10)

Newsom Vetoes SB 1047
Zvi · 2024-10-01T12:20:06.127Z · comments (6)

Some for-profit AI alignment org ideas
Eric Ho (eh42) · 2023-12-14T14:23:20.654Z · comments (19)

Actually, Power Plants May Be an AI Training Bottleneck.
Lao Mein (derpherpize) · 2024-06-20T04:41:33.567Z · comments (13)

Some Vacation Photos
johnswentworth · 2024-01-04T17:15:01.187Z · comments (0)

Why you should be using a retinoid
GeneSmith · 2024-08-19T03:07:41.722Z · comments (60)

AI #51: Altman’s Ambition
Zvi · 2024-02-20T19:50:07.439Z · comments (5)

Sparse Autoencoders Work on Attention Layer Outputs
Connor Kissane (ckkissane) · 2024-01-16T00:26:14.767Z · comments (9)

[link] What Depression Is Like
Sable · 2024-08-27T17:43:22.549Z · comments (23)

OpenAI o1, Llama 4, and AlphaZero of LLMs
Vladimir_Nesov · 2024-09-14T21:27:41.241Z · comments (25)

Retirement Accounts and Short Timelines
jefftk (jkaufman) · 2024-02-19T18:50:05.231Z · comments (35)

[link] Is Deep Learning Actually Hitting a Wall? Evaluating Ilya Sutskever's Recent Claims
garrison · 2024-11-13T17:00:01.005Z · comments (13)

AI #83: The Mask Comes Off
Zvi · 2024-09-26T12:00:08.689Z · comments (19)

Release: Optimal Weave (P1): A Prototype Cohabitive Game
mako yass (MakoYass) · 2024-08-17T14:08:18.947Z · comments (21)

Constructability: Plainly-coded AGIs may be feasible in the near future
Épiphanie Gédéon (joy_void_joy) · 2024-04-27T16:04:45.894Z · comments (13)

An Introduction To The Mandelbrot Set That Doesn't Mention Complex Numbers
Yitz (yitz) · 2024-01-17T09:48:07.930Z · comments (11)

Secular interpretations of core perennialist claims
zhukeepa · 2024-08-25T23:41:02.683Z · comments (32)

[link] Essay competition on the Automation of Wisdom and Philosophy — $25k in prizes
owencb · 2024-04-16T10:10:13.338Z · comments (12)

AISafety.com – Resources for AI Safety
Søren Elverlin (soren-elverlin-1) · 2024-05-17T15:57:11.712Z · comments (3)

[link] New voluntary commitments (AI Seoul Summit)
Zach Stein-Perlman · 2024-05-21T11:00:41.794Z · comments (17)

[question] What are the good rationality films?
Ben Pace (Benito) · 2024-11-20T06:04:56.757Z · answers+comments (50)

Decomposing the QK circuit with Bilinear Sparse Dictionary Learning
keith_wynroe · 2024-07-02T13:17:16.352Z · comments (7)

Refusal mechanisms: initial experiments with Llama-2-7b-chat
Andy Arditi (andy-arditi) · 2023-12-08T17:08:01.250Z · comments (7)

[link] Palworld development blog post
bhauth · 2024-01-28T05:56:19.984Z · comments (12)

Values Are Real Like Harry Potter
johnswentworth · 2024-10-09T23:42:24.724Z · comments (17)

How to prevent collusion when using untrusted models to monitor each other
Buck · 2024-09-25T18:58:20.693Z · comments (6)

3C's: A Recipe For Mathing Concepts
johnswentworth · 2024-07-03T01:06:11.944Z · comments (5)

The Gemini Incident
Zvi · 2024-02-22T21:00:04.594Z · comments (19)

Survey of 2,778 AI authors: six parts in pictures
KatjaGrace · 2024-01-06T04:43:34.590Z · comments (1)

[link] Not every accommodation is a Curb Cut Effect: The Handicapped Parking Effect, the Clapper Effect, and more
Michael Cohn (michael-cohn) · 2024-09-15T05:27:36.691Z · comments (39)

[link] Gwern Branwen interview on Dwarkesh Patel’s podcast: “How an Anonymous Researcher Predicted AI's Trajectory”
Said Achmiz (SaidAchmiz) · 2024-11-14T23:53:34.922Z · comments (0)

Studying The Alien Mind
Quentin FEUILLADE--MONTIXI (quentin-feuillade-montixi) · 2023-12-05T17:27:28.049Z · comments (10)

Self-Referential Probabilistic Logic Admits the Payor's Lemma
Yudhister Kumar (randomwalks) · 2023-11-28T10:27:29.029Z · comments (14)

Darwinian Traps and Existential Risks
KristianRonn · 2024-08-25T22:37:14.142Z · comments (14)

Quick look: applications of chaos theory
Elizabeth (pktechgirl) · 2024-08-18T15:00:07.853Z · comments (51)

LLM chatbots have ~half of the kinds of "consciousness" that humans believe in. Humans should avoid going crazy about that.
Andrew_Critch · 2024-11-22T03:26:11.681Z · comments (52)

[link] MIRI's May 2024 Newsletter
Harlan · 2024-05-15T00:13:30.153Z · comments (1)

Graceful Degradation
Screwtape · 2024-11-05T23:57:53.362Z · comments (8)

[link] My thesis (Algorithmic Bayesian Epistemology) explained in more depth
Eric Neyman (UnexpectedValues) · 2024-05-09T19:43:16.543Z · comments (4)

Recent comments

whestler on Which things were you surprised to learn are not metaphors?

And here I was thinking it was a metaphor. Like, they feel literally inflated? If I've been climbing and I'm tired my muscles feel weak, but not inflated. I've never felt that way before.

milan-w on You are Pablo Neruda. It is dawn at Isla Negra in 1968. Matilde is asleep in your bed. Write what comes.

I got this continuation of a Nicanor Parra piece by having Claude Sonnet iterate on its outputs with my feedback:

Yo soy Lucila Alcayaga
alias Gabriela Mistral
primero me gané el Nobel
y después el Nacional.

A pesar de que estoy muerta
me sigo sintiendo mal
porque no me dieron nunca
el Premio Municipal.

Dicen que fui muy austera
casi una santa moral
pero tengo mis secretos
que no voy a confesar.

En los billetes de cinco
mi cara pueden mirar
pero ahora que estoy muerta
no los puedo ni gastar.

Desolación fue mi obra
mas no me pueden culpar
si en el Valle de Elqui
me tocó penar.

Y aunque parezca una broma
de la fauna nacional
más conocen mi retrato
que lo que debo cobrar.

Stanzas 1 and 2 are original Parra content; all the rest is our continuation.

chris_leong on Chris_Leong's Shortform

Sharing this resource doc on AI Safety & Entrepreneurship that I created in case anyone finds this helpful:

https://docs.google.com/document/d/1m_5UUGf7do-H1yyl1uhcQ-O3EkWTwsHIxIQ1ooaxvEE/edit?usp=sharing

lee_0505 on Bogdan Ionut Cirstea's Shortform

There was an article about it before the release.

https://archive.is/IwKSP

At the same meeting, company leadership gave a demonstration of a research project involving its GPT-4 AI model that OpenAI thinks shows some new skills that rise to human-like reasoning, according to a person familiar with the discussion who asked not to be identified because they were not authorized to speak to press.

alexander-gietelink-oldenziel on Could orcas be (trained to be) smarter than humans?

Sorry, I phrased this wrong. You are right. I meant round-trip time, which is twice the length but scales linearly, not quadratically.

I actually ran the debate contest to get to the bottom of Jake Cannell's arguments. Some of the arguments, especially the Landauer argument, don't hold up, but I think it's important not to throw the baby out with the bathwater. I think most of the analysis holds up.

https://www.lesswrong.com/posts/fm88c8SvXvemk3BhW/brain-efficiency-cannell-prize-contest-award-ceremony

viliam on On AI Detectors Regarding College Applications

For the human who writes the text, is there a simple way to protect yourself from becoming a false positive? For example, by making a very unlikely typo (one that wouldn't appear in an LLM's training text)?

michael-cohn on You are not too "irrational" to know your preferences.

I'm a little surprised by how you view the subtext of the ice cream example. If I imagine myself in either role, I would not interpret Bryce as saying Ash shouldn't like ice cream in some very base sense. I interpret conversations like that as meaning either:

1) "You might have a desire for X but you shouldn't indulge that desire because it has net bad consequences"

or

2) "If you knew all the negative things that X causes, it would spoil your enjoyment of it and you wouldn't be attracted to it anymore."

or

3) "If you knew all the negative things that X causes, your hedonic attraction to the better world that not-x would create would outweigh your hedonic attraction to the experience of X."

Those can all create some kind of unhealthy or manipulative dynamic but I don't see them as the same thing you're saying, which is more like 4) "it's wrong and stupid to enjoy the physical sensation of eating ice cream."

Do you agree with my reading that 1-3 are different from what you're talking about, or do you think they're included within it?

yanling-guo on How Universal Basic Income Could Help Us Build a Brighter Future

If the definition were so important to me, I could argue with you about what "unconditional" really means and whether the unsupervised Uganda program falls under the definition of UBI even when it's only granted to applicants with a valid proposal. But I give up; you win. And I don't have to defend Britannica, because it's so well established.

euanmclean on Is the mind a program?

thanks, corrected

shoshannah-tekofsky on Is School of Thought related to the Rationality Community?

Thanks for the explanation! Are you familiar with the community here and around Astral Codex Ten (ACX)? There are meetups and events (and a lot of writers) that focus on the art and skill of rationality. That was what led to my question originally.