LessWrong 2.0 Reader

AXRP Episode 27 - AI Control with Buck Shlegeris and Ryan Greenblatt
DanielFilan · 2024-04-11T21:30:04.244Z · comments (10)

Text Posts from the Kids Group: 2020
jefftk (jkaufman) · 2024-04-13T22:30:05.326Z · comments (3)

[link] The Inner Ring by C. S. Lewis
Saul Munn (saul-munn) · 2024-04-24T22:48:09.228Z · comments (6)

Best in Class Life Improvement
sapphire (deluks917) · 2024-04-04T01:51:02.556Z · comments (20)

Implementing activation steering
Annah (annah) · 2024-02-05T17:51:55.851Z · comments (7)

[link] New o1-like model (QwQ) beats Claude 3.5 Sonnet with only 32B parameters
Jesse Hoogland (jhoogland) · 2024-11-27T22:06:12.914Z · comments (4)

Brief notes on the Wikipedia game
Olli Järviniemi (jarviniemi) · 2024-07-14T02:28:22.473Z · comments (9)

AI #79: Ready for Some Football
Zvi · 2024-08-29T13:30:10.902Z · comments (16)

EIS XIV: Is mechanistic interpretability about to be practically useful?
scasper · 2024-10-11T22:13:51.033Z · comments (4)

Personal AI Planning
jefftk (jkaufman) · 2024-11-10T14:00:06.837Z · comments (10)

Generalized Stat Mech: The Boltzmann Approach
David Lorell · 2024-04-12T17:47:31.880Z · comments (7)

Duct Tape security
Isaac King (KingSupernova) · 2024-04-26T18:57:05.659Z · comments (11)

Why Large Bureaucratic Organizations?
johnswentworth · 2024-08-27T18:30:07.422Z · comments (52)

[link] GPT-4o System Card
Zach Stein-Perlman · 2024-08-08T20:30:52.633Z · comments (11)

Estimating Tail Risk in Neural Networks
Mark Xu (mark-xu) · 2024-09-13T20:00:06.921Z · comments (9)

Different senses in which two AIs can be “the same”
Vivek Hebbar (Vivek) · 2024-06-24T03:16:43.400Z · comments (1)

When Are Circular Definitions A Problem?
johnswentworth · 2024-05-28T20:00:23.408Z · comments (15)

The Hessian rank bounds the learning coefficient
Lucius Bushnaq (Lblack) · 2024-08-08T20:55:36.960Z · comments (9)

Indecision and internalized authority figures
Kaj_Sotala · 2024-07-06T10:10:02.528Z · comments (1)

[link] The 2nd Demographic Transition
Maxwell Tabarrok (maxwell-tabarrok) · 2024-04-06T14:10:13.095Z · comments (17)

"Fractal Strategy" workshop report
Raemon · 2024-04-06T21:26:53.263Z · comments (22)

[link] Anthropic leadership conversation
Zach Stein-Perlman · 2024-12-20T22:00:45.229Z · comments (17)

What and Why: Developmental Interpretability of Reinforcement Learning
Garrett Baker (D0TheMath) · 2024-07-09T14:09:40.649Z · comments (4)

Towards Guaranteed Safe AI: A Framework for Ensuring Robust and Reliable AI Systems
Joar Skalse (Logical_Lunatic) · 2024-05-17T19:13:31.380Z · comments (10)

o1-preview is pretty good at doing ML on an unknown dataset
Håvard Tveit Ihle (havard-tveit-ihle) · 2024-09-20T08:39:49.927Z · comments (1)

minutes from a human-alignment meeting
bhauth · 2024-05-24T05:01:53.904Z · comments (4)

[link] Open Source Automated Interpretability for Sparse Autoencoder Features
kh4dien · 2024-07-30T21:11:36.866Z · comments (1)

Introducing AI-Powered Audiobooks of Rational Fiction Classics
Askwho · 2024-05-04T17:32:49.719Z · comments (14)

AE Studio @ SXSW: We need more AI consciousness research (and further resources)
AE Studio (AEStudio) · 2024-03-26T20:59:09.129Z · comments (8)

What is "True Love"?
johnswentworth · 2024-08-18T16:05:47.358Z · comments (11)

Friendship is transactional, unconditional friendship is insurance
Ruby · 2024-07-17T22:52:41.967Z · comments (24)

Ophiology (or, how the Mamba architecture works)
Danielle Ensign (phylliida-dev) · 2024-04-09T19:31:09.975Z · comments (8)

SB 1047 Is Weakened
Zvi · 2024-06-06T13:40:41.547Z · comments (4)

[link] The economics of space tethers
harsimony · 2024-08-22T16:15:22.699Z · comments (22)

Timaeus is hiring!
Jesse Hoogland (jhoogland) · 2024-07-12T23:42:28.651Z · comments (6)

Preventing model exfiltration with upload limits
ryan_greenblatt · 2024-02-06T16:29:33.999Z · comments (22)

How to be an amateur polyglot
arisAlexis (arisalexis) · 2024-05-08T15:08:11.404Z · comments (16)

[link] Most experts believe COVID-19 was probably not a lab leak
DanielFilan · 2024-02-02T19:28:00.319Z · comments (89)

Understanding SAE Features with the Logit Lens
Joseph Bloom (Jbloom) · 2024-03-11T00:16:57.429Z · comments (0)

[link] shoes with springs
bhauth · 2023-12-30T21:46:55.319Z · comments (8)

SAEs (usually) Transfer Between Base and Chat Models
Connor Kissane (ckkissane) · 2024-07-18T10:29:46.138Z · comments (0)

[question] Will quantum randomness affect the 2028 election?
Thomas Kwa (thomas-kwa) · 2024-01-24T22:54:30.800Z · answers+comments (52)

The Third Fundamental Question
Screwtape · 2024-11-15T04:01:33.770Z · comments (7)

OpenAI's Preparedness Framework: Praise & Recommendations
Akash (akash-wasil) · 2024-01-02T16:20:04.249Z · comments (1)

Please do not use AI to write for you
Richard_Kennaway · 2024-08-21T09:53:34.425Z · comments (34)

[link] On Shifgrethor
JustisMills · 2024-10-27T15:30:13.688Z · comments (18)

Do Not Mess With Scarlett Johansson
Zvi · 2024-05-22T15:10:03.215Z · comments (7)

AI #69: Nice
Zvi · 2024-06-20T12:40:02.566Z · comments (9)

Fear of centralized power vs. fear of misaligned AGI: Vitalik Buterin on 80,000 Hours
Seth Herd · 2024-08-05T15:38:09.682Z · comments (22)

Occupational Licensing Roundup #1
Zvi · 2024-10-30T11:00:04.516Z · comments (11)


Recent comments

mondsemmel on Bryce Robertson's Shortform

Recommendation: make the "Last updated" timestamp on these pages way more prominent, e.g. by moving it to the top, below the page title. (Like what most news websites nowadays do for SEO, or like where timestamps are located on LW posts.) Otherwise absolutely no one will know that you do this, or that these resources are not outdated but are actually up to date.

The current timestamp location is so unusual that I only noticed it by accident, and was in fact about to write a comment suggesting you add a timestamp at all.

lee-aao on o3, Oh My

I think I'm confused here.
Is it fair to say that o3 does math and coding better than the average SWE?
If this is true, then I really don't understand why it hasn't made all the headlines.
Any explanation?

otto-barten on otto.barten's Shortform

Assuming positive defense/offense balance can be achieved in principle, what would an AGI-powered defense look like?

abstractapplic on D&D.Sci Coliseum: Arena of Data Evaluation and Ruleset

Just realized I forgot to mention this: I really like how the interactive handled the Bonus Objective, i.e. if the player is thinking along the right lines their character automatically makes the in-universe sensible/optimal decision for them (which means you can set up a fair Bonus Objective for players who don't live in that universe and so don't have all the context).

morpheus on On Eating the Sun

I was already sold on the singularity. For what it's worth, I found the post and comments very helpful for understanding why you would want to take the sun apart in the first place, and why it would be feasible and desirable for both superintelligent and non-superintelligent civilizations. (Turning the sun into a smaller sun that doesn't explode seems nicer than having it explode. Fusion gives off way more energy than lifting the material takes; gravity is the weakest of the four forces, after all. In a superintelligent civilization with reversible computers, not taking apart the sun will make readily available mass a taut constraint.)

quetzal_rainbow on On Eating the Sun

If you can use 1 kg of hydrogen to lift x > 1 kg of hydrogen using proton-proton fusion, you get exponential buildup, limited only by "how many proton-proton reactors you can build in the Solar System" and "how willing you are to actually build them", and you can use that exponential buildup to create all the necessary infrastructure.

antontimmer on Nathan Young's Shortform

Here is an example which I believe is directionally correct; it took me roughly 20 minutes to come up with it. The prompt is "how do living systems create meaning?":

1. My life feels like it has meaning (sensory-motor behavior and conceptual intentional aspects). Looking at it through an evolutionary perspective, it is highly likely that meaning assignment is the way through which living systems survived. Thus, there has to be some base biological level at which meaning is created through cell-cell communication / bioelectricity / biochemistry / biosensing etc.
2. Life is just made of atoms. Atoms are just automata. This implies there is no meaning at the atom level, and thus it cannot pop up at higher levels through emergence or some shit. You are delusional to believe there is some meaning assignment in life.
3. Meaning is something that is defined through the language that we speak. It is well known that different cultures have different words and conceptual framings, which implies that meaning is different in different cultures. Meaning thus only depends on language.
4. Meaning is just a social construct and we can define anything to have meaning. Thus it doesn't matter what you find meaningful, since it is just something you inherited through society and parenting.

I believe points 1-3 are fine, point 4 is kinda shaky.

jessica-liu-taylor on On Eating the Sun

Doesn't have to expend the energy. It's about reshaping the matter into machines. Computers take lots of mass-energy to constitute them, not to power them.
Things can go 6 orders of magnitude faster due to intelligence/agency, it's not highly unlikely in general.
I agree that in theory the arguments here could be better. It might require knowing more physics than I do, and has the "how does Kasparov beat you at chess" problem.

auspicious on 2024 in AI predictions

Thank you for putting this together.

Something I find interesting is that even many of the highest-profile skeptics of AI progress are surprisingly bullish (from an objective perspective).

For example, Yann LeCun has said we might get to AGI within a decade or two, and even Gary Marcus has gone on record saying "I do think we will eventually reach AGI (artificial general intelligence), and quite possibly before the end of this century."

"Before the end of this century" might seem pessimistic, but you'd think a true pessimist would say it will take centuries or millennia or even never happen at all. Almost no one seems to be saying that.

ted-sanders on Tips On Empirical Research Slides

Management consulting firms have lots of great ideas on slide design: https://www.theanalystacademy.com/consulting-presentations/

Some things they do well:

They treat slides as documents that can be understood standalone (this is even useful when presenting, as not everyone is following every word)
They employ a lot of hierarchy to help make the content skimmable (helpful for efficiency)
They put conclusions / summaries / action items up front, details behind (helpful for efficiency, especially in high-trust environments)