LessWrong 2.0 Reader

[link] The Grapes of Hardness
adamShimi · 2025-03-11T21:01:14.963Z · comments (0)
[link] Progress links and short notes, 2025-03-03
jasoncrawford · 2025-03-04T15:20:35.619Z · comments (0)
Decision-Relevance of worlds and ADT implementations
Maxime Riché (maxime-riche) · 2025-03-06T16:57:42.966Z · comments (0)
[link] Progress links and short notes, 2025-02-17
jasoncrawford · 2025-02-17T19:18:29.422Z · comments (0)
[link] A different take on the Musk v OpenAI preliminary injunction order
TFD · 2025-03-11T12:46:23.497Z · comments (0)
Talking to laymen about AI development
David Steel · 2025-02-17T18:42:23.289Z · comments (0)
[link] METR: AI models can be dangerous before public deployment
UnofficialLinkpostBot (LinkpostBot) · 2025-02-26T20:19:08.640Z · comments (0)
Conditional Importance in Toy Models of Superposition
james__p · 2025-02-02T20:35:38.655Z · comments (3)
What is the best / most proper definition of "Feeling the AGI" there is?
Annapurna (jorge-velez) · 2025-03-04T20:13:40.946Z · comments (5)
[link] Progress links and short notes, 2025-03-10
jasoncrawford · 2025-03-10T20:27:39.901Z · comments (0)
[link] Reply to Vitalik on d/acc
samuelshadrach (xpostah) · 2025-03-05T18:55:55.340Z · comments (0)
Technical comparison of Deepseek, Novasky, S1, Helix, P0
Juliezhanggg · 2025-02-25T04:20:40.413Z · comments (0)
The Structure of Professional Revolutions
SebastianG (JohnBuridan) · 2025-02-09T13:23:01.059Z · comments (0)
Amplifying the Computational No-Coincidence Conjecture
glauberdebona · 2025-03-07T21:29:54.933Z · comments (6)
[link] Forecasting newsletter #3/2025: Long march through the institutions
NunoSempere (Radamantis) · 2025-03-07T18:17:42.513Z · comments (0)
Post-hoc reasoning in chain of thought
Kyle Cox (klye) · 2025-02-05T18:58:29.802Z · comments (0)
You should use Consumer Reports
KvmanThinking (avery-liu) · 2025-02-27T01:52:17.235Z · comments (5)
Universal AI Maximizes Variational Empowerment: New Insights into AGI Safety
Yusuke Hayashi (hayashiyus) · 2025-02-27T00:46:46.989Z · comments (0)
[link] AI Safety at the Frontier: Paper Highlights, February '25
gasteigerjo · 2025-03-03T22:09:37.845Z · comments (0)
A Hogwarts Guide to Citizenship
WillPetillo · 2025-03-11T05:50:02.768Z · comments (1)
[link] Cooperation for AI safety must transcend geopolitical interference
Matrice Jacobine · 2025-02-16T18:18:01.539Z · comments (6)
What working on AI safety taught me about B2B SaaS sales
purple fire (jack-edwards) · 2025-02-04T20:50:19.990Z · comments (12)
Do we want alignment faking?
Florian_Dietz · 2025-02-28T21:50:48.891Z · comments (4)
Exploring how OthelloGPT computes its world model
JMaar (jim-maar) · 2025-02-02T21:29:09.433Z · comments (0)
[link] The Dilemma’s Dilemma
James Stephen Brown (james-brown) · 2025-02-19T23:50:47.485Z · comments (11)
[link] (Anti)Aging 101
George3d6 · 2025-03-12T03:59:21.859Z · comments (1)
Comparing the effectiveness of top-down and bottom-up activation steering for bypassing refusal on harmful prompts
Ana Kapros (ana-kapros) · 2025-02-12T19:12:07.592Z · comments (0)
[link] NY State Has a New Frontier Model Bill (+quick takes)
henryj · 2025-03-05T19:29:02.219Z · comments (0)
Sleeping Beauty: an Accuracy-based Approach
glauberdebona · 2025-02-10T15:40:29.619Z · comments (2)
Camps Should List Bands
jefftk (jkaufman) · 2025-03-06T03:00:02.348Z · comments (0)
Nationwide Action Workshop: Contact Congress about AI safety!
Felix De Simone (BobusChilc) · 2025-02-24T19:36:09.084Z · comments (0)
THE ARCHIVE
Jason Reid (jason-reid) · 2025-02-17T01:12:41.486Z · comments (0)
The old memories tree
Yair Halberstadt (yair-halberstadt) · 2025-03-05T19:03:59.498Z · comments (1)
Make Superintelligence Loving
Davey Morse (davey-morse) · 2025-02-21T06:07:17.235Z · comments (9)
[link] Neural Scaling Laws Rooted in the Data Distribution
aribrill (Particleman) · 2025-02-20T21:22:10.306Z · comments (0)
[question] Should I Divest from AI?
OKlogic · 2025-02-10T03:29:33.582Z · answers+comments (4)
Beyond ELO: Rethinking Chess Skill as a Multidimensional Random Variable
Oliver Oswald (oliver-oswald) · 2025-02-10T19:19:36.233Z · comments (7)
Arguing for the Truth? An Inference-Only Study into AI Debate
denisemester · 2025-02-11T03:04:58.852Z · comments (0)
Not-yet-falsifiable beliefs?
Benjamin Hendricks (benjamin-hendricks) · 2025-03-02T14:11:07.121Z · comments (4)
Build a Metaculus Forecasting Bot in 30 Minutes: A Practical Guide
ChristianWilliams · 2025-02-22T03:52:14.753Z · comments (0)
[link] AI Safety at the Frontier: Paper Highlights, January '25
gasteigerjo · 2025-02-11T16:14:16.972Z · comments (0)
[link] You don't actually need a physical multiverse to explain anthropic fine-tuning.
Fraser · 2025-03-12T07:33:43.278Z · comments (3)
Retroactive If-Then Commitments
MichaelDickens · 2025-02-01T22:22:43.031Z · comments (0)
Intelligence Is Jagged
Adam Train (aetrain) · 2025-02-19T07:08:46.444Z · comments (1)
Have you actually tried raising the birth rate?
Yair Halberstadt (yair-halberstadt) · 2025-03-10T18:06:40.987Z · comments (5)
Closed-ended questions aren't as hard as you think
electroswing · 2025-02-19T03:53:11.855Z · comments (0)
[question] Name for Standard AI Caveat?
yrimon (yehuda-rimon) · 2025-02-26T07:07:16.523Z · answers+comments (5)
Utilitarian AI Alignment: Building a Moral Assistant with the Constitutional AI Method
Clément L · 2025-02-04T04:15:36.917Z · comments (1)
[question] Does human (mis)alignment pose a significant and imminent existential threat?
jr · 2025-02-23T10:03:40.269Z · answers+comments (3)
There are a lot of upcoming retreats/conferences between March and July (2025)
gergogaspar (gergo-gaspar) · 2025-02-18T09:30:30.258Z · comments (0)