LessWrong 2.0 Reader

View: New · Old · Top

Restrict date range: Today · This week · This month · Last three months · This year · All time

next page (older posts) →

LessWrong's (first) album: I Have Been A Good Bing
habryka (habryka4) · 2024-04-01T07:33:45.242Z · comments (156)

There is way too much serendipity
Malmesbury (Elmer of Malmesbury) · 2024-01-19T19:37:57.068Z · comments (56)

[link] [April Fools' Day] Introducing Open Asteroid Impact
Linch · 2024-04-01T08:14:15.800Z · comments (29)

Transformers Represent Belief State Geometry in their Residual Stream
Adam Shai (adam-shai) · 2024-04-16T21:16:11.377Z · comments (63)

The Best Tacit Knowledge Videos on Every Subject
Parker Conley (parker-conley) · 2024-03-31T17:14:31.199Z · comments (123)

[link] Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training
evhub · 2024-01-12T19:51:01.021Z · comments (94)

Gentleness and the artificial Other
Joe Carlsmith (joekc) · 2024-01-02T18:21:34.746Z · comments (33)

[link] Scale Was All We Needed, At First
Gabriel Mukobi (gabe-mukobi) · 2024-02-14T01:49:16.184Z · comments (31)

Express interest in an "FHI of the West"
habryka (habryka4) · 2024-04-18T03:32:58.592Z · comments (39)

On green
Joe Carlsmith (joekc) · 2024-03-21T17:38:56.295Z · comments (33)

[link] Thoughts on seed oil
dynomight · 2024-04-20T12:29:14.212Z · comments (79)

[link] Paul Christiano named as US AI Safety Institute Head of AI Safety
Joel Burget (joel-burget) · 2024-04-16T16:22:06.937Z · comments (55)

[link] My PhD thesis: Algorithmic Bayesian Epistemology
Eric Neyman (UnexpectedValues) · 2024-03-16T22:56:59.283Z · comments (14)

The case for ensuring that powerful AIs are controlled
ryan_greenblatt · 2024-01-24T16:11:51.354Z · comments (66)

Failures in Kindness
silentbob · 2024-03-26T21:30:11.052Z · comments (26)

[link] "No-one in my org puts money in their pension"
Tobes (tobias-jolly) · 2024-02-16T18:33:28.996Z · comments (7)

My Clients, The Liars
ymeskhout · 2024-03-05T21:06:36.669Z · comments (85)

Brute Force Manufactured Consensus is Hiding the Crime of the Century
Roko · 2024-02-03T20:36:59.806Z · comments (156)

MIRI 2024 Mission and Strategy Update
Malo (malo) · 2024-01-05T00:20:54.169Z · comments (44)

CFAR Takeaways: Andrew Critch
Raemon · 2024-02-14T01:37:03.931Z · comments (62)

ChatGPT can learn indirect control
Raymond D · 2024-03-21T21:11:06.649Z · comments (23)

Believing In
AnnaSalamon · 2024-02-08T07:06:13.072Z · comments (49)

[link] "How could I have thought that faster?"
mesaoptimizer · 2024-03-11T10:56:17.884Z · comments (30)

[link] Sam Altman’s Chip Ambitions Undercut OpenAI’s Safety Strategy
garrison · 2024-02-10T19:52:55.191Z · comments (52)

Modern Transformers are AGI, and Human-Level
abramdemski · 2024-03-26T17:46:19.373Z · comments (89)

My Interview With Cade Metz on His Reporting About Slate Star Codex
Zack_M_Davis · 2024-03-26T17:18:05.114Z · comments (186)

[link] Contra Ngo et al. “Every ‘Every Bay Area House Party’ Bay Area House Party”
Ricki Heicklen (bayesshammai) · 2024-02-22T23:56:02.318Z · comments (5)

[link] Daniel Kahneman has died
DanielFilan · 2024-03-27T15:59:14.517Z · comments (11)

Toward A Mathematical Framework for Computation in Superposition
Dmitry Vaintrob (dmitry-vaintrob) · 2024-01-18T21:06:57.040Z · comments (16)

The impossible problem of due process
mingyuan · 2024-01-16T05:18:33.415Z · comments (63)

This might be the last AI Safety Camp
Remmelt (remmelt-ellen) · 2024-01-24T09:33:29.438Z · comments (33)

Funny Anecdote of Eliezer From His Sister
Daniel Birnbaum (daniel-birnbaum) · 2024-04-22T22:05:31.886Z · comments (4)

Introducing Alignment Stress-Testing at Anthropic
evhub · 2024-01-12T23:51:25.875Z · comments (23)

OMMC Announces RIP
Adam Scholl (adam_scholl) · 2024-04-01T23:20:00.433Z · comments (5)

Every "Every Bay Area House Party" Bay Area House Party
Richard_Ngo (ricraz) · 2024-02-16T18:53:28.567Z · comments (6)

[link] FHI (Future of Humanity Institute) has shut down (2005–2024)
gwern · 2024-04-17T13:54:16.791Z · comments (21)

[link] Toward a Broader Conception of Adverse Selection
Ricki Heicklen (bayesshammai) · 2024-03-14T22:40:57.920Z · comments (61)

Timaeus's First Four Months
Jesse Hoogland (jhoogland) · 2024-02-28T17:01:53.437Z · comments (6)

Reconsider the anti-cavity bacteria if you are Asian
Lao Mein (derpherpize) · 2024-04-15T07:02:02.655Z · comments (40)

'Empiricism!' as Anti-Epistemology
Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2024-03-14T02:02:59.723Z · comments (84)

Without fundamental advances, misalignment and catastrophe are the default outcomes of training powerful AI
Jeremy Gillen (jeremy-gillen) · 2024-01-26T07:22:06.370Z · comments (60)

Why Would Belief-States Have A Fractal Structure, And Why Would That Matter For Interpretability? An Explainer
johnswentworth · 2024-04-18T00:27:43.451Z · comments (17)

What’s up with LLMs representing XORs of arbitrary features?
Sam Marks (samuel-marks) · 2024-01-03T19:44:33.162Z · comments (61)

Many arguments for AI x-risk are wrong
TurnTrout · 2024-03-05T02:31:00.990Z · comments (76)

[question] Examples of Highly Counterfactual Discoveries?
johnswentworth · 2024-04-23T22:19:19.399Z · answers+comments (80)

Apologizing is a Core Rationalist Skill
johnswentworth · 2024-01-02T17:47:35.950Z · comments (42)

2023 Survey Results
Screwtape · 2024-02-16T22:24:28.132Z · comments (26)

[link] Making every researcher seek grants is a broken model
jasoncrawford · 2024-01-26T16:06:26.688Z · comments (41)

[link] Vernor Vinge, who coined the term "Technological Singularity", dies at 79
Kaj_Sotala · 2024-03-21T22:14:14.699Z · comments (24)

[link] Masterpiece
Richard_Ngo (ricraz) · 2024-02-13T23:10:35.376Z · comments (20)

next page (older posts) →

Archive

Recent comments

abstractapplic on D&D.Sci Long War: Defender of Data-mocracy

Thanks for running this when my one was going to be late, and thanks for checking with me beforehand.

(Also, thanks for the scenario, like, in general: it looks like a fun one!)

devrandom on We are headed into an extreme compute overhang

On the other hand, the world already contains over 8 billion human intelligences. So I think you are assuming that a few million AGIs, possibly running at several times human speed (and able to work 24/7, exchange information electronically, etc.), will be able to significantly "outcompete" (in some fashion) 8 billion humans? This seems worth further exploration / justification.

Good point, but a couple of thoughts:

the operational definition of AGI referred in the article is significantly stronger than the average human
the humans are poorly organized
the 8 billion humans are supporting a civilization, while the AGIs can focus on AI research and self-improvement

devrandom on We are headed into an extreme compute overhang

Thank you, I missed it while looking for prior art.

stephen-mcaleese on We are headed into an extreme compute overhang

Currently, groups of LLM agents can collaborate using frameworks such as ChatDev, which simulates a virtual software company using LLM agents with different roles. Though I think human organizations are still more effective for now. For example, corporations such as Microsoft have over 200,000 employees and can work on multi-year projects. But it's conceivable that in the future there could be virtual companies composed of millions of AIs that can coordinate effectively and can work continuously at superhuman speed for long periods of time.

gunnar_zarncke on Exploring the Esoteric Pathways to AI Sentience (Part One)

In order to fulfill that dream, AI must be sentient, and that requires it have consciousness.

This is a surprising statement. Why do you think so?

gunnar_zarncke on Exploring the Esoteric Pathways to AI Sentience (Part One)

In order to fulfill that dream, AI must be sentient, and that requires it have consciousness.

THis is a surprising statement. Why do you think so?

rana-dexsin on On Not Pulling The Ladder Up Behind You

In less serious (but not fully unserious) citation of that particular site, it also contains an earlier depiction of literally pulling up ladders (as part of a comic based on treating LOTR as though it were a D&D campaign) that shows off what can sometimes result: a disruptive shock from the ones stuck on the lower side, in this case via a leap in technology level.

lblack on Examples of Highly Counterfactual Discoveries?

I would not say that the central insight of SLT is about priors. Under weak conditions the prior is almost irrelevant. Indeed, the RLCT is independent of the prior under very weak nonvanishing conditions.

I don't think these conditions are particularly weak at all. Any prior that fulfils it is a prior that would not be normalised right if the parameter-function map were one-to-one.

It's a kind of prior like to use a lot, but that doesn't make it a sane choice.

A well-normalised prior for a regular model probably doesn't look very continuous or differentiable in this setting, I'd guess.

To be sure - generic symmetries are seen by the RLCT. But these are, in some sense, the uninteresting ones. The interesting thing is the local singular structure and its unfolding in phase transitions during training.

The generic symmetries are not what I'm talking about. There are symmetries in neural networks that are neither generic, nor only present at finite sample size. These symmetries correspond to different parametrisations that implement the same input-output map. Different regions in parameter space can differ in how many of those equivalent parametrisations they have, depending on the internal structure of the networks at that point.

The issue of the true distribution not being contained in the model is called 'unrealizability' in Bayesian statistics. It is dealt with in Watanabe's second 'green' book. Nonrealizability is key to the most important insight of SLT contained in the last sections of the second to last chapter of the green book: algorithmic development during training through phase transitions in the free energy.

I know it 'deals with' unrealizability in this sense, that's not what I meant.

I'm not talking about the problem of characterising the posterior right when the true model is unrealizable. I'm talking about the problem where the actual logical statement we defined our prior and thus our free energy relative to is an insane statement to make and so the posterior you put on it ends up negligibly tiny compared to the probability mass that lies outside the model class.

But looking at the green book, I see it's actually making very different, stat-mech style arguments that reason about the KL divergence between the true distribution and the guess made by averaging the predictions of all models in the parameter space according to their support in the posterior. I'm going to have to translate more of this into Bayes to know what I think of it.

cheer-poasting on Fundamental Uncertainty: Chapter 8 - When does fundamental uncertainty matter?

I know that you said comments should focus on things that were confusing, so I'll admit to being quite confused.

Early in the article you said that it's not possible to agree on definitions of man and woman because of competing ideological needs -- directly after creating a functional evo-psych justification for a set of answers that you claim is accepted by nearly every people group to have ever existed. I find this confusing. Perhaps it is better to use a different example, because the one you used seemed so convincing that it overshadowed your point.
There is, in my opinion, and unreasonably large distance between when you talk about "uncertainty" and when you talk about the fact that it can be almost completely ignored in daily life. If it's not so important in general daily life, then mentioning this early will help people understand better as you show examples where it actually does matter.
As far as choiceless mode goes, you say something to the effect of "if people can have any (moral?) choice at all, then it's not actually choiceless mode at all". However, this would imply that choiceless mode has actually never existed, as there has always been some degree of choice in morality and worldview. Either what people were yearning for wasn't choiceless mode, or that there is some threshold of moral choice that cannot be exceeded.
I believe it would be less confusing if you mentioned earlier that "moral uncertainty" refers to an individual being uncertain about any specific moral judgment, rather than a sense of "morality doesn't exist" or "morality is unknowable".
I feel that, as a chapter, I'm not completely sure what I'm supposed to take away from it. Perhaps the use of some progressive summarization or some signposting would help in that regard. It's not that any of the points made are bad or something like this, and I'm not talking about individual sentence structure. But overall, there doesn't really feel like a huge connection between the sections. Logically, I can see what the connection is supposed to be, but when reading it feels more like mini essays arranged on a topic than a chapter.

Overall, I found the chapter interesting. And as I said, I was actually very convinced by the evo-psych answer to "man" and "woman" and plan to write on it in the near future.

wei-dai on Eric Neyman's Shortform

Thank you for detailing your thoughts. Some differences for me:

I'm also worried about unaligned AIs as a competitor to aligned AIs/civilizations in the acausal economy/society. For example, suppose there are vulnerable AIs "out there" that can be manipulated/taken over via acausal means, unaligned AI could compete with us (and with others with better values from our perspective) in the race to manipulate them.
I'm perhaps less optimistic than you about commitment races.
I have some credence on max good and max bad being not close to balanced, that additionally pushes me towards the "unaligned AI is bad" direction.

ETA: Here's a more detailed argument for 1, that I don't think I've written down before. Our universe is small enough [? · GW] that it seems plausible (maybe even likely) that most of the value or disvalue created by a human-descended civilization comes from its acausal influence on the rest of the multiverse. An aligned AI/civilization would influence the rest of the multiverse in a positive direction, whereas an unaligned AI/civilization would (probably) influence the rest of the multiverse in a negative direction. This effect may outweigh what happens in our own universe/lightcone so much that the positive value from unaligned AI doing valuable things in our universe as a result of acausal trade is totally swamped by the disvalue created by its negative acausal influence.