LessWrong 2.0 Reader


LessWrong's (first) album: I Have Been A Good Bing
habryka (habryka4) · 2024-04-01T07:33:45.242Z · comments (156)
The Talk: a brief explanation of sexual dimorphism
Malmesbury (Elmer of Malmesbury) · 2023-09-18T16:23:56.073Z · comments (72)
[link] How much do you believe your results?
Eric Neyman (UnexpectedValues) · 2023-05-06T20:31:31.277Z · comments (14)
Steering GPT-2-XL by adding an activation vector
TurnTrout · 2023-05-13T18:42:41.321Z · comments (97)
[link] The ants and the grasshopper
Richard_Ngo (ricraz) · 2023-06-04T22:00:04.577Z · comments (35)
[link] Douglas Hofstadter changes his mind on Deep Learning & AI risk (June 2023)?
gwern · 2023-07-03T00:48:47.131Z · comments (54)
Significantly Enhancing Adult Intelligence With Gene Editing May Be Possible
GeneSmith · 2023-12-12T18:14:51.438Z · comments (162)
[link] Things I Learned by Spending Five Thousand Hours In Non-EA Charities
jenn (pixx) · 2023-06-01T20:48:03.940Z · comments (34)
GPTs are Predictors, not Imitators
Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2023-04-08T19:59:13.601Z · comments (90)
[link] Statement on AI Extinction - Signed by AGI Labs, Top Academics, and Many Other Notable Figures
Dan H (dan-hendrycks) · 2023-05-30T09:05:25.986Z · comments (77)
There is way too much serendipity
Malmesbury (Elmer of Malmesbury) · 2024-01-19T19:37:57.068Z · comments (56)
How to have Polygenically Screened Children
GeneSmith · 2023-05-07T16:01:07.096Z · comments (108)
Inside Views, Impostor Syndrome, and the Great LARP
johnswentworth · 2023-09-25T16:08:17.040Z · comments (53)
Sharing Information About Nonlinear
Ben Pace (Benito) · 2023-09-07T06:51:11.846Z · comments (323)
[link] [April Fools' Day] Introducing Open Asteroid Impact
Linch · 2024-04-01T08:14:15.800Z · comments (29)
[link] EA Vegan Advocacy is not truthseeking, and it’s everyone’s problem
Elizabeth (pktechgirl) · 2023-09-28T23:30:03.390Z · comments (246)
Against Almost Every Theory of Impact of Interpretability
Charbel-Raphaël (charbel-raphael-segerie) · 2023-08-17T18:44:41.099Z · comments (83)
Shallow review of live agendas in alignment & safety
technicalities · 2023-11-27T11:10:27.464Z · comments (69)
Alignment Grantmaking is Funding-Limited Right Now
johnswentworth · 2023-07-19T16:49:08.811Z · comments (67)
Model Organisms of Misalignment: The Case for a New Pillar of Alignment Research
evhub · 2023-08-08T01:30:10.847Z · comments (26)
Transformers Represent Belief State Geometry in their Residual Stream
Adam Shai (adam-shai) · 2024-04-16T21:16:11.377Z · comments (63)
The Best Tacit Knowledge Videos on Every Subject
Parker Conley (parker-conley) · 2024-03-31T17:14:31.199Z · comments (123)
[link] Pausing AI Developments Isn't Enough. We Need to Shut it All Down by Eliezer Yudkowsky
jacquesthibs (jacques-thibodeau) · 2023-03-29T23:16:19.431Z · comments (296)
Book Review: How Minds Change
bc4026bd4aaa5b7fe (bc4026bd4aaa5b7fe0bdcd47da7a22b453953f990d35286b9d315a619b23667a) · 2023-05-25T17:55:32.218Z · comments (51)
LW Team is adjusting moderation policy
Raemon · 2023-04-04T20:41:07.603Z · comments (182)
[link] When do "brains beat brawn" in Chess? An experiment
titotal (lombertini) · 2023-06-28T13:33:23.854Z · comments (79)
[link] Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training
evhub · 2024-01-12T19:51:01.021Z · comments (94)
Predictable updating about AI risk
Joe Carlsmith (joekc) · 2023-05-08T21:53:34.730Z · comments (23)
Speaking to Congressional staffers about AI risk
Akash (akash-wasil) · 2023-12-04T23:08:52.055Z · comments (23)
[link] Towards Monosemanticity: Decomposing Language Models With Dictionary Learning
Zac Hatfield-Dodds (zac-hatfield-dodds) · 2023-10-05T21:01:39.767Z · comments (21)
Social Dark Matter
[DEACTIVATED] Duncan Sabien (Duncan_Sabien) · 2023-11-16T20:00:00.000Z · comments (112)
Hooray for stepping out of the limelight
So8res · 2023-04-01T02:45:31.397Z · comments (24)
OpenAI: The Battle of the Board
Zvi · 2023-11-22T17:30:04.574Z · comments (82)
My May 2023 priorities for AI x-safety: more empathy, more unification of concerns, and less vilification of OpenAI
Andrew_Critch · 2023-05-24T00:02:08.836Z · comments (39)
Notes on Teaching in Prison
jsd · 2023-04-19T01:53:00.427Z · comments (12)
Guide to rationalist interior decorating
mingyuan · 2023-06-19T06:47:13.704Z · comments (45)
The Base Rate Times, news through prediction markets
vandemonian · 2023-06-06T17:42:56.718Z · comments (39)
Gentleness and the artificial Other
Joe Carlsmith (joekc) · 2024-01-02T18:21:34.746Z · comments (33)
Accidentally Load Bearing
jefftk (jkaufman) · 2023-07-13T16:10:00.806Z · comments (14)
OpenAI: Facts from a Weekend
Zvi · 2023-11-20T15:30:06.732Z · comments (158)
[link] Scale Was All We Needed, At First
Gabriel Mukobi (gabe-mukobi) · 2024-02-14T01:49:16.184Z · comments (31)
Express interest in an "FHI of the West"
habryka (habryka4) · 2024-04-18T03:32:58.592Z · comments (39)
The 6D effect: When companies take risks, one email can be very powerful.
scasper · 2023-11-04T20:08:39.775Z · comments (40)
Constellations are Younger than Continents
Jeffrey Heninger (jeffrey-heninger) · 2023-12-19T06:12:40.667Z · comments (22)
[link] Thoughts on seed oil
dynomight · 2024-04-20T12:29:14.212Z · comments (79)
On green
Joe Carlsmith (joekc) · 2024-03-21T17:38:56.295Z · comments (33)
AI Timelines
habryka (habryka4) · 2023-11-10T05:28:24.841Z · comments (74)
[link] [SEE NEW EDITS] No, *You* Need to Write Clearer
NicholasKross · 2023-04-29T05:04:01.559Z · comments (64)
Mental Health and the Alignment Problem: A Compilation of Resources (updated April 2023)
Chris Scammell (chris-scammell) · 2023-05-10T19:04:21.138Z · comments (53)
Dear Self; we need to talk about ambition
Elizabeth (pktechgirl) · 2023-08-27T23:10:04.720Z · comments (25)