LessWrong 2.0 Reader

Skepticism About DeepMind's "Grandmaster-Level" Chess Without Search
Arjun Panickssery (arjun-panickssery) · 2024-02-12T00:56:44.944Z · comments (13)

Calculating Natural Latents via Resampling
johnswentworth · 2024-06-06T00:37:42.127Z · comments (4)

A quick investigation of AI pro-AI bias
Fabien Roger (Fabien) · 2024-01-19T23:26:32.663Z · comments (1)

[link] AI Safety Hub Serbia Official Opening
DusanDNesic · 2023-10-28T17:03:34.607Z · comments (0)

[link] Making Eggs Without Ovaries
Niko_McCarty (niko-2) · 2024-09-22T17:44:46.733Z · comments (3)

On “first critical tries” in AI alignment
Joe Carlsmith (joekc) · 2024-06-05T00:19:02.814Z · comments (8)

On Anthropic’s Sleeper Agents Paper
Zvi · 2024-01-17T16:10:05.145Z · comments (5)

Complexity of value but not disvalue implies more focus on s-risk. Moral uncertainty and preference utilitarianism also do.
Chi Nguyen · 2024-02-23T06:10:05.881Z · comments (18)

[link] Come to Manifest 2024 (June 7-9 in Berkeley)
Saul Munn (saul-munn) · 2024-03-27T21:30:17.306Z · comments (2)

Towards a formalization of the agent structure problem
Alex_Altair · 2024-04-29T20:28:15.190Z · comments (5)

Weak-to-Strong Generalization: Eliciting Strong Capabilities With Weak Supervision
leogao · 2023-12-16T05:39:10.558Z · comments (5)

[link] Questions are usually too cheap
Nathan Young · 2024-05-11T13:00:54.302Z · comments (19)

Thiel on AI & Racing with China
Ben Pace (Benito) · 2024-08-20T03:19:18.966Z · comments (10)

[link] Unlocking Solutions—By Understanding Coordination Problems
James Stephen Brown (james-brown) · 2024-07-27T04:52:13.435Z · comments (4)

[link] Theories of Change for AI Auditing
Lee Sharkey (Lee_Sharkey) · 2023-11-13T19:33:43.928Z · comments (0)

[link] Google Gemini Announced
Jacob G-W (g-w1) · 2023-12-06T16:14:07.192Z · comments (22)

Safe Stasis Fallacy
Davidmanheim · 2024-02-05T10:54:44.061Z · comments (2)

[Closed] PIBBSS is hiring in a variety of roles (alignment research and incubation program)
Nora_Ammann · 2024-04-09T08:12:59.241Z · comments (0)

Dating Roundup #2: If At First You Don’t Succeed
Zvi · 2024-01-02T16:00:04.955Z · comments (29)

[link] the micro-fulfillment cambrian explosion
bhauth · 2023-12-04T01:15:34.342Z · comments (5)

[link] [Closed] Agent Foundations track in MATS
Vanessa Kosoy (vanessa-kosoy) · 2023-10-31T08:12:50.482Z · comments (1)

AI #44: Copyright Confrontation
Zvi · 2023-12-28T14:30:10.237Z · comments (13)

Ten Modes of Culture War Discourse
jchan · 2024-01-31T13:58:20.572Z · comments (15)

Math-to-English Cheat Sheet
nahoj · 2024-04-08T09:19:40.814Z · comments (5)

[link] OpenAI releases GPT-4o, natively interfacing with text, voice and vision
Martín Soto (martinsq) · 2024-05-13T18:50:52.337Z · comments (23)

[link] On the Role of Proto-Languages
adamShimi · 2024-09-22T16:50:34.720Z · comments (1)

[link] Land Reclamation is in the 9th Circle of Stagnation Hell
Maxwell Tabarrok (maxwell-tabarrok) · 2024-01-12T13:36:27.159Z · comments (6)

Monthly Roundup #17: April 2024
Zvi · 2024-04-15T12:10:03.126Z · comments (4)

Safe Predictive Agents with Joint Scoring Rules
Rubi J. Hudson (Rubi) · 2024-10-09T16:38:16.535Z · comments (10)

Fat Tails Discourage Compromise
niplav · 2024-06-17T09:39:16.489Z · comments (5)

[link] Breaking Circuit Breakers
mikes · 2024-07-14T18:57:20.251Z · comments (13)

Human wanting
TsviBT · 2023-10-24T01:05:39.374Z · comments (1)

[link] LLMs seem (relatively) safe
JustisMills · 2024-04-25T22:13:06.221Z · comments (24)

Be More Katja
Nathan Young · 2024-03-11T21:12:14.249Z · comments (0)

2022 (and All Time) Posts by Pingback Count
Raemon · 2023-12-16T21:17:00.572Z · comments (14)

[link] S-Risks: Fates Worse Than Extinction
aggliu · 2024-05-04T15:30:36.666Z · comments (2)

AI #76: Six Shorts Stories About OpenAI
Zvi · 2024-08-08T13:50:04.659Z · comments (10)

[question] Can we get an AI to "do our alignment homework for us"?
Chris_Leong · 2024-02-26T07:56:22.320Z · answers+comments (33)

AI #50: The Most Dangerous Thing
Zvi · 2024-02-08T14:30:13.168Z · comments (4)

Acting Wholesomely
owencb · 2024-02-26T21:49:16.526Z · comments (64)

Zvi's Manifold Markets House Rules
Zvi · 2023-11-13T00:28:02.147Z · comments (6)

Self-Blinded L-Theanine RCT
niplav · 2023-10-31T15:24:57.717Z · comments (12)

Trading off Lives
jefftk (jkaufman) · 2024-01-03T03:40:05.603Z · comments (12)

[link] Open Phil releases RFPs on LLM Benchmarks and Forecasting
LawrenceC (LawChan) · 2023-11-11T03:01:09.526Z · comments (0)

AI #40: A Vision from Vitalik
Zvi · 2023-11-30T17:30:08.350Z · comments (12)

Calendar feature geometry in GPT-2 layer 8 residual stream SAEs
Patrick Leask (patrickleask) · 2024-08-17T01:16:53.764Z · comments (0)

We are headed into an extreme compute overhang
devrandom · 2024-04-26T21:38:21.694Z · comments (33)

AI #37: Moving Too Fast
Zvi · 2023-11-09T17:50:04.324Z · comments (5)

AI #71: Farewell to Chevron
Zvi · 2024-07-04T13:40:05.905Z · comments (9)

Recent comments

niplav on yams's Shortform

Grants to Redwood Research, SERI MATS, NYU alignment group under Sam Bowman for scalable supervision, Palisade research, and many dozens more, most of which seem net positive wrt TAI risk.

martin-randall on The Hidden Complexity of Wishes

I think it's important to be able to make a narrow point about outer alignment without needing to defend a broader thesis about the entire alignment problem.

Indeed. For it is written:

A mind that ever wishes to learn anything complicated, must learn to cultivate an interest in which particular exact argument steps are valid, apart from whether you yet agree or disagree with the final conclusion, because only in this way can you sort through all the arguments and finally sum them.

For more on this topic see "Local Validity as a Key to Sanity and Civilization."

abstractapplic on D&D Sci Coliseum: Arena of Data

I took an analytic approach and picked some reasonable choices based on that. I'll almost certainly try throwing ML at this problem at some point, but for now I want to note down what a me-who-can't-use-XGBoost would do.

Findings:

There are at least some fingerprintable gladiators who keep gladiating, and who need to be Accounted For (the presence of such people makes all archetypery suspect: are Dwarven Knights really that Good, or are there just a handful of super-prolific Dwarven Knights who give everyone an unfairly good impression?). This includes a Level 7 Elven Ninja, almost certainly Cadagal's Champion, who inexplicably insists on always wearing black (even though it doesn't seem to make a difference to how well ninjas ninj).

Level 4 Boots and Level 4 Gauntlets are super rare in the dataset. The Gauntlets are always worn by a pair of hypercompetent Level 7 Dwarven Monks; the Boots are always worn by the Level 7 Elven Ninja.

Despite this, Cadagal's Champion is facing us with Level 2 Boots.

We have some Level 4 Boots.

. . . we robbed this guy, didn't we? And if we wear the boots - our most powerful equipment - he'll flip out and set his House against us whether we win or lose? Dammit . . .

Who fights whom?

A is a Human Warrior. Warriors lose to Fencers, Humans lose to Fencers, Humans lose to Elves. We have an Elven Fencer on call; send Y.

B is a Human Knight. Rangers are best vs Knights, so send W. (Not super confident in this one)

C is an Elven Ninja. Ninjas are super weak against Knights. Send Z, the Elven Knight. (Slightly concerned by how underrepresented Elves are in the sample of gladiators who managed to beat this guy but I'm assuming that's either noise or an effect which Z will be able to shrug off with the Power of Friendship and/or Urgency)

D is a Dwarven Monk. Monks are weak to Ninjas; send U.

Who wears what?

I haven't managed to figure out how equipment works beyond "higher number good"; if there are specific synergies with/against specific classes/races/whatever, they elude me. For that reason:

Y and Z are my best shots. I'll have them both wear what their opponents are wearing, to reduce the effects of uncertainty and turn those fights into "who wore it better?" contests. (So +3 Boots and +1 Gauntlets for Y, +2 Boots and +3 Gauntlets for Z.)

U vs D looks pretty solid so I'll give him the remaining +2 Gauntlets and +1 Boots.

W vs B is my most tenuous guess; I hope she won't hold a grudge after I send her out unequipped to boost everyone else's chances.
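The picks above can be written down as a small lookup, which makes it easy to check that every opponent gets a counter and that no piece of equipment is handed out twice. This is only a sketch of the reasoning in the comment, not a model of the scenario: the counter table holds just the four rules the comment states, the classes of W (Ranger) and U (Ninja) are inferred from the phrasing "so send W" and "send U", and the idea that each equipment level exists exactly once in our inventory is an assumption.

```python
# Counter rules as stated in the comment: class -> class it is good against.
BEATS = {
    "Fencer": "Warrior",
    "Ranger": "Knight",
    "Knight": "Ninja",
    "Ninja": "Monk",
}

# Opponent classes from the comment.
OPPONENTS = {"A": "Warrior", "B": "Knight", "C": "Ninja", "D": "Monk"}

# Our fighters. Y (Fencer) and Z (Knight) are stated; W and U are inferred.
FIGHTERS = {"Y": "Fencer", "W": "Ranger", "Z": "Knight", "U": "Ninja"}


def pick_matchups(opponents, fighters, beats):
    """For each opponent, pick the fighter whose class counters theirs."""
    by_class = {cls: name for name, cls in fighters.items()}
    matchups = {}
    for opp, opp_cls in opponents.items():
        counters = [
            cls for cls, victim in beats.items()
            if victim == opp_cls and cls in by_class
        ]
        matchups[opp] = by_class[counters[0]] if counters else None
    return matchups


# Equipment levels from the comment; 0 means "sent out unequipped".
# The Level 4 Boots are deliberately left unassigned.
EQUIPMENT = {
    "Y": {"boots": 3, "gauntlets": 1},
    "Z": {"boots": 2, "gauntlets": 3},
    "U": {"boots": 1, "gauntlets": 2},
    "W": {"boots": 0, "gauntlets": 0},
}


def no_item_reused(equipment):
    """Check that no equipment level is assigned twice within a slot.

    Assumes (hypothetically) one copy of each level per slot.
    """
    for slot in ("boots", "gauntlets"):
        levels = [kit[slot] for kit in equipment.values() if kit[slot] > 0]
        if len(levels) != len(set(levels)):
            return False
    return True


MATCHUPS = pick_matchups(OPPONENTS, FIGHTERS, BEATS)
for opp, fighter in MATCHUPS.items():
    print(f"{fighter} vs {opp}: {EQUIPMENT[fighter]}")
print("No item reused:", no_item_reused(EQUIPMENT))
```

A fuller version would estimate the counter table from the fight records rather than hard-coding it, which is presumably where the ML attempt mentioned at the top would come in.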

kareempforbes on Why I’m not a Bayesian

Your article is a great read!

In my view, we can categorize scientists into two broad types: technician scientists, who focus on refining and perfecting existing theories, and creative scientists, who make generational leaps forward with groundbreaking ideas. No theory is ever 100% correct—each is simply an attempt to better explain a phenomenon in a way that’s useful to us.

Take Newton, for example. His theory of gravity was revolutionary, introducing concepts no one had thought of before—it was a generational achievement. But then Einstein came along, asking why objects with mass attract one another. Newton's equations could predict gravitational forces accurately, but they didn’t explain the underlying cause. Einstein’s theory of relativity made a creative leap, adding a new dimension by introducing space-time as part of the explanation. It provided a more accurate theoretical representation of gravity and broadened our understanding of mass and space-time, marking yet another generational leap.

Then we have Oppenheimer, who refined existing theories in physics to develop the atomic bomb. While his work was groundbreaking, it was more a refinement of known principles than a creative leap like Einstein’s. I would classify Oppenheimer as more of a "technician scientist" than a "creative" one, although I defer to experts in physics, as it's not my field.

Regarding your article, I really enjoyed it. On the topic of the "Tiger in my house" question, the answer depends on how we define "tiger." If you mean a living, biological tiger, the answer is clearly no. But if you mean any type of tiger, such as a toy tiger, then the answer could be yes. This same creative reasoning can apply to the question about water in the fridge. Even if there’s no glass of water, the air in the fridge likely contains humidity, meaning there’s always some amount of water present.

Lastly, regarding the diagram with lines, curves, and points: one creative way to introduce ambiguity into this solidly two-dimensional example is by adding more dimensions. In a strictly two-dimensional projection, the relationships between points and lines as stated hold true. However, if a third spatial dimension or a fourth dimension (such as time) is introduced, the axioms that govern the two-dimensional model might no longer apply. For example, the rule that exactly one line passes through two points could be violated in a three-dimensional space, where two points might lie on different planes. Therefore, while the statements about the model are accurate within the context of two-dimensional space, they may no longer hold when additional dimensions are considered.

bhauth on Start an Upper-Room UV Installation Company?

I'm not convinced that far-UVC is safe enough around humans to be a good idea. It's strongly absorbed by proteins so it doesn't penetrate much, but:

It can make reactive compounds from organic compounds in air.
It can produce ozone, depending on the light. (That's why mercury vapor lamps block the 185nm emission.)
It could potentially make toxic compounds when it's absorbed by proteins in skin or eyes.
It definitely causes degradation of plastics.

And really, what's the point? Why not just have fans sending air to (cheap) mercury vapor lamps in a contained area where they won't hit people or plastics?

raemon on Struggling like a Shadowmoth

I agree that a lot of people execute "become stronger" this way. It's not too surprising if it turns out to be an essential part of the process (at least when implemented in humans).

But at the very least, there seem to be more and less healthy ways of doing it, and I personally think I see the outlines of how I could operate fairly differently and still have my drive.

jay on Struggling like a Shadowmoth

What if there would never be someone I trusted who could tell me I was Good Enough, that things were in some sense Okay?

The internalized feeling that you're not okay is a huge part of what motivates you to become better. If you lost it, you would be much more likely to become complacent and stagnate. Both inner peace and relentless drive are profoundly valuable, but they are mutually exclusive.

david-gross on How have you become more hard-working?

FWIW: I've added my summary of the answers here to my Notes on Industriousness.

steve-roth on If far-UV is so great, why isn't it everywhere?

I’m stunned that the word “duct” doesn’t appear in this article. UV in ductwork is cheap, very effective, and has no downsides. I’m flummoxed why it isn’t widely employed. Can you help? Thanks.

roko on The ELYSIUM Proposal - Extrapolated voLitions Yielding Separate Individualized Utopias for Mankind

catgirls are consensually participating in a universe that is not optimal for them because they are stuck in the harem of a loser nerd with no other males and no other purpose in life other than being a concubine to Reedspacer

And, the problem with saying "OK let's just ban the creation of catgirls" is that then maybe Reedspacer builds a volcano lair just for himself and plays video games in it, and the catgirls whose existence you prevented are going to scream bloody murder because you took away from them a very good existence that they would have enjoyed and also made Reedspacer sad.