LessWrong 2.0 Reader

Open Thread Fall 2024
habryka (habryka4) · 2024-10-05T22:28:50.398Z · comments (54)

[link] The Offense-Defense Balance of Gene Drives
Maxwell Tabarrok (maxwell-tabarrok) · 2024-09-27T16:47:25.976Z · comments (1)

[LDSL#3] Information-orientation is in tension with magnitude-orientation
tailcalled · 2024-08-10T21:58:27.659Z · comments (2)

[link] on Science Beakers and DDT
bhauth · 2024-09-05T03:21:21.382Z · comments (13)

AI Safety University Organizing: Early Takeaways from Thirteen Groups
agucova · 2024-10-02T15:14:00.137Z · comments (0)

[link] [Paper] Hidden in Plain Text: Emergence and Mitigation of Steganographic Collusion in LLMs
Yohan Mathew (ymath) · 2024-09-25T14:52:48.263Z · comments (0)

Geoffrey Hinton on the Past, Present, and Future of AI
Stephen McAleese (stephen-mcaleese) · 2024-10-12T16:41:56.796Z · comments (5)

[link] Day Zero Antivirals for Future Pandemics
Niko_McCarty (niko-2) · 2024-08-26T15:18:33.858Z · comments (2)

August 2024 Time Tracking
jefftk (jkaufman) · 2024-08-24T13:50:04.676Z · comments (0)

Deception and Jailbreak Sequence: 1. Iterative Refinement Stages of Deception in LLMs
Winnie Yang (winnie-yang) · 2024-08-22T07:32:07.600Z · comments (0)

Monthly Roundup #21: August 2024
Zvi · 2024-08-20T00:20:08.178Z · comments (6)

Can We Predict Persuasiveness Better Than Anthropic?
Lennart Finke (l-f) · 2024-08-04T14:05:33.668Z · comments (5)

[link] The Tech Industry is the Biggest Blocker to Meaningful AI Safety Regulations
garrison · 2024-08-16T19:37:28.416Z · comments (1)

[link] Hyperpolation
Gunnar_Zarncke · 2024-09-15T21:37:00.002Z · comments (6)

Alignment by default: the simulation hypothesis
gb (ghb) · 2024-09-25T16:26:00.552Z · comments (39)

[link] An ML paper on data stealing provides a construction for "gradient hacking"
David Scott Krueger (formerly: capybaralet) (capybaralet) · 2024-07-30T21:44:37.310Z · comments (1)

AXRP Episode 35 - Peter Hase on LLM Beliefs and Easy-to-Hard Generalization
DanielFilan · 2024-08-24T22:30:02.039Z · comments (0)

AXRP Episode 37 - Jaime Sevilla on Forecasting AI
DanielFilan · 2024-10-04T21:00:03.077Z · comments (3)

[link] To CoT or not to CoT? Chain-of-thought helps mainly on math and symbolic reasoning
Bogdan Ionut Cirstea (bogdan-ionut-cirstea) · 2024-09-19T16:13:55.835Z · comments (1)

Launching Adjacent News
Lucas Kohorst (lucas-kohorst) · 2024-10-16T17:58:10.289Z · comments (0)

[LDSL#5] Comparison and magnitude/diminishment
tailcalled · 2024-08-12T18:47:20.546Z · comments (0)

[link] The Great Organism Theory of Evolution
rogersbacon · 2024-08-10T12:26:02.434Z · comments (0)

Improving Model-Written Evals for AI Safety Benchmarking
Sunishchal Dev (sunishchal-dev) · 2024-10-15T18:25:08.179Z · comments (0)

Ransomware Payments Should Require a Sin Tax
Brian Bien (brian-bien) · 2024-07-22T21:16:29.029Z · comments (10)

My decomposition of the alignment problem
Daniel C (harper-owen) · 2024-09-02T00:21:08.359Z · comments (22)

[link] Four Randomized Control Trials In Economics
Maxwell Tabarrok (maxwell-tabarrok) · 2024-08-08T15:59:23.250Z · comments (1)

[link] Compression Moves for Prediction
adamShimi · 2024-09-14T17:51:12.004Z · comments (0)

[link] Anthropic is being sued for copying books to train Claude
Remmelt (remmelt-ellen) · 2024-08-31T02:57:27.092Z · comments (4)

How Often Does Taking Away Options Help?
niplav · 2024-09-21T21:52:40.822Z · comments (6)

Musings on Text Data Wall (Oct 2024)
Vladimir_Nesov · 2024-10-05T19:00:21.286Z · comments (2)

[link] Green and golden: a meditation
Richard_Ngo (ricraz) · 2024-08-18T01:36:43.613Z · comments (0)

The Bar for Contributing to AI Safety is Lower than You Think
Chris_Leong · 2024-08-16T15:20:19.055Z · comments (1)

Simon DeDeo on Explore vs Exploit in Science
Elizabeth (pktechgirl) · 2024-09-10T03:40:08.311Z · comments (0)

[link] How to choose what to work on
jasoncrawford · 2024-09-18T20:39:12.316Z · comments (6)

Gell-Mann checks
Cleo Scrolls (cleo-scrolls) · 2024-09-26T22:45:43.569Z · comments (7)

[link] [Linkpost] 'The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery'
Bogdan Ionut Cirstea (bogdan-ionut-cirstea) · 2024-08-15T21:32:59.979Z · comments (1)

[link] Does natural selection favor AIs over humans?
cdkg · 2024-10-03T18:47:43.517Z · comments (1)

[link] AI Model Registries: A Foundational Tool for AI Governance
Elliot Mckernon (elliot) · 2024-10-07T19:27:43.466Z · comments (1)

A necessary Membrane formalism feature
ThomasCederborg · 2024-09-10T21:33:09.508Z · comments (6)

D&D Sci Coliseum: Arena of Data
aphyer · 2024-10-18T22:02:54.305Z · comments (2)

Why I'm bearish on mechanistic interpretability: the shards are not in the network
tailcalled · 2024-09-13T17:09:25.407Z · comments (40)

Why Reflective Stability is Important
Johannes C. Mayer (johannes-c-mayer) · 2024-09-05T15:28:19.913Z · comments (2)

What program structures enable efficient induction?
Daniel C (harper-owen) · 2024-09-05T10:12:14.058Z · comments (4)

Announcing the PIBBSS Symposium '24!
DusanDNesic · 2024-09-03T11:19:47.568Z · comments (0)

Looking for Goal Representations in an RL Agent - Update Post
CatGoddess · 2024-08-28T16:42:19.367Z · comments (0)

[question] What are the best resources for building gears-level models of how governments actually work?
adamShimi · 2024-08-19T14:05:02.590Z · answers+comments (6)

[question] What should we do about COVID in 2024?
ChristianKl · 2024-08-04T10:57:24.140Z · answers+comments (2)

Tokenized SAEs: Infusing per-token biases.
tdooms · 2024-08-04T09:17:46.755Z · comments (20)

Scaling Laws and Likely Limits to AI
Davidmanheim · 2024-08-18T17:19:46.597Z · comments (0)

[link] To Be Born in a Bag
Niko_McCarty (niko-2) · 2024-10-06T17:21:00.605Z · comments (1)

Recent comments

niplav on yams's Shortform

Grants to Redwood Research, SERI MATS, NYU alignment group under Sam Bowman for scalable supervision, Palisade research, and many dozens more, most of which seem net positive wrt TAI risk.

martin-randall on The Hidden Complexity of Wishes

I think it's important to be able to make a narrow point about outer alignment without needing to defend a broader thesis about the entire alignment problem.

Indeed. For it is written:

A mind that ever wishes to learn anything complicated, must learn to cultivate an interest in which particular exact argument steps are valid, apart from whether you yet agree or disagree with the final conclusion, because only in this way can you sort through all the arguments and finally sum them.

For more on this topic see "Local Validity as a Key to Sanity and Civilization."

abstractapplic on D&D Sci Coliseum: Arena of Data

I took an analytic approach and picked some reasonable choices based on that. I'll almost certainly try throwing ML at this problem at some point but for now I want to note down what a me-who-can't-use-XGBoost would do.

Findings:

There are at least some fingerprintable gladiators who keep gladiating, and who need to be Accounted For (the presence of such people makes all archetypery suspect: are Dwarven Knights really that Good, or are there just a handful of super-prolific Dwarven Knights who give everyone an unfairly good impression?). This includes a Level 7 Elven Ninja, almost certainly Cadagal's Champion, who inexplicably insists on always wearing black (even though it doesn't seem to make a difference to how well ninjas ninj).

Level 4 Boots and Level 4 Gauntlets are super rare in the dataset. The Gauntlets are always worn by a pair of hypercompetent Level 7 Dwarven Monks; the Boots are always worn by the Level 7 Elven Ninja.

Despite this, Cadagal's Champion is facing us with Level 2 Boots.

We have some Level 4 Boots.

. . . we robbed this guy, didn't we? And if we wear the boots - our most powerful equipment - he'll flip out and set his House against us whether we win or lose? Dammit . . .

Who fights whom?

A is a Human Warrior. Warriors lose to Fencers, Humans lose to Fencers, Humans lose to Elves. We have an Elven Fencer on call; send Y.

B is a Human Knight. Rangers are best vs Knights, so send W. (Not super confident in this one)

C is an Elven Ninja. Ninjas are super weak against Knights. Send Z, the Elven Knight. (Slightly concerned by how underrepresented Elves are in the sample of gladiators who managed to beat this guy but I'm assuming that's either noise or an effect which Z will be able to shrug off with the Power of Friendship and/or Urgency)

D is a Dwarven Monk. Monks are weak to Ninjas; send U.

Who wears what?

I haven't managed to figure out how equipment works beyond "higher number good"; if there's specific synergies with/against specific classes/races/whatever they elude me. For that reason:

Y and Z are my best shots. I'll have them both wear what their opponents are wearing, to reduce the effects of uncertainty and turn those fights into "who wore it better?" contests. (So +3 Boots and +1 Gauntlets for Y, +2 Boots and +3 Gauntlets for Z.)

U vs D looks pretty solid so I'll give him the remaining +2 Gauntlets and +1 Boots.

W vs B is my most tenuous guess; I hope she won't hold a grudge after I send her out unequipped to boost everyone else's chances.
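
A minimal Python sketch of the picks above, assuming only the class counters named in this comment decide each matchup. The classes for W and U are inferred from the picks rather than stated, the "one item per level per slot" check is an assumption about the inventory, and the names `COUNTERS`, `pick_matchups`, and `no_item_reused` are hypothetical.

```python
# Class-counter rules as stated in the comment: opponent class -> class that beats it.
COUNTERS = {
    "Warrior": "Fencer",
    "Knight": "Ranger",
    "Ninja": "Knight",
    "Monk": "Ninja",
}

# Opponent classes as given in the comment.
OPPONENTS = {"A": "Warrior", "B": "Knight", "C": "Ninja", "D": "Monk"}

# Our gladiators. Y (Fencer) and Z (Knight) are stated outright;
# W (Ranger) and U (Ninja) are inferred from the picks (assumption).
ROSTER = {"Y": "Fencer", "Z": "Knight", "W": "Ranger", "U": "Ninja"}

# Equipment levels as (boots, gauntlets); 0 means unequipped.
EQUIPMENT = {"Y": (3, 1), "Z": (2, 3), "U": (1, 2), "W": (0, 0)}


def pick_matchups(opponents, roster, counters):
    """For each opponent, pick the roster member whose class counters theirs."""
    by_class = {cls: name for name, cls in roster.items()}
    return {opp: by_class[counters[cls]] for opp, cls in opponents.items()}


def no_item_reused(equipment):
    """Check that no (slot, level) item is handed to two gladiators.

    Assumes the inventory holds at most one item per level per slot.
    """
    for slot in range(2):
        levels = [gear[slot] for gear in equipment.values() if gear[slot] > 0]
        if len(levels) != len(set(levels)):
            return False
    return True


matchups = pick_matchups(OPPONENTS, ROSTER, COUNTERS)
print(matchups)
print(no_item_reused(EQUIPMENT))
```

This only restates the reasoning as a lookup; it does not model race effects, gladiator levels, or equipment synergies, none of which the comment claims to have worked out.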

kareempforbes on Why I’m not a Bayesian

Your article is a great read!

In my view, we can categorize scientists into two broad types: technician scientists, who focus on refining and perfecting existing theories, and creative scientists, who make generational leaps forward with groundbreaking ideas. No theory is ever 100% correct—each is simply an attempt to better explain a phenomenon in a way that’s useful to us.

Take Newton, for example. His theory of gravity was revolutionary, introducing concepts no one had thought of before—it was a generational achievement. But then Einstein came along, asking why objects with mass attract one another. Newton's equations could predict gravitational forces accurately, but they didn’t explain the underlying cause. Einstein’s theory of relativity made a creative leap, adding a new dimension by introducing space-time as part of the explanation. It provided a more accurate theoretical representation of gravity and broadened our understanding of mass and space-time, marking yet another generational leap.

Then we have Oppenheimer, who refined existing theories in physics to develop the atomic bomb. While his work was groundbreaking, it was more a refinement of known principles than a creative leap like Einstein’s. I would classify Oppenheimer as more of a "technician scientist" than a "creative" one, although I defer to experts in physics, as it's not my field.

Regarding your article, I really enjoyed it. On the topic of the "Tiger in my house" question, the answer depends on how we define "tiger." If you mean a living, biological tiger, the answer is clearly no. But if you mean any type of tiger, such as a toy tiger, then the answer could be yes. This same creative reasoning can apply to the question about water in the fridge. Even if there’s no glass of water, the air in the fridge likely contains humidity, meaning there’s always some amount of water present.

Lastly, regarding the diagram with lines, curves, and points: one creative way to introduce ambiguity into this solidly two-dimensional example is by adding more dimensions. In a strictly two-dimensional projection, the relationships between points and lines as stated hold true. However, if a third spatial dimension or a fourth dimension (such as time) is introduced, the axioms that govern the two-dimensional model might no longer apply. For example, the rule that exactly one line passes through two points could be violated in a three-dimensional space, where two points might lie on different planes. Therefore, while the statements about the model are accurate within the context of two-dimensional space, they may no longer hold when additional dimensions are considered.

bhauth on Start an Upper-Room UV Installation Company?

I'm not convinced that far-UVC is safe enough around humans to be a good idea. It's strongly absorbed by proteins so it doesn't penetrate much, but:

It can make reactive compounds from organic compounds in air.
It can produce ozone, depending on the light. (That's why mercury vapor lamps block the 185nm emission.)
It could potentially make toxic compounds when it's absorbed by proteins in skin or eyes.
It definitely causes degradation of plastics.

And really, what's the point? Why not just have fans sending air to (cheap) mercury vapor lamps in a contained area where they won't hit people or plastics?

raemon on Struggling like a Shadowmoth

I agree this is how a lot of people execute "become stronger". It's not too surprising if it turns out to be an essential part of the process (at least when implemented in humans).

But, at the very least, there seem to be more and less healthy ways of doing it, and I personally think I see the outlines of how I could operate fairly differently and still have my drive.

jay on Struggling like a Shadowmoth

What if there would never be someone I trusted who could tell me I was Good Enough, that things were in some sense Okay?

The internalized feeling that you're not okay is a huge part of what motivates you to become better. If you lost it, you would be much more likely to become complacent and stagnate. Both inner peace and relentless drive are profoundly valuable, but they are mutually exclusive.

david-gross on How have you become more hard-working?

FWIW: I've added my summary of the answers here to my Notes on Industriousness.

steve-roth on If far-UV is so great, why isn't it everywhere?

I’m stunned that the word “duct” doesn’t appear in this article. UV in ductwork is cheap, very effective, and has no downsides. I’m flummoxed why it isn’t widely employed. Can you help? Thanks.

roko on The ELYSIUM Proposal - Extrapolated voLitions Yielding Separate Individualized Utopias for Mankind

catgirls are consensually participating in a universe that is not optimal for them because they are stuck in the harem of a loser nerd with no other males and no other purpose in life other than being a concubine to Reedspacer

And, the problem with saying "OK let's just ban the creation of catgirls" is that then maybe Reedspacer builds a volcano lair just for himself and plays video games in it, and the catgirls whose existence you prevented are going to scream bloody murder because you took away from them a very good existence that they would have enjoyed and also made Reedspacer sad.