LessWrong 2.0 Reader

Paper Summary: The Effects of Communicating Uncertainty on Public Trust in Facts and Numbers
Jeffrey Heninger (jeffrey-heninger) · 2024-07-09T16:50:05.776Z · comments (2)

US Presidential Election: Tractability, Importance, and Urgency
kuhanj · 2024-05-29T23:52:22.420Z · comments (2)

Proselytizing
lsusr · 2025-02-22T11:54:12.740Z · comments (0)

[link] An Interactive Shapley Value Explainer
James Stephen Brown (james-brown) · 2024-09-28T05:01:21.169Z · comments (9)

Startup Success Rates Are So Low Because the Rewards Are So Large
AppliedDivinityStudies (kohaku-none) · 2024-10-10T20:22:01.557Z · comments (6)

MATS AI Safety Strategy Curriculum v2
DanielFilan · 2024-10-07T22:44:06.396Z · comments (6)

Time Efficient Resistance Training
romeostevensit · 2024-10-07T15:15:44.950Z · comments (10)

Take SCIFs, it’s dangerous to go alone
latterframe · 2024-05-01T08:02:38.067Z · comments (1)

Debate: Get a college degree?
Ben Pace (Benito) · 2024-08-12T22:23:34.744Z · comments (14)

[link] IAPS: Mapping Technical Safety Research at AI Companies
Zach Stein-Perlman · 2024-10-24T20:30:41.159Z · comments (13)

Trust as a bottleneck to growing teams quickly
benkuhn · 2024-07-13T18:00:04.579Z · comments (3)

[link] Soviet comedy film recommendations
Nina Panickssery (NinaR) · 2024-06-09T23:40:58.536Z · comments (11)

AI #89: Trump Card
Zvi · 2024-11-07T16:30:05.684Z · comments (12)

List your AI X-Risk cruxes!
Aryeh Englander (alenglander) · 2024-04-28T18:26:19.327Z · comments (7)

5 ways to improve CoT faithfulness
CBiddulph (caleb-biddulph) · 2024-10-05T20:17:12.637Z · comments (40)

When fine-tuning fails to elicit GPT-3.5's chess abilities
Theodore Chapman · 2024-06-14T18:50:52.855Z · comments (3)

[link] you should probably eat oatmeal sometimes
bhauth · 2024-08-25T14:50:37.570Z · comments (32)

How ARENA course material gets made
CallumMcDougall (TheMcDouglas) · 2024-07-02T18:04:00.209Z · comments (2)

One-shot strategy games?
Raemon · 2024-03-11T00:19:20.480Z · comments (42)

Reflections on the Metastrategies Workshop
gw · 2024-10-24T18:30:46.255Z · comments (5)

What happens next?
Logan Zoellner (logan-zoellner) · 2024-12-29T01:41:33.685Z · comments (19)

[link] Adverse Selection by Life-Saving Charities
vaishnav92 · 2024-08-14T20:46:23.662Z · comments (16)

Sleep, Diet, Exercise and GLP-1 Drugs
Zvi · 2025-01-21T12:20:06.018Z · comments (5)

When Are Results from Computational Complexity Not Too Coarse?
Dalcy (Darcy) · 2024-07-03T19:06:44.953Z · comments (8)

Book review: The Quincunx
cousin_it · 2024-06-05T21:13:55.055Z · comments (12)

[link] We don't want to post again "This might be the last AI Safety Camp"
Remmelt (remmelt-ellen) · 2025-01-21T12:03:33.171Z · comments (17)

[link] Jailbreak steering generalization
Sarah Ball · 2024-06-20T17:25:24.110Z · comments (4)

A Teacher vs. Everyone Else
ronak69 · 2024-03-21T17:45:35.714Z · comments (8)

Causal Undertow: A Work of Seed Fiction
Daniel Murfet (dmurfet) · 2024-12-08T21:41:48.132Z · comments (0)

[link] Point of Failure: Semiconductor-Grade Quartz
Annapurna (jorge-velez) · 2024-09-30T15:57:40.495Z · comments (8)

[link] Programming Refusal with Conditional Activation Steering
Bruce W. Lee (bruce-lee) · 2024-09-11T20:57:08.714Z · comments (0)

Are we dropping the ball on Recommendation AIs?
Charbel-Raphaël (charbel-raphael-segerie) · 2024-10-23T17:48:00.000Z · comments (17)

[link] What's important in "AI for epistemics"?
Lukas Finnveden (Lanrian) · 2024-08-24T01:27:06.771Z · comments (0)

Interpretability as Compression: Reconsidering SAE Explanations of Neural Activations with MDL-SAEs
Kola Ayonrinde (kola-ayonrinde) · 2024-08-23T18:52:31.019Z · comments (5)

(Approximately) Deterministic Natural Latents
johnswentworth · 2024-07-19T23:02:12.306Z · comments (0)

[link] Beyond the Board: Exploring AI Robustness Through Go
AdamGleave · 2024-06-19T16:40:06.594Z · comments (2)

[link] A car journey with conservative evangelicals - Understanding some British political-religious beliefs
Nathan Young · 2024-12-06T11:22:45.563Z · comments (8)

Worries about latent reasoning in LLMs
CBiddulph (caleb-biddulph) · 2025-01-20T09:09:02.335Z · comments (3)

My January alignment theory Nanowrimo
Dmitry Vaintrob (dmitry-vaintrob) · 2025-01-02T00:07:24.050Z · comments (2)

D&D Sci Coliseum: Arena of Data
aphyer · 2024-10-18T22:02:54.305Z · comments (23)

Superintelligent AI is possible in the 2020s
HunterJay · 2024-08-13T06:03:26.990Z · comments (3)

[link] College technical AI safety hackathon retrospective - Georgia Tech
yix (Yixiong Hao) · 2024-11-15T00:22:53.159Z · comments (2)

Individually incentivized safe Pareto improvements in open-source bargaining
Nicolas Macé (NicolasMace) · 2024-07-17T18:26:43.619Z · comments (2)

Surviving Seveneves
Yair Halberstadt (yair-halberstadt) · 2024-06-19T13:11:55.414Z · comments (4)

[link] Podcast with Yoshua Bengio on Why AI Labs are “Playing Dice with Humanity’s Future”
garrison · 2024-05-10T17:23:20.436Z · comments (0)

Notes on Dwarkesh Patel’s Podcast with Sholto Douglas and Trenton Bricken
Zvi · 2024-04-01T19:10:12.193Z · comments (1)

Why did ChatGPT say that? Prompt engineering and more, with PIZZA.
Jessica Rumbelow (jessica-cooper) · 2024-08-03T12:07:46.302Z · comments (2)

Trying to translate when people talk past each other
Kaj_Sotala · 2024-12-17T09:40:02.640Z · comments (12)

GPT-4o My and Google I/O Day
Zvi · 2024-05-16T17:50:03.040Z · comments (2)

AXRP Episode 39 - Evan Hubinger on Model Organisms of Misalignment
DanielFilan · 2024-12-01T06:00:06.345Z · comments (0)

Recent comments

norimori1992 on Harry Potter and the Methods of Rationality discussion thread, part 3

Canon already acknowledges that it might be detrimental. "Sometimes I think we Sort too early."

willpetillo on The Sorry State of AI X-Risk Advocacy, and Thoughts on Doing Better

If you want to get an informed opinion on how the general public perceives PauseAI, get a t-shirt and hand out some flyers in a high foot-traffic public space. If you want to be formal about it, bring a clipboard, track whatever seems interesting in advance, and share your results. It might not be publishable on an academic forum, but you could do it next week.

Here's what I expect you to find, based on my own experience and the reports of basically everyone who has done this:
- No one likes flyers, but people get a lot more interested if you can catch their attention long enough to say it's about AI.
- Everyone hates AI.
- Your biggest initial skepticism will be from people who think you are in favor of AI.
- Your biggest actual pushback will be from people who think that social change is impossible.
- Roughly 1/4 to 1/2 are amenable to (or have already heard about!) x-risk. Most of the rest won't actively disagree, but you can tell that particular message is not really "landing," and they pay a lot more attention if you talk about something else (unemployment, military applications, deepfakes, etc.).
- Bring a clipboard for signups. Even if recruitment isn't your goal, if you don't have one you'll feel unprepared when people ask about it.

Also, protests are about Overton-window shifting: making AI danger a thing that is acceptable to talk about. And even if it makes a specific org look "fringe" (not a given, as Holly has argued), that isn't necessarily a bad thing for the underlying cause. For example, if I see an XR protest, my thought is (well, was before I knew the underlying methodology): "Ugh, those protestors... I mean, I like what they are fighting for and more really needs to be done, but I don't like the way they go about it." Notice that middle part. Activation of a sympathetic but passive audience was the point. That's a win from their perspective. And the people who are put off by the methods then go on to (be more likely to) join allied organizations that believe the same things but use more moderate tactics. The even bigger win is when the enthusiasm catches the attention of people who want to be involved but are looking for orgs that are the "real deal," as measured by willingness to put effort where their words are.

samuelshadrach on xpostah's Shortform

I agree my point is less important if we get ASI by 2030, compared to if we don’t get ASI.

That being said, the arms race can develop over the timespan of years not decades. 6-year superhumans will prompt people to create the next generation of superhumans, and within 10-15 years we will have children from multiple generations where the younger generation have edits with stronger effect sizes. Once we can see the effects on these multiple generations, people might go at max pace.

thomas-kwa on Thomas Kwa's Shortform

Will we ever have Poké Balls in real life? How fast could they be at storing and retrieving animals? Requirements:

- Made of atoms, no teleportation or fantasy physics.
- Small enough to be easily thrown, say under 5 inches in diameter.
- Must be able to disassemble and reconstruct an animal as large as an elephant in a reasonable amount of time, say 5 minutes, and store its pattern digitally.
- Must reconstruct the animal to enough fidelity that its memories are intact and it's physically identical for most purposes, though maybe not quite to the cellular level.
- No external power source.
- Works basically wherever you throw it, though it might be slower to print the animal if it only has air to use as feedstock mass or can't spread out to dissipate heat.
- Should not destroy nearby buildings when used.
- Animals must feel no pain during the process.

It feels pretty likely to me that we'll be able to print complex animals eventually using nanotech/biotech, but the speed requirements here might be pushing the limits of what's possible. In particular heat dissipation seems like a huge challenge; assuming that 0.2 kcal/g of waste heat is created while assembling the elephant, which is well below what elephants need to build their tissues, you would need to dissipate about 5 GJ of heat, which would take even a full-sized nuclear power plant cooling tower a few seconds. Power might be another challenge. Drexler claims you can eat fuel and oxidizer, turn all the mass into basically any lower-energy state, and come out easily net positive on energy. But if there is none available you would need a nuclear reactor.
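
The heat estimate above can be checked with a few lines of arithmetic. This is only a rough sketch: the 0.2 kcal/g figure and the 5-minute window come from the comment, while the 6,000 kg elephant mass and the roughly 2 GW heat-rejection rate for a large cooling tower are assumed round numbers, not values stated in the original.

```python
KCAL_TO_J = 4184.0  # joules per kilocalorie


def waste_heat_joules(mass_kg: float, waste_kcal_per_g: float = 0.2) -> float:
    """Total waste heat for assembling an animal of the given mass."""
    grams = mass_kg * 1000.0
    return grams * waste_kcal_per_g * KCAL_TO_J


def seconds_to_dissipate(energy_j: float, rejection_watts: float) -> float:
    """Time needed to shed energy_j at a constant heat-rejection rate."""
    return energy_j / rejection_watts


# Assumption: a large elephant of about 6,000 kg (not stated in the comment).
ELEPHANT_MASS_KG = 6000.0
# Assumption: a big cooling tower rejects on the order of 2 GW of heat.
COOLING_TOWER_WATTS = 2.0e9
# From the comment: the whole process should take about 5 minutes.
ASSEMBLY_SECONDS = 5 * 60

heat_j = waste_heat_joules(ELEPHANT_MASS_KG)
tower_seconds = seconds_to_dissipate(heat_j, COOLING_TOWER_WATTS)
average_watts = heat_j / ASSEMBLY_SECONDS

print(f"waste heat: {heat_j / 1e9:.2f} GJ")
print(f"cooling tower time: {tower_seconds:.1f} s")
print(f"average power over 5 minutes: {average_watts / 1e6:.1f} MW")
```

Under these assumptions the total comes out to about 5 GJ, matching the comment, and spreading it over the 5-minute window still means shedding heat at roughly 17 MW from a ball under 5 inches across.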

lsusr on lsusr's Shortform

Magicians don't pick locks on stage.

The locks are all fake. The test of skill is convincing you they're real.

Except when the locks are totally real and the magician just bypasses them.

yehuda-rimon on Name for Standard AI Caveat?

Generally it's the former, or someone who is faintly AI-aware but not so interested in delving into the consequences. However, I'd like to represent my true opinions, which involve significant AI-driven disruption, hence the need for a caveat.

cubefox on xpostah's Shortform

Standard objection: Genetic engineering takes a long time to have any effect. A baby doesn't develop into an adult overnight. So it will almost certainly not matter relative to the rapid pace of AI development.

anthonyc on Name for Standard AI Caveat?

I'm curious as to the viewpoint of the other party in these conversations? If they're not aware of/interested in/likely to be thinking about the disruptive effects of AI, then I would usually just omit mentioning it. You know you're conditioning on that caveat, and their thinking does so without them realizing it.

If the other party is more AI-aware, and they know you are as well, you can maybe just keep it simple, something like, "assuming enough normality for this to matter."

mo-putera on Evaluating “What 2026 Looks Like” So Far

I'm curious now, given how accurate your forecasts have turned out and maybe taking into account Jonny's remark that "the predictions are (to my eye) under-optimistic on capabilities", what are the most substantive changes you'd make to your 2025-26 predictions?

cubefox on Benito's Shortform Feed

The highlights are officially called "text fragments" and the syntax is described here: https://developer.mozilla.org/en-US/docs/Web/URI/Reference/Fragment/Text_fragments
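
As a rough illustration of the syntax that page describes, a text fragment is appended to a URL as `#:~:text=` followed by the percent-encoded start text and, optionally, a comma and the end text. The helper below is a minimal sketch, not an official API: the function names and the `example.com` URL are hypothetical, and it assumes the base URL has no existing fragment.

```python
from typing import Optional
from urllib.parse import quote


def _encode(part: str) -> str:
    # Percent-encode reserved characters, then also encode "-", which the
    # text directive reserves for marking an optional prefix or suffix.
    return quote(part, safe="").replace("-", "%2D")


def text_fragment_url(base_url: str, text_start: str,
                      text_end: Optional[str] = None) -> str:
    """Build a link that highlights text_start, or a range up to text_end.

    Assumes base_url does not already contain a "#" fragment.
    """
    directive = _encode(text_start)
    if text_end is not None:
        directive += "," + _encode(text_end)
    return f"{base_url}#:~:text={directive}"


# Hypothetical page URL, used only for illustration.
example = text_fragment_url("https://example.com/page",
                            "officially called", "text fragments")
print(example)
```

Browsers that support the feature scroll to and highlight the matching text; others ignore the directive and load the page normally.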