LessWrong 2.0 Reader

← previous page (newer posts) · next page (older posts) →

[link] Advice for journalists
Nathan Young · 2024-10-07T16:46:40.929Z · comments (53)

My AGI safety research—2024 review, ’25 plans
Steven Byrnes (steve2152) · 2024-12-31T21:05:19.037Z · comments (4)

You can, in fact, bamboozle an unaligned AI into sparing your life
David Matolcsi (matolcsid) · 2024-09-29T16:59:43.942Z · comments (171)

MIRI’s 2024 End-of-Year Update
Rob Bensinger (RobbBB) · 2024-12-03T04:33:47.499Z · comments (2)

The nihilism of NeurIPS
charlieoneill (kingchucky211) · 2024-12-20T23:58:11.858Z · comments (7)

Bigger Livers?
sarahconstantin · 2024-11-08T21:50:09.814Z · comments (13)

The "Think It Faster" Exercise
Raemon · 2024-12-11T19:14:10.427Z · comments (13)

[link] Seven lessons I didn't learn from election day
Eric Neyman (UnexpectedValues) · 2024-11-14T18:39:07.053Z · comments (33)

[question] What are the strongest arguments for very short timelines?
Kaj_Sotala · 2024-12-23T09:38:56.905Z · answers+comments (74)

The case for unlearning that removes information from LLM weights
Fabien Roger (Fabien) · 2024-10-14T14:08:04.775Z · comments (15)

[link] Anthropic: Three Sketches of ASL-4 Safety Case Components
Zach Stein-Perlman · 2024-11-06T16:00:06.940Z · comments (33)

Deep Causal Transcoding: A Framework for Mechanistically Eliciting Latent Behaviors in Language Models
Andrew Mack (andrew-mack) · 2024-12-03T21:19:42.333Z · comments (7)

A breakdown of AI capability levels focused on AI R&D labor acceleration
ryan_greenblatt · 2024-12-22T20:56:00.298Z · comments (5)

[link] Finishing The SB-1047 Documentary In 6 Weeks
Michaël Trazzi (mtrazzi) · 2024-10-28T20:17:47.465Z · comments (5)

[link] Sabotage Evaluations for Frontier Models
David Duvenaud (david-duvenaud) · 2024-10-18T22:33:14.320Z · comments (55)

2024 Petrov Day Retrospective
Ben Pace (Benito) · 2024-09-28T21:30:14.952Z · comments (25)

Reasons for and against working on technical AI safety at a frontier AI lab
bilalchughtai (beelal) · 2025-01-05T14:49:53.529Z · comments (12)

Catastrophic sabotage as a major threat model for human-level AI systems
evhub · 2024-10-22T20:57:11.395Z · comments (11)

Introducing Squiggle AI
ozziegooen · 2025-01-03T17:53:42.915Z · comments (15)

Science advances one funeral at a time
Cameron Berg (cameron-berg) · 2024-11-01T23:06:19.381Z · comments (9)

LLMs Look Increasingly Like General Reasoners
eggsyntax · 2024-11-08T23:47:28.886Z · comments (45)

Zvi’s Thoughts on His 2nd Round of SFF
Zvi · 2024-11-20T13:40:08.092Z · comments (2)

[link] The Intelligence Curse
lukedrago · 2025-01-03T19:07:43.493Z · comments (26)

Anvil Problems
Screwtape · 2024-11-13T22:57:41.974Z · comments (13)

A very strange probability paradox
notfnofn · 2024-11-22T14:01:36.587Z · comments (26)

Comment on "Death and the Gorgon"
Zack_M_Davis · 2025-01-01T05:47:30.730Z · comments (31)

AIs Will Increasingly Fake Alignment
Zvi · 2024-12-24T13:00:07.770Z · comments (0)

The subset parity learning problem: much more than you wanted to know
Dmitry Vaintrob (dmitry-vaintrob) · 2025-01-03T09:13:59.245Z · comments (18)

[link] Should you be worried about H5N1?
gw · 2024-12-05T21:11:06.996Z · comments (2)

Three Notions of "Power"
johnswentworth · 2024-10-30T06:10:08.326Z · comments (44)

[link] Self-Help Corner: Loop Detection
adamShimi · 2024-10-02T08:33:23.487Z · comments (6)

(Salt) Water Gargling as an Antiviral
Elizabeth (pktechgirl) · 2024-11-22T18:00:02.765Z · comments (6)

Parable of the vanilla ice cream curse (and how it would prevent a car from starting!)
Mati_Roy (MathieuRoy) · 2024-12-08T06:57:45.783Z · comments (21)

Research update: Towards a Law of Iterated Expectations for Heuristic Estimators
Eric Neyman (UnexpectedValues) · 2024-10-07T19:29:29.033Z · comments (2)

The purposeful drunkard
Dmitry Vaintrob (dmitry-vaintrob) · 2025-01-12T12:27:51.952Z · comments (8)

Circling as practice for “just be yourself”
Kaj_Sotala · 2024-12-16T07:40:04.482Z · comments (5)

There is a globe in your LLM
jacob_drori (jacobcd52) · 2024-10-08T00:43:40.300Z · comments (4)

5 homegrown EA projects, seeking small donors
Austin Chen (austin-chen) · 2024-10-28T23:24:25.745Z · comments (4)

Self-prediction acts as an emergent regularizer
Cameron Berg (cameron-berg) · 2024-10-23T22:27:03.664Z · comments (5)

Tips On Empirical Research Slides
James Chua (james-chua) · 2025-01-08T05:06:44.942Z · comments (4)

Is "VNM-agent" one of several options, for what minds can grow up into?
AnnaSalamon · 2024-12-30T06:36:20.890Z · comments (53)

[link] Is Deep Learning Actually Hitting a Wall? Evaluating Ilya Sutskever's Recent Claims
garrison · 2024-11-13T17:00:01.005Z · comments (14)

Newsom Vetoes SB 1047
Zvi · 2024-10-01T12:20:06.127Z · comments (6)

JargonBot Beta Test
Raemon · 2024-11-01T01:05:26.552Z · comments (55)

[link] On Eating the Sun
jessicata (jessica.liu.taylor) · 2025-01-08T04:57:20.457Z · comments (90)

Some arguments against a land value tax
Matthew Barnett (matthew-barnett) · 2024-12-29T15:17:00.740Z · comments (39)

Remap your caps lock key
bilalchughtai (beelal) · 2024-12-15T14:03:33.623Z · comments (17)

Values Are Real Like Harry Potter
johnswentworth · 2024-10-09T23:42:24.724Z · comments (21)

[question] What are the good rationality films?
Ben Pace (Benito) · 2024-11-20T06:04:56.757Z · answers+comments (53)

AI #92: Behind the Curve
Zvi · 2024-11-28T14:40:05.448Z · comments (7)

Archive

Recent comments

mikhail-samin on No one has the ball on 1500 Russian olympiad winners who've received HPMOR

huh?

I would want people who might meaningfully contribute to solving what's probably the most important problem humanity has ever faced to learn about it and, if they judge they want to work on it, to be enabled to work on it. I think it'd be a good use of resources to make capable people learn about the problem and show them they can help with it. Why does it scream "cult tactic" to you?

niplav on How do you deal w/ Super Stimuli?

Solution: No internet at home. Gave me back ~3hr per day when nothing else worked.

Wikipedia can be downloaded via kiwix (~90GB for English WP with images), programming documentation with zeal & devdocs.io. No great solution for LLMs: all the ones I can run on my laptop are not good enough (maybe I should bite the bullet and get a GPU that can run one of the big LLaMas locally). I have internet at my workplace, and a local library with internet 10 minutes away by walk that closes at 7PM. No mobile internet either.

sil-ver on Why it's so hard to talk about Consciousness

(Self-Review.)

I still endorse every claim in this post. The one thing I keep wondering is whether I should have used real examples from discussion threads on LessWrong to illustrate the application of the two camp model, rather than making up a fictional discussion as I did in the post. I think that would probably help, but it would require singling out someone and using them as a negative example, which I don't want to do. I'm still reading every new post and comment section about consciousness and often link to this post when I see something that looks like miscommunication to me; I think that works reasonably well.

However, I did streamline the second half of the post (took out the part about modeling the brain as a graph, I don't think that was necessary to make the point about research) and added a new section about terminology. I think that should make it a little easier to diagnose when the model is relevant in real discussions.

cody-rushing on How do you deal w/ Super Stimuli?

This might not work well for others, but a thing that's worked well for me has been to (basically) block cheap access to it with anticharities. Introducing friction in general is good.

notfnofn on How do you deal w/ Super Stimuli?

There have been a lot of tricks I've used over the years, some of which I'm still using now, but many of which require some level of discipline. One requires basically none, has a huge upside (to me), and has been trivial for me to maintain for years: a "newsfeed eradicator" extension. I've never had the temptation to turn it off unless it really messes with the functionality of a website.

It basically turns off the "front page" of whatever website you apply it to (e.g. reddit/twitter/youtube/facebook) so that you don't see anything when you enter the site and have to actually search for whatever you're interested in. And for youtube, you never see suggestions to the right of or at the end of a video.

nim on Don’t Legalize Drugs

Alcohol is also a drug. If Dalrymple really means "drugs" when he says "drugs", it would follow that he's advocating for prohibition to protect alcoholics from themselves.

We seem to have found a relatively tolerable equilibrium around alcohol where the substance is widely available, the majority of individuals who can enjoy it recreationally are free to do so, and yet it's legally just as intolerable for an intoxicated person to harm others as it would be for a sober person to take the same actions. Some individuals have addiction problems, and we have varyingly effective programs in place to help them deal with that, but ultimately the right of the majority to enjoy it responsibly (and the rights of the businesses to sell it to those who can use it responsibly) trump the "rights" of the minority to be protected from themselves by the government.

Maybe to get the same equilibrium around other drugs, we would need harsher punishments for the antisocial behaviors that we're actually trying to prevent by banning the drugs themselves. All I know is that anyone who unironically makes "ban the intoxicants" claims without considering what we can learn from our most widely accepted and normalized intoxicants is speaking on some level other than the literal and logical.

jacques-thibodeau on Building AI Research Fleets

Agreed, but I will find a way.

mitchell_porter on Why do futurists care about the culture war?

Regarding Musk and Thiel, foremost they are billionaire capitalists, individuals who built enormous business empires. Even if we assume your thinking about the future is correct, we shouldn't assume that they have reproduced every step of it. You may simply be more advanced in your thinking about the future than they are. Their thought about the future crystallized in the 1980s, when they were young. Since then they have been preoccupied with building their empires.

This raises the question, how do they see the future, and their relationship to it? I think Musk's life purpose is the colonization of Mars, so that humanity's fate isn't tied to what happens on Earth. Everything else is subordinate to that, and even robots and AI are just servants and companions for humanity in its quest for other worlds. As for Thiel, I have less sense of the gestalt of his business activities, but philosophically, the culture war seems very important to him. He may have a European sense of how self-absorbed cultural elites can narrow a nation's horizons, which drives his sponsorship of "heterodox" intellectuals outside the academy.

If I'm right, the core of Musk's futurism is space colonization, and the core of Thiel's futurism is preserving an open society. They don't have the idea of an intelligence singularity whose outcome determines everything afterwards. In this regard, they're closer to e/acc than singularity thinking, because e/acc believes in a future that always remains open, uncertain, and pluralist, whereas singularity thinking tends towards a single apocalyptic moment in which superintelligence is achieved and irreversibly shapes the world.

There are other reasons I can see why they would involve themselves in the culture war. They don't want a socialism that would interfere with their empires; they think (or may have thought until the last few years) that superintelligence is decades away; they see their culture war opponents as a threat to a free future (whether that is seen in e/acc or singularity terms), or even to the very existence of any kind of technological future society.

But if I were to reduce it to one thing: they don't believe in models of the future according to which you get one thing right and then utopia follows, and they believe such thinking actually leads to totalitarian outcomes (where their definition of totalitarian may be a techno-political order capable of preventing the building of a personal empire). Musk started OpenAI so Google wouldn't be the sole AI superpower; he was worried about centralization as such, not about whether they would get the value system right. Thiel gave up on MIRI's version of AI futurology years ago as a salvationist cult; I think he would actually prefer no AI to aligned AI, if the latter means alignment with a particular value system rather than alignment with what the user wants.

habryka4 on Habryka's Shortform Feed

Oops, you're right, fixed. That was just an accident.

christiankl on Fabien's Shortform

When it comes to blind spots, we do have areas like medicine where we don't pay based on the outcomes of medical treatment. That leads to silly things: when surgeons say that having 4k monitors will obviously improve the way they do surgery because it allows them to see details that they otherwise wouldn't, the monitors still don't get adopted as long as no one has run a clinical trial that shows 4k monitors to be superior.

Evidence-based medicine is a strong dogma that prevents market economies from making the medical provider that creates the best outcomes win.