LessWrong 2.0 Reader

Problem relaxation as a tactic
TurnTrout · 2020-04-22T23:44:42.398Z · comments (8)
Late 2021 MIRI Conversations: AMA / Discussion
Rob Bensinger (RobbBB) · 2022-02-28T20:03:05.318Z · comments (199)
Future ML Systems Will Be Qualitatively Different
jsteinhardt · 2022-01-11T19:50:11.377Z · comments (10)
Delta Strain: Fact Dump and Some Policy Takeaways
Connor_Flexman · 2021-07-28T03:38:34.455Z · comments (60)
Christiano, Cotra, and Yudkowsky on AI progress
Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2021-11-25T16:45:32.482Z · comments (95)
AI catastrophes and rogue deployments
Buck · 2024-06-03T17:04:51.206Z · comments (16)
FHI paper published in Science: interventions against COVID-19
SoerenMind · 2020-12-16T21:19:00.441Z · comments (0)
Perpetual Dickensian Poverty?
jefftk (jkaufman) · 2021-12-21T13:30:03.543Z · comments (18)
I Would Have Solved Alignment, But I Was Worried That Would Advance Timelines
307th · 2023-10-20T16:37:46.541Z · comments (33)
A bird's eye view of ARC's research
Jacob_Hilton · 2024-10-23T15:50:06.123Z · comments (12)
Solving the Mechanistic Interpretability challenges: EIS VII Challenge 1
StefanHex (Stefan42) · 2023-05-09T19:41:10.528Z · comments (1)
RTFB: On the New Proposed CAIP AI Bill
Zvi · 2024-04-10T18:30:08.410Z · comments (14)
[link] The Dangers of Mirrored Life
Niko_McCarty (niko-2) · 2024-12-12T20:58:32.750Z · comments (7)
Revealing Intentionality In Language Models Through AdaVAE Guided Sampling
jdp · 2023-10-20T07:32:28.749Z · comments (15)
Unwitting cult leaders
Kaj_Sotala · 2021-02-11T11:10:04.504Z · comments (9)
Building AI Research Fleets
Ben Goldhaber (bgold) · 2025-01-12T18:23:09.682Z · comments (8)
AI #14: A Very Good Sentence
Zvi · 2023-06-01T21:30:04.548Z · comments (30)
[question] Why The Focus on Expected Utility Maximisers?
DragonGod · 2022-12-27T15:49:36.536Z · answers+comments (84)
Scissors Statements for President?
AnnaSalamon · 2024-11-06T10:38:21.230Z · comments (32)
The Standard Analogy
Zack_M_Davis · 2024-06-03T17:15:42.327Z · comments (28)
[question] Which skincare products are evidence-based?
Vanessa Kosoy (vanessa-kosoy) · 2024-05-02T15:22:12.597Z · answers+comments (48)
Unifying Bargaining Notions (2/2)
Diffractor · 2022-07-27T03:40:30.524Z · comments (19)
AGI and the EMH: markets are not expecting aligned or unaligned AI in the next 30 years
basil.halperin (bhalperin) · 2023-01-10T16:06:52.329Z · comments (44)
Full-time AGI Safety!
Steven Byrnes (steve2152) · 2021-03-01T12:42:14.813Z · comments (3)
[link] Miles Brundage resigned from OpenAI, and his AGI readiness team was disbanded
garrison · 2024-10-23T23:40:57.180Z · comments (1)
AI Alignment Metastrategy
Vanessa Kosoy (vanessa-kosoy) · 2023-12-31T12:06:11.433Z · comments (13)
The case against AI alignment
andrew sauer (andrew-sauer) · 2022-12-24T06:57:53.405Z · comments (110)
Narrative Syncing
AnnaSalamon · 2022-05-01T01:48:45.889Z · comments (48)
Parable of the Dammed
johnswentworth · 2020-12-10T00:08:44.493Z · comments (29)
Goodhart's Law inside the human mind
Kaj_Sotala · 2023-04-17T13:48:13.183Z · comments (13)
Mental health benefits and downsides of psychedelic use in ACX readers: survey results
RationalElf · 2021-10-25T22:55:09.522Z · comments (18)
[link] Manifold: If okay AGI, why?
Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2023-03-25T22:43:53.820Z · comments (37)
[question] How do we prepare for final crunch time?
Eli Tyre (elityre) · 2021-03-30T05:47:54.654Z · answers+comments (30)
Efficient Dictionary Learning with Switch Sparse Autoencoders
Anish Mudide (anish-mudide) · 2024-07-22T18:45:53.502Z · comments (19)
The Dream Machine
sarahconstantin · 2024-12-05T00:00:05.796Z · comments (6)
[question] How Hard Would It Be To Make A COVID Vaccine For Oneself?
johnswentworth · 2020-12-21T16:19:10.415Z · answers+comments (29)
“Reframing Superintelligence” + LLMs + 4 years
Eric Drexler · 2023-07-10T13:42:09.739Z · comments (9)
[link] Popular education in Sweden: much more than you wanted to know
Henrik Karlsson (henrik-karlsson) · 2022-05-17T20:07:50.318Z · comments (3)
A List of 45+ Mech Interp Project Ideas from Apollo Research’s Interpretability Team
Lee Sharkey (Lee_Sharkey) · 2024-07-18T14:15:50.248Z · comments (18)
[link] Scott Aaronson is joining OpenAI to work on AI safety
peterbarnett · 2022-06-18T04:06:55.465Z · comments (31)
GPT-3 Catching Fish in Morse Code
Megan Kinniment (megan-kinniment) · 2022-06-30T21:22:49.054Z · comments (27)
8 examples informing my pessimism on uploading without reverse engineering
Steven Byrnes (steve2152) · 2023-11-03T20:03:50.450Z · comments (12)
We have achieved Noob Gains in AI
phdead · 2022-05-18T20:56:49.143Z · comments (20)
unRLHF - Efficiently undoing LLM safeguards
Pranav Gade (pranav-gade) · 2023-10-12T19:58:08.811Z · comments (15)
Situating LessWrong in contemporary philosophy: An interview with Jon Livengood
Suspended Reason (suspended-reason) · 2020-07-01T00:37:00.695Z · comments (21)
Experiences raising children in shared housing
juliawise · 2021-12-21T17:09:05.008Z · comments (4)
Why I'm joining Anthropic
evhub · 2023-01-05T01:12:13.822Z · comments (4)
Honoring Petrov Day on LessWrong, in 2020
Ben Pace (Benito) · 2020-09-26T08:01:36.838Z · comments (100)
[question] What are your greatest one-shot life improvements?
Mark Xu (mark-xu) · 2020-05-16T16:53:40.608Z · answers+comments (171)
Soft optimization makes the value target bigger
Jeremy Gillen (jeremy-gillen) · 2023-01-02T16:06:50.229Z · comments (20)