LessWrong 2.0 Reader

Occupational Licensing Roundup #1
Zvi · 2024-10-30T11:00:04.516Z · comments (11)

Do Not Mess With Scarlett Johansson
Zvi · 2024-05-22T15:10:03.215Z · comments (7)

[link] Static Analysis As A Lifestyle
adamShimi · 2024-07-03T18:29:37.384Z · comments (11)

SAEs (usually) Transfer Between Base and Chat Models
Connor Kissane (ckkissane) · 2024-07-18T10:29:46.138Z · comments (0)

2. Corrigibility Intuition
Max Harms (max-harms) · 2024-06-08T15:52:29.971Z · comments (10)

METR is hiring!
Beth Barnes (beth-barnes) · 2023-12-26T21:00:50.625Z · comments (1)

AI Safety is Dropping the Ball on Clown Attacks
trevor (TrevorWiesinger) · 2023-10-22T20:09:31.810Z · comments (76)

Schelling game evaluations for AI control
Olli Järviniemi (jarviniemi) · 2024-10-08T12:01:24.389Z · comments (5)

[link] How LDT helps reduce the AI arms race
Tamsin Leake (carado-1) · 2023-12-10T16:21:44.409Z · comments (13)

Advice to junior AI governance researchers
Akash (akash-wasil) · 2024-07-08T19:19:07.316Z · comments (1)

List of how people have become more hard-working
Chi Nguyen · 2023-09-29T11:30:38.802Z · comments (7)

[link] AI Safety Hub Serbia Soft Launch
DusanDNesic · 2023-10-20T07:11:48.389Z · comments (1)

[link] On Shifgrethor
JustisMills · 2024-10-27T15:30:13.688Z · comments (17)

[link] The Perceptron Controversy
Yuxi_Liu · 2024-01-10T23:07:23.341Z · comments (18)

Fear of centralized power vs. fear of misaligned AGI: Vitalik Buterin on 80,000 Hours
Seth Herd · 2024-08-05T15:38:09.682Z · comments (22)

On the Debate Between Jezos and Leahy
Zvi · 2024-02-06T14:40:05.487Z · comments (6)

How to Control an LLM's Behavior (why my P(DOOM) went down)
RogerDearnaley (roger-d-1) · 2023-11-28T19:56:49.679Z · comments (30)

Announcing New Beginner-friendly Book on AI Safety and Risk
Darren McKee · 2023-11-25T15:57:08.078Z · comments (2)

A gentle introduction to mechanistic anomaly detection
Erik Jenner (ejenner) · 2024-04-03T23:06:16.778Z · comments (2)

[link] DeepMind: Frontier Safety Framework
Zach Stein-Perlman · 2024-05-17T17:30:02.504Z · comments (0)

Book Review: On the Edge: The Fundamentals
Zvi · 2024-09-23T13:40:11.058Z · comments (3)

[link] The Gods of Straight Lines
Richard_Ngo (ricraz) · 2023-10-14T04:10:50.020Z · comments (13)

Personal AI Planning
jefftk (jkaufman) · 2024-11-10T14:00:06.837Z · comments (10)

On the Gladstone Report
Zvi · 2024-03-20T19:50:05.186Z · comments (11)

A to Z of things
KatjaGrace · 2023-11-17T05:20:03.134Z · comments (6)

Complex systems research as a field (and its relevance to AI Alignment)
Nora_Ammann · 2023-12-01T22:10:25.801Z · comments (11)

[Interim research report] Activation plateaus & sensitive directions in GPT2
StefanHex (Stefan42) · 2024-07-05T17:05:25.631Z · comments (2)

Superposition is not "just" neuron polysemanticity
LawrenceC (LawChan) · 2024-04-26T23:22:06.066Z · comments (4)

[link] A free to enter, 240 character, open-source iterated prisoner's dilemma tournament
Isaac King (KingSupernova) · 2023-11-09T08:24:43.277Z · comments (19)

AI research assistants competition 2024Q3: Tie between Elicit and You.com
Elizabeth (pktechgirl) · 2024-10-12T15:10:05.417Z · comments (2)

Against most, but not all, AI risk analogies
Matthew Barnett (matthew-barnett) · 2024-01-14T03:36:16.267Z · comments (41)

Bayesian updating in real life is mostly about understanding your hypotheses
Max H (Maxc) · 2024-01-01T00:10:30.978Z · comments (4)

AiPhone
Zvi · 2024-06-12T22:20:02.141Z · comments (4)

What mistakes has the AI safety movement made?
EuanMcLean (euanmclean) · 2024-05-23T11:19:02.717Z · comments (29)

[link] AI, centralization, and the One Ring
owencb · 2024-09-13T14:00:16.126Z · comments (11)

Self-Awareness: Taxonomy and eval suite proposal
Daniel Kokotajlo (daniel-kokotajlo) · 2024-02-17T01:47:01.802Z · comments (2)

Generalization, from thermodynamics to statistical physics
Jesse Hoogland (jhoogland) · 2023-11-30T21:28:50.089Z · comments (9)

[link] A primer on why computational predictive toxicology is hard
Abhishaike Mahajan (abhishaike-mahajan) · 2024-08-19T17:16:37.735Z · comments (2)

All About Concave and Convex Agents
mako yass (MakoYass) · 2024-03-24T21:37:17.922Z · comments (23)

[link] Improving Dictionary Learning with Gated Sparse Autoencoders
Senthooran Rajamanoharan (SenR) · 2024-04-25T18:43:47.003Z · comments (38)

On Llama-3 and Dwarkesh Patel’s Podcast with Zuckerberg
Zvi · 2024-04-22T13:10:02.645Z · comments (4)

[Intuitive self-models] 4. Trance
Steven Byrnes (steve2152) · 2024-10-08T13:30:41.446Z · comments (7)

Another argument against maximizer-centric alignment paradigms
Fiora from Rosebloom · 2024-09-22T07:28:27.856Z · comments (39)

[link] Moving on from community living
Vika · 2024-04-17T17:02:11.357Z · comments (7)

[question] Is cybercrime really costing trillions per year?
Fabien Roger (Fabien) · 2024-09-27T08:44:07.621Z · answers+comments (28)

A framework for thinking about AI power-seeking
Joe Carlsmith (joekc) · 2024-07-24T22:41:01.685Z · comments (15)

[link] Pay-on-results personal growth: first success
Chipmonk · 2024-09-14T03:39:12.975Z · comments (5)

Thoughts on open source AI
Sam Marks (samuel-marks) · 2023-11-03T15:35:42.067Z · comments (17)

Taxonomy of AI-risk counterarguments
Odd anon · 2023-10-16T00:12:51.021Z · comments (13)

Black Box Biology
GeneSmith · 2023-11-29T02:27:29.794Z · comments (30)

Recent comments

lukehmiles on Project Adequate: Seeking Cofounders/Funders

Wasted opportunity to guarantee this post keeps getting holywar comments for the next hundred years.

lukehmiles on Project Adequate: Seeking Cofounders/Funders

This is pretty inspiring to me. Thank you for sharing.

elityre on Reformative Hypocrisy, and Paying Close Enough Attention to Selectively Reward It.

I suspect it would still involve billions of $ of funding, partnerships like the one with Microsoft, and other for-profit pressures to be the sort of player it is today. So I don't know that Musk's plan was viable at all.

Note that all of this happened before the scaling hypothesis was really formulated, much less made obvious.

We now know, with the benefit of hindsight, that developing AI and its precursors is extremely compute intensive, which means capital intensive. There was some reason to guess this might be true at the time, but it wasn't a foregone conclusion: it was still an open question whether the key to AGI would be mostly some technical innovation that hadn't been developed yet.

elityre on Lao Mein's Shortform

Those people don't get substantial equity in most businesses in the world. They generally get paid a salary and benefits in exchange for their work, and that's about it.

zy on Shortform

Haven't looked too closely at this, but since there are some upvotes, wanted to comment with my initial two thoughts:

1. Child consent is tricky.
2. Likely many are foreign children, which may or may not be in the 75 million statistic.

It is good to think critically, but I think it would be beneficial to present more evidence before making the claim or conclusion.

lukehmiles on Shortform

The other day I was trying to think of information leaks that a competent conspiracy couldn't prevent, regarding this. I just thought of one small one: people will sometimes randomly die or have their homes raided. If the slavery is common, then sometimes the slaves will be discovered during these events. Even if the escapees wanted to silence the story out of shame, cops would probably gossip to the press.

So you can probably tally such events, crunch the numbers, and get a decent conspiracy-resistant estimate.

lukehmiles on Alexander Gietelink Oldenziel's Shortform

As a layman, I have not seen much unrealistic hype. I think the hype-level is just about right.

lukehmiles on Alexander Gietelink Oldenziel's Shortform

You should not bury such a good post in a shortform

lukehmiles on Which evals resources would be good?

Maybe it should be a game that everyone can play

lukehmiles on lukehmiles's Shortform

You didn't ask me to pitch you but I will say a short pitch here for any bystanders. I know how to find a handful of good people and I know how to let a good chef cook without isolating them either. And I can make a pretty good fried egg if we're starving.

One funny weak bit of evidence for my lack of politicking is that I have like six co-authored papers but no first-authored papers.