LessWrong 2.0 Reader

Announcing New Beginner-friendly Book on AI Safety and Risk
Darren McKee · 2023-11-25T15:57:08.078Z · comments (2)

What mistakes has the AI safety movement made?
EuanMcLean (euanmclean) · 2024-05-23T11:19:02.717Z · comments (29)

AiPhone
Zvi · 2024-06-12T22:20:02.141Z · comments (4)

AI research assistants competition 2024Q3: Tie between Elicit and You.com
Elizabeth (pktechgirl) · 2024-10-12T15:10:05.417Z · comments (2)

Self-Awareness: Taxonomy and eval suite proposal
Daniel Kokotajlo (daniel-kokotajlo) · 2024-02-17T01:47:01.802Z · comments (2)

All About Concave and Convex Agents
mako yass (MakoYass) · 2024-03-24T21:37:17.922Z · comments (23)

Another argument against utility-centric alignment paradigms
Fiora from Rosebloom · 2024-09-22T07:28:27.856Z · comments (39)

[link] Investigating an insurance-for-AI startup
L Rudolf L (LRudL) · 2024-09-21T15:29:10.083Z · comments (0)

[Intuitive self-models] 4. Trance
Steven Byrnes (steve2152) · 2024-10-08T13:30:41.446Z · comments (6)

Bayesian updating in real life is mostly about understanding your hypotheses
Max H (Maxc) · 2024-01-01T00:10:30.978Z · comments (4)

Generalization, from thermodynamics to statistical physics
Jesse Hoogland (jhoogland) · 2023-11-30T21:28:50.089Z · comments (9)

[link] AI, centralization, and the One Ring
owencb · 2024-09-13T14:00:16.126Z · comments (11)

On Llama-3 and Dwarkesh Patel’s Podcast with Zuckerberg
Zvi · 2024-04-22T13:10:02.645Z · comments (4)

[link] A primer on why computational predictive toxicology is hard
Abhishaike Mahajan (abhishaike-mahajan) · 2024-08-19T17:16:37.735Z · comments (2)

[link] Improving Dictionary Learning with Gated Sparse Autoencoders
Senthooran Rajamanoharan (SenR) · 2024-04-25T18:43:47.003Z · comments (38)

Against most, but not all, AI risk analogies
Matthew Barnett (matthew-barnett) · 2024-01-14T03:36:16.267Z · comments (41)

[question] Is cybercrime really costing trillions per year?
Fabien Roger (Fabien) · 2024-09-27T08:44:07.621Z · answers+comments (28)

[link] Moving on from community living
Vika · 2024-04-17T17:02:11.357Z · comments (7)

Do not delete your misaligned AGI.
mako yass (MakoYass) · 2024-03-24T21:37:07.724Z · comments (13)

RTFB: California’s AB 3211
Zvi · 2024-07-30T13:10:03.853Z · comments (2)

[link] Ice: The Penultimate Frontier
Roko · 2024-07-13T23:44:56.827Z · comments (56)

Never Drop A Ball
Screwtape · 2023-11-23T04:15:35.834Z · comments (1)

AI #55: Keep Clauding Along
Zvi · 2024-03-14T15:40:09.335Z · comments (16)

Automation collapse
Geoffrey Irving · 2024-10-21T14:50:54.500Z · comments (6)

[link] Twitter thread on AI safety evals
Richard_Ngo (ricraz) · 2024-07-31T00:18:14.076Z · comments (3)

Don't sleep on Coordination Takeoffs
trevor (TrevorWiesinger) · 2024-01-27T19:55:26.831Z · comments (24)

Catastrophic Goodhart in RL with KL penalty
Thomas Kwa (thomas-kwa) · 2024-05-15T00:58:20.763Z · comments (10)

[link] Pay-on-results personal growth: first success
Chipmonk · 2024-09-14T03:39:12.975Z · comments (5)

E.T. Jaynes Probability Theory: The logic of Science I
Jan Christian Refsgaard (jan-christian-refsgaard) · 2023-12-27T23:47:52.579Z · comments (20)

Book Review: On the Edge: The Future
Zvi · 2024-09-27T14:00:05.279Z · comments (1)

Black Box Biology
GeneSmith · 2023-11-29T02:27:29.794Z · comments (30)

What is a Tool?
johnswentworth · 2024-06-25T23:40:07.483Z · comments (4)

[link] Superforecasting the Origins of the Covid-19 Pandemic
DanielFilan · 2024-03-12T19:01:15.914Z · comments (0)

[link] Outrage Bonding
Jonathan Moregård (JonathanMoregard) · 2024-08-09T13:46:59.818Z · comments (12)

On coincidences and Bayesian reasoning, as applied to the origins of COVID-19
viking_math · 2024-02-19T01:14:06.772Z · comments (28)

A framework for thinking about AI power-seeking
Joe Carlsmith (joekc) · 2024-07-24T22:41:01.685Z · comments (15)

AI #78: Some Welcome Calm
Zvi · 2024-08-22T14:20:10.812Z · comments (15)

Social status part 2/2: everything else
Steven Byrnes (steve2152) · 2024-03-05T16:29:19.072Z · comments (2)

[question] We might be dropping the ball on Autonomous Replication and Adaptation.
Charbel-Raphaël (charbel-raphael-segerie) · 2024-05-31T13:49:11.327Z · answers+comments (30)

[link] DeepMind: Evaluating Frontier Models for Dangerous Capabilities
Zach Stein-Perlman · 2024-03-21T03:00:31.599Z · comments (8)

Vote on worthwhile OpenAI topics to discuss
Ben Pace (Benito) · 2023-11-21T00:03:03.898Z · comments (55)

[link] on bacteria, on teeth
bhauth · 2024-09-30T15:56:56.830Z · comments (9)

[link] Dario Amodei — Machines of Loving Grace
Matrice Jacobine · 2024-10-11T21:43:31.448Z · comments (26)

Managing risks while trying to do good
Wei Dai (Wei_Dai) · 2024-02-01T18:08:46.506Z · comments (26)

A civilization ran by amateurs
Olli Järviniemi (jarviniemi) · 2024-05-30T17:57:32.601Z · comments (7)

The proper response to mistakes that have harmed others?
Ruby · 2023-12-31T04:06:31.505Z · comments (12)

Offering AI safety support calls for ML professionals
Vael Gates · 2024-02-15T23:48:12.797Z · comments (1)

What is SB 1047 *for*?
Raemon · 2024-09-05T17:39:39.871Z · comments (8)

Balancing Games
jefftk (jkaufman) · 2024-02-24T14:40:04.237Z · comments (18)

Inspired by: Failures in Kindness
X4vier · 2024-07-27T01:21:42.848Z · comments (2)

Recent comments

avturchin on Quantum Immortality: A Perspective if AI Doomers are Probably Right

In MWI some part of the body always remains. But why destructible?

avturchin on Quantum Immortality: A Perspective if AI Doomers are Probably Right

I think the first perspective is meaningful, as it allows me to treat myself as a random sample from some group of minds.

ramblindash on Should CA, TX, OK, and LA merge into a giant swing state, just for elections?

Sorry

philosophicalsoul on Thomas Kwa's Shortform

Answering this from a legal perspective:

What is the easiest and most practical way to translate legalese into scientifically accurate terms, thus bridging the gap between AI experts and lawyers? Stated differently, how do we move from localised papers that only work in law or AI fields respectively, to papers that work in both?

philosophicalsoul on Saul Munn's Shortform

Glad somebody finally made a post about this. I experimented with the distinction in my trio of posts on photographic memory a while back.

wassname on Model Organisms of Misalignment: The Case for a New Pillar of Alignment Research

To the people disagreeing, what part do you disagree with? My main point, or my example? Or something else?

philosophicalsoul on Does LessWrong make a difference when it comes to AI alignment?

I was naive during the period in which I made this particular post. I'm happy with the direction LW is going in, having experienced more of the AI world, and read many more posts. Thank you for your input regardless.

kaj_sotala on Should CA, TX, OK, and LA merge into a giant swing state, just for elections?

As far as I know, the latest representative expert survey on the topic is "Thousands of AI Authors on the Future of AI", in which the median time for a 50% chance of AGI was either 23 or 92 years away, depending on how the question was phrased:

If science continues undisrupted, the chance of unaided machines outperforming humans in every possible task was estimated at 10% by 2027, and 50% by 2047. [...] However, the chance of all human occupations becoming fully automatable was forecast to reach 10% by 2037, and 50% as late as 2116 (compared to 2164 in the 2022 survey).

Not that these numbers would mean much because AI experts aren't experts on forecasting, but it still suggests a substantial possibility for AGI to take quite a while yet.

richard_kennaway on What are the primary drivers that caused selection pressure for intelligence in humans?

Or briefly, intelligence is good for everything.

poignardazur on A basic systems architecture for AI agents that do autonomous research

I don't find that satisfying. Anyone can point out that a perimeter is "reasonably hard" to breach by pointing at a high wall topped with barbed wire, and naive observers will absolutely agree that the wall sure is very high and sure is made of reinforced concrete.

The perimeter is still trivially easy to breach if, say, the front desk is susceptible to social engineering tactics.

Claiming that an architecture is even reasonably secure still requires looking at it with an attacker's mindset. If you just look at the parts of the security you like, you can make a very convincing-sounding case that still misses glaring flaws. I'm not definitely saying that's what this article does, but it sure is giving me this vibe.