LessWrong 2.0 Reader

Two flaws in the Machiavelli Benchmark
TheManxLoiner · 2025-02-12T19:34:35.241Z · comments (0)

[link] Notes on the Presidential Election of 1836
Arjun Panickssery (arjun-panickssery) · 2025-02-13T23:40:23.224Z · comments (0)

[link] The Peeperi (unfinished) - By Katja Grace
Nathan Young · 2025-02-17T19:33:29.894Z · comments (0)

MATS Spring 2024 Extension Retrospective
HenningB (HenningBlue) · 2025-02-12T22:43:58.193Z · comments (0)

System 2 Alignment
Seth Herd · 2025-02-13T19:17:56.868Z · comments (0)

[question] What are the surviving worlds like?
KvmanThinking (avery-liu) · 2025-02-17T00:41:49.810Z · answers+comments (1)

Come join Dovetail's agent foundations fellowship talks & discussion
Alex_Altair · 2025-02-15T22:10:02.166Z · comments (0)

Moral Hazard in Democratic Voting
lsusr · 2025-02-12T23:17:39.355Z · comments (8)

Longtermist implications of aliens Space-Faring Civilizations - Introduction
Maxime Riché (maxime-riche) · 2025-02-21T12:08:42.403Z · comments (0)

Undergrad AI Safety Conference
JoNeedsSleep (joanna-j-1) · 2025-02-19T03:43:47.969Z · comments (0)

[link] When should we worry about AI power-seeking?
Joe Carlsmith (joekc) · 2025-02-19T19:44:25.062Z · comments (0)

6 (Potential) Misconceptions about AI Intellectuals
ozziegooen · 2025-02-14T23:51:44.983Z · comments (11)

Studies of Human Error Rate
tin482 · 2025-02-13T13:43:30.717Z · comments (3)

[link] Ascetic hedonism
dkl9 · 2025-02-17T15:56:30.267Z · comments (9)

Literature Review of Text AutoEncoders
NickyP (Nicky) · 2025-02-19T21:54:14.905Z · comments (1)

[link] Systematic Sandbagging Evaluations on Claude 3.5 Sonnet
farrelmahaztra · 2025-02-14T01:22:46.695Z · comments (0)

The Takeoff Speeds Model Predicts We May Be Entering Crunch Time
johncrox · 2025-02-21T02:26:31.768Z · comments (0)

MAISU - Minimal AI Safety Unconference
Linda Linsefors · 2025-02-21T11:36:25.202Z · comments (0)

I'm making a ttrpg about life in an intentional community during the last year before the Singularity
bgaesop · 2025-02-13T21:54:09.002Z · comments (2)

Hopeful hypothesis, the Persona Jukebox.
Donald Hobson (donald-hobson) · 2025-02-14T19:24:35.514Z · comments (4)

Using Prompt Evaluation to Combat Bio-Weapon Research
Stuart_Armstrong · 2025-02-19T12:39:00.491Z · comments (0)

[link] US AI Safety Institute will be 'gutted,' Axios reports
Matrice Jacobine · 2025-02-20T14:40:13.049Z · comments (0)

[link] The current AI strategic landscape: one bear's perspective
Matrice Jacobine · 2025-02-15T09:49:13.120Z · comments (0)

[link] Inside the dark forests of the internet
Itay Dreyfus (itay-dreyfus) · 2025-02-12T10:20:59.426Z · comments (0)

Dovetail's agent foundations fellowship talks & discussion
Alex_Altair · 2025-02-13T00:49:48.854Z · comments (0)

[link] DeepSeek Made it Even Harder for US AI Companies to Ever Reach Profitability
garrison · 2025-02-19T21:02:42.879Z · comments (1)

[link] Metaculus Q4 AI Benchmarking: Bots Are Closing The Gap
Molly (hickman-santini) · 2025-02-19T22:42:39.055Z · comments (0)

Human-AI Relationality is Already Here
bridgebot (puppy) · 2025-02-20T07:08:22.420Z · comments (0)

[link] Published report: Pathways to short TAI timelines
Zershaaneh Qureshi (zershaaneh-qureshi) · 2025-02-20T22:10:12.276Z · comments (0)

SWE Automation Is Coming: Consider Selling Your Crypto
A_donor · 2025-02-13T20:17:59.227Z · comments (8)

[link] Introduction to Expected Value Fanaticism
Petra Kosonen · 2025-02-14T19:05:26.556Z · comments (8)

Call for Applications: XLab Summer Research Fellowship
JoNeedsSleep (joanna-j-1) · 2025-02-18T19:19:20.155Z · comments (0)

Talking to laymen about AI development
David Steel · 2025-02-17T18:42:23.289Z · comments (0)

What makes a theory of intelligence useful?
Cole Wyeth (Amyr) · 2025-02-20T19:22:29.725Z · comments (0)

[link] Won't vs. Can't: Sandbagging-like Behavior from Claude Models
Joe Benton · 2025-02-19T20:47:06.792Z · comments (0)

[link] Progress links and short notes, 2025-02-17
jasoncrawford · 2025-02-17T19:18:29.422Z · comments (0)

[link] Are SAE features from the Base Model still meaningful to LLaVA?
Shan23Chen (shan-chen) · 2025-02-18T22:16:14.449Z · comments (2)

THE ARCHIVE
Jason Reid (jason-reid) · 2025-02-17T01:12:41.486Z · comments (0)

Comparing the effectiveness of top-down and bottom-up activation steering for bypassing refusal on harmful prompts
Ana Kapros (ana-kapros) · 2025-02-12T19:12:07.592Z · comments (0)

[link] Cooperation for AI safety must transcend geopolitical interference
Matrice Jacobine · 2025-02-16T18:18:01.539Z · comments (6)

[link] The Dilemma’s Dilemma
James Stephen Brown (james-brown) · 2025-02-19T23:50:47.485Z · comments (8)

What new x- or s-risk fieldbuilding organisations would you like to see? An EOI form. (FBB #3)
gergogaspar (gergo-gaspar) · 2025-02-17T12:39:09.196Z · comments (0)

AIS Berlin, events, opportunities and the flipped gameboard - Fieldbuilders Newsletter, February 2025
gergogaspar (gergo-gaspar) · 2025-02-17T14:16:31.834Z · comments (0)

Bimodal AI Beliefs
Adam Train (aetrain) · 2025-02-14T06:45:53.933Z · comments (1)

Intelligence Is Jagged
Adam Train (aetrain) · 2025-02-19T07:08:46.444Z · comments (0)

There are a lot of upcoming retreats/conferences between March and July (2025)
gergogaspar (gergo-gaspar) · 2025-02-18T09:30:30.258Z · comments (0)

[link] Sparse Autoencoder Features for Classifications and Transferability
Shan23Chen (shan-chen) · 2025-02-18T22:14:12.994Z · comments (0)

Are current LLMs safe for psychotherapy?
PaperBike · 2025-02-12T19:16:34.452Z · comments (4)

[link] Teaching AI to reason: this year's most important story
Benjamin_Todd · 2025-02-13T17:40:02.869Z · comments (0)

Make Superintelligence Loving
Davey Morse (davey-morse) · 2025-02-21T06:07:17.235Z · comments (0)

Recent comments

romeostevensit on The case for the death penalty

"0.12% of the population (the most persistent offenders) accounted for 20% of violent crime convictions" https://inquisitivebird.xyz/p/when-few-do-great-harm

dynomight on The first RCT for GLP-1 drugs and alcoholism isn't what we hoped

Thanks for the response! I must protest that I think I'm being misinterpreted a bit. Compare my quote:

"the point of RCTs is to avoid resorting to regression coefficients on non-randomized sample"

To the:

"The point of RCTs is not to avoid resorting to regression coefficients."

The "non-randomized sample" part of that quote is important! If semaglutide had no impact on the decision to participate, then we can argue about the theory of regressions. Yes, the fraction that participated happened to be close, but with small numbers that could easily happen by chance. The hypothesis of this research is that semaglutide would reduce the urge to drink! If the decision to participate was random, and I believed the conclusion of the experiment, then that conclusion would seem to imply that the decision to participate wasn't random after all. It just seems incredibly strange to assume that there's no impact of semaglutide on the probability of agreeing to the experiment, and very unlikely the other variables in the regression fix this, which is why I'm dubious that the regression coefficients reflect any causal relationship.

That said, I think the participation bias could go in either direction. I said (and maintain) that the lab experiment does provide some evidence in favor of semaglutide's effectiveness. I just think that given the non-random selection, small sample, and general weirdness of having people drink in a room in a hospital as a measurement, it's quite weak evidence. Given the dismal results from the drinking records (which have less of all of these issues) I think that makes the overall takeaway from this paper pretty negative.

romeostevensit on The case for the death penalty

There are the predictable lobbies for increasing the price taxpayers pay for prisoners, but not much advocacy for decreasing it.

james-camacho on The case for the death penalty

Similar disclaimer: don't assume these are my opinions. I'm merely advocating for a devil.

If we're going for efficiency, I feel like we can get most of the safety gains with tamer measures. For example, you could cut off a petty thief's hand, or castrate a rapist. The actual procedure would be about as expensive as execution, but if a mistake was made there is still a living person to pay reparations to. I think you could also make the argument that this is less cruel than imprisoning someone for years—after all, people have a "right to life, liberty, and the pursuit of happiness", not a right to all their limbs and genitals.

Another thing we can do is punish not only the criminal, but their friends and family too. We can model people as having the policy to take certain actions in a given environment. The ultimate goal of the justice system is to decrease the weight of certain defective policies in the general populace, either through threat, force, or elimination. When we get good enough mindreaders, we can just directly compare each person's policy to the defective ones, and change the environment to mitigate defection. Until then, we have to make do with approximations, and one's culture, especially the shared culture among friends and family, is a very good measure for how similar two people's policies will be. So, if we find someone defecting, it makes sense to punish not only them, but their friends and family for a couple generations too.

richardjacton on How to Make Superbabies

(For people reading this thread who want an intro to finemapping, this lecture is a great place to start for a high-level overview: https://www.youtube.com/watch?v=pglYf7wocSI)

alexander-turok on The case for the death penalty

A fenced-off city that will inevitably be compared to a Holocaust ghetto.

shankar-sivarajan on The case for the death penalty

My paraphrase of Gandalf: "Many that die deserve life. Can you give it to them? Then do the next best thing, and deal out death in judgement to the many that live who deserve it."

alexander-turok on The case for the death penalty

But why's that a bad thing?

cousin_it on The case for the death penalty

Because otherwise everyone will gleefully discriminate against them in every way they possibly can.

chris-monteiro on Murder plots are infohazards

I am sure there are some interesting uses of agentic AIs I can configure for automated OSINT, but this feels like quite a large task, given that I am bottlenecked more on who to hand the data to than on the data being insufficiently rich.

Know any preconfigured agency menageries for something like this?