LessWrong 2.0 Reader

[link] Why Swiss watches and Taylor Swift are AGI-proof
Kevin Kohler (KevinKohler) · 2024-09-05T13:23:27.033Z · comments (11)

[link] GPT-4o Guardrails Gone: Data Poisoning & Jailbreak-Tuning
ChengCheng (ccstan99) · 2024-11-01T00:10:50.718Z · comments (0)

Automating LLM Auditing with Developmental Interpretability
htlou · 2024-09-04T15:50:04.337Z · comments (0)

[question] Is there any rigorous work on using anthropic uncertainty to prevent situational awareness / deception?
David Scott Krueger (formerly: capybaralet) (capybaralet) · 2024-09-04T12:40:07.678Z · answers+comments (7)

"It's a 10% chance which I did 10 times, so it should be 100%"
egor.timatkov · 2024-11-18T01:14:27.738Z · comments (3)

Training a Sparse Autoencoder in < 30 minutes on 16GB of VRAM using an S3 cache
Louka Ewington-Pitsos (louka-ewington-pitsos) · 2024-08-24T07:39:00.057Z · comments (0)

[link] Four Levels of Voting Methods
hive · 2024-09-26T18:15:00.565Z · comments (3)

[link] Instruction Following without Instruction Tuning
Bogdan Ionut Cirstea (bogdan-ionut-cirstea) · 2024-09-24T13:49:09.078Z · comments (0)

Is Text Watermarking a lost cause?
egor.timatkov · 2024-10-01T16:20:51.113Z · comments (13)

[link] Why good things often don’t lead to better outcomes
DMMF · 2024-09-19T16:37:07.778Z · comments (1)

Slave Morality: A place for every man and every man in his place
Martin Sustrik (sustrik) · 2024-09-19T04:20:04.491Z · comments (7)

Hiring a writer to co-author with me (Spencer Greenberg for ClearerThinking.org)
spencerg · 2024-10-27T17:34:50.479Z · comments (0)

[question] Does the "ancient wisdom" argument have any validity? If a particular teaching or tradition is old, to what extent does this make it more trustworthy?
SpectrumDT · 2024-11-04T15:20:14.822Z · answers+comments (49)

Trying Bluesky
jefftk (jkaufman) · 2024-11-17T02:50:04.093Z · comments (14)

Physical Therapy Sucks (but have you tried hiding it in some peanut butter?)
Declan Molony (declan-molony) · 2024-09-10T05:54:47.000Z · comments (12)

[link] My lukewarm take on GLP-1 agonists
George3d6 · 2024-08-26T12:34:27.929Z · comments (0)

Reducing global AI competition through the Commerce Control List and Immigration reform: a dual-pronged approach
Ben Smith (ben-smith) · 2024-09-03T05:28:24.549Z · comments (2)

Review: Dr Stone
ProgramCrafter (programcrafter) · 2024-09-29T10:35:53.175Z · comments (4)

[question] Is there a CFAR handbook audio option?
FinalFormal2 · 2024-10-26T17:08:36.480Z · answers+comments (0)

Appealing to the Public
jefftk (jkaufman) · 2024-10-23T19:00:07.669Z · comments (0)

Evolutionary prompt optimization for SAE feature visualization
neverix · 2024-11-14T13:06:49.728Z · comments (0)

New Funding Category Open in Foresight's AI Safety Grants
Allison Duettmann (allison-duettmann) · 2024-11-06T22:59:41.065Z · comments (0)

[link] Pronouns are Annoying
ymeskhout · 2024-09-18T13:30:04.620Z · comments (21)

Two arguments against longtermist thought experiments
momom2 (amaury-lorin) · 2024-11-02T10:22:11.311Z · comments (5)

[link] Levers for Biological Progress - A Response to "Machines of Loving Grace"
Niko_McCarty (niko-2) · 2024-11-01T16:35:08.221Z · comments (0)

LifeKeeper Diaries: Exploring Misaligned AI Through Interactive Fiction
Tristan Tran (tristan-tran) · 2024-11-09T20:58:09.182Z · comments (5)

Current Attitudes Toward AI Provide Little Data Relevant to Attitudes Toward AGI
Seth Herd · 2024-11-12T18:23:53.533Z · comments (2)

Announcing the Ultimate Jailbreaking Championship
InnerHufflepuff (grayswan) · 2024-09-04T00:35:31.234Z · comments (1)

[link] Where is the Learn Everything System?
Shoshannah Tekofsky (DarkSym) · 2024-09-27T21:30:16.379Z · comments (8)

2024 NYC Secular Solstice & Megameetup
Joe Rogero · 2024-11-12T17:46:18.674Z · comments (0)

Electric Grid Cyberattack: An AI-Informed Threat Model
moonlightmaze · 2024-11-11T21:34:17.190Z · comments (0)

Join a LessWrong Team for the Unaging System Challenge
Crissman · 2024-10-23T06:01:08.018Z · comments (5)

[link] Benefits of Psyllium Dietary Fiber in Particular
Brendan Long (korin43) · 2024-08-28T18:13:23.891Z · comments (7)

What can we learn from insecure domains?
Logan Zoellner (logan-zoellner) · 2024-11-01T23:53:30.066Z · comments (21)

[link] What if muscle tension is sometimes signal jamming?
Chipmonk · 2024-11-04T21:08:47.800Z · comments (1)

AXRP Episode 38.0 - Zhijing Jin on LLMs, Causality, and Multi-Agent Systems
DanielFilan · 2024-11-14T07:00:06.977Z · comments (0)

[question] Any Trump Supporters Want to Dialogue?
k64 · 2024-09-28T19:41:55.370Z · answers+comments (80)

The deepest atheist: Sam Altman
Trey Edwin (Paolo Vivaldi) · 2024-10-10T03:27:34.465Z · comments (2)

Pomodoro Method Randomized Self Experiment
niplav · 2024-09-29T21:55:04.740Z · comments (2)

Chaos Theory in Ecology
Elizabeth (pktechgirl) · 2024-11-09T17:50:01.727Z · comments (2)

[link] AI & wisdom 2: growth and amortised optimisation
L Rudolf L (LRudL) · 2024-10-28T21:07:39.449Z · comments (0)

Humans are (mostly) metarational
Yair Halberstadt (yair-halberstadt) · 2024-10-09T05:51:16.644Z · comments (6)

[link] AI x Human Flourishing: Introducing the Cosmos Institute
Brendan McCord (brendan-mccord) · 2024-09-05T18:23:32.690Z · comments (5)

[link] AI & wisdom 3: AI effects on amortised optimisation
L Rudolf L (LRudL) · 2024-10-28T21:08:56.604Z · comments (0)

[link] Runner's High On Demand: A Story of Luck & Persistence
Shoshannah Tekofsky (DarkSym) · 2024-09-29T17:15:29.494Z · comments (6)

Against Explosive Growth
c.trout (ctrout) · 2024-09-04T21:45:03.120Z · comments (1)

[link] The Ap Distribution
criticalpoints · 2024-08-24T21:45:35.029Z · comments (3)

[question] Looking to interview AI Safety researchers for a book
jeffreycaruso · 2024-08-24T19:57:33.119Z · answers+comments (0)

[link] Verification methods for international AI agreements
Akash (akash-wasil) · 2024-08-31T14:58:10.986Z · comments (1)

Are LLMs on the Path to AGI?
Davidmanheim · 2024-08-30T03:14:04.710Z · comments (2)


Recent comments

noggin-scratcher on "It's a 10% chance which I did 10 times, so it should be 100%"

Ironically, the even more basic error of probabilistic thinking that people so—painfully—commonly make ("It either happens or doesn't, so it's 50/50") would get closer to the right answer.
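The arithmetic behind this remark can be checked with a short sketch. It assumes the ten attempts are independent and each succeeds with probability 10%; the function name `chance_at_least_once` is made up for illustration and does not come from the post.

```python
def chance_at_least_once(p: float, n: int) -> float:
    """Probability that an event with per-trial probability p
    happens at least once in n independent trials."""
    # Complement rule: 1 minus the chance that all n trials fail.
    return 1.0 - (1.0 - p) ** n


actual = chance_at_least_once(0.10, 10)
print(f"{actual:.3f}")  # 0.651

# Distance of each naive guess from the actual value.
error_of_100_percent = abs(1.0 - actual)
error_of_50_percent = abs(0.5 - actual)
print(error_of_50_percent < error_of_100_percent)  # True
```

Under these assumptions the chance is about 65%, so the "50/50" guess misses by roughly 15 percentage points while the "100%" guess misses by roughly 35.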

jkaufman on Dragon Agnosticism

I think it's a pretty weak hit, though not zero. There are so many things I want to look into that I don't have time for that having this as another factor in my prioritization doesn't feel very limiting to my intellectual freedom.

I do think it is good to have a range of people in society who are taking a range of approaches, though!

benito on Dragon Agnosticism

Then I shall continue to tend to and grow my garden.

jkaufman on Dragon Agnosticism

Nice of you to offer! I expect, however, that pressure in this direction will come from non-LW non-EA directions.

super-agi on Are extreme probabilities for P(doom) epistemically justifed?

Suggested spelling corrections:

I predict that the superforcaters in the report took

I predict that the superforecasters in the report took

a lot of empircal evidence for climate stuff

a lot of empirical evidence for climate stuff

and it may or not may not be the case

and it may or may not be the case

There are no also easy rules that

There are also no easy rules that

meaning that there should see persistence from past events

meaning that we should see persistence from past events

I also feel this kinds of linear extrapolation

I also feel these kinds of linear extrapolation

and really quite a lot of empircal evidence

and really quite a lot of empirical evidence

are many many times more invectious

are many many times more infectious

engineered virus that is spreads like the measles or covid

engineered virus that spreads like the measles or covid

case studies on weather are breakpoints in technological development

case studies on whether there are breakpoints in technological development

break that trend extrapolition wouldn't have predicted

break that trend extrapolation wouldn't have predicted

It's very vulnerable to refernces class and

It's very vulnerable to reference class and

impressed by superforecaster track record than you are.

impressed by superforecaster track records than you are.

annasalamon on Dragon Agnosticism

Does it feel to you as though your epistemic habits / self-trust / intellectual freedom and autonomy / self-honesty takes a hit here?

benito on Dragon Agnosticism

It’s going pretty well for me! Most people I work with or am friends with know that there are multiple topics on which my thoughts are private, and there have been ~no significant social costs to me that I’m aware of.

I would like to be informed of opportunities to support others in this on LessWrong or in the social circles I participate in, to back you up if people are applying pressure on you to express your thoughts on a topic that you don’t want to talk about.

jiro on Heresies in the Shadow of the Sequences

My own heresy is that I don't have a true rejection. Many ideas are things which I believe by accumulation of evidence and there's no single item which would disprove my position. And talking about a "true rejection" is really trying to create a gotcha to force someone to change their position without allowing for things such as accumulation of evidence or even misphrasing the rejection.

I also think rationalists shouldn't bet, but that probably deserves its own post.

mathieuroy on Second-Order Rationality, System Rationality, and a feature suggestion for LessWrong

david-matolcsi on "The Solomonoff Prior is Malign" is a special case of a simpler argument

I think that the standard simulation argument is still pretty strong: If the world was like what it looks to be, then probably we could, and plausibly we would, create lots of simulations. Therefore, we are probably in a simulation.

I agree that all the rest, for example the Oracle assuming that most of the simulations it appears in are created for anthropic capture/influencing reasons, are pretty speculative and I have low confidence in them.