LessWrong 2.0 Reader

At 87, Pearl is still able to change his mind
rotatingpaguro · 2023-10-18T04:46:29.339Z · comments (15)

LoRA Fine-tuning Efficiently Undoes Safety Training from Llama 2-Chat 70B
Simon Lermen (dalasnoin) · 2023-10-12T19:58:02.119Z · comments (29)

[link] Vernor Vinge, who coined the term "Technological Singularity", dies at 79
Kaj_Sotala · 2024-03-21T22:14:14.699Z · comments (24)

On Devin
Zvi · 2024-03-18T13:20:04.779Z · comments (34)

Priors and Prejudice
MathiasKB (MathiasKirkBonde) · 2024-04-22T15:00:41.782Z · comments (31)

Discussion: Challenges with Unsupervised LLM Knowledge Discovery
Seb Farquhar · 2023-12-18T11:58:39.379Z · comments (21)

Liability regimes for AI
Ege Erdil (ege-erdil) · 2024-08-19T01:25:01.006Z · comments (34)

[link] Explore More: A Bag of Tricks to Keep Your Life on the Rails
Shoshannah Tekofsky (DarkSym) · 2024-09-28T21:38:52.256Z · comments (7)

Some (problematic) aesthetics of what constitutes good work in academia
Steven Byrnes (steve2152) · 2024-03-11T17:47:28.835Z · comments (12)

Does davidad's uploading moonshot work?
jacobjacob · 2023-11-03T02:21:51.720Z · comments (35)

The Plan - 2023 Version
johnswentworth · 2023-12-29T23:34:19.651Z · comments (39)

[link] Overcoming Bias Anthology
Arjun Panickssery (arjun-panickssery) · 2024-10-20T02:01:23.463Z · comments (13)

Leading The Parade
johnswentworth · 2024-01-31T22:39:56.499Z · comments (31)

[link] If you weren't such an idiot...
kave · 2024-03-02T00:01:37.314Z · comments (74)

Deep atheism and AI risk
Joe Carlsmith (joekc) · 2024-01-04T18:58:47.745Z · comments (22)

[link] Moral Reality Check (a short story)
jessicata (jessica.liu.taylor) · 2023-11-26T05:03:18.254Z · comments (44)

OpenAI o1
Zach Stein-Perlman · 2024-09-12T17:30:31.958Z · comments (41)

The Information: OpenAI shows 'Strawberry' to feds, races to launch it
Martín Soto (martinsq) · 2024-08-27T23:10:18.155Z · comments (15)

[link] Nursing doubts
dynomight · 2024-08-30T02:25:36.826Z · comments (20)

[link] That Alien Message - The Animation
Writer · 2024-09-07T14:53:30.604Z · comments (9)

LLMs for Alignment Research: a safety priority?
abramdemski · 2024-04-04T20:03:22.484Z · comments (24)

The Median Researcher Problem
johnswentworth · 2024-11-02T20:16:11.341Z · comments (32)

My motivation and theory of change for working in AI healthtech
Andrew_Critch · 2024-10-12T00:36:30.925Z · comments (35)

The Summoned Heroine's Prediction Markets Keep Providing Financial Services To The Demon King!
abstractapplic · 2024-10-26T12:34:51.059Z · comments (16)

[link] Stanislav Petrov Quarterly Performance Review
Ricki Heicklen (bayesshammai) · 2024-09-26T21:20:11.646Z · comments (3)

Value Claims (In Particular) Are Usually Bullshit
johnswentworth · 2024-05-30T06:26:21.151Z · comments (18)

[link] Arithmetic is an underrated world-modeling technology
dynomight · 2024-10-17T14:00:22.475Z · comments (32)

[link] The Checklist: What Succeeding at AI Safety Will Involve
Sam Bowman (sbowman) · 2024-09-03T18:18:34.230Z · comments (49)

AI Views Snapshots
Rob Bensinger (RobbBB) · 2023-12-13T00:45:50.016Z · comments (61)

Loudly Give Up, Don't Quietly Fade
Screwtape · 2023-11-13T23:30:25.308Z · comments (11)

Survey: How Do Elite Chinese Students Feel About the Risks of AI?
Nick Corvino (nick-corvino) · 2024-09-02T18:11:11.867Z · comments (13)

Momentum of Light in Glass
Ben (ben-lang) · 2024-10-09T20:19:42.088Z · comments (44)

[link] Decomposing Agency — capabilities without desires
owencb · 2024-07-11T09:38:48.509Z · comments (32)

My experience using financial commitments to overcome akrasia
William Howard (william-howard) · 2024-04-15T22:57:32.574Z · comments (31)

Graphical tensor notation for interpretability
Jordan Taylor (Nadroj) · 2023-10-04T08:04:33.341Z · comments (11)

[link] Fields that I reference when thinking about AI takeover prevention
Buck · 2024-08-13T23:08:54.950Z · comments (15)

0. CAST: Corrigibility as Singular Target
Max Harms (max-harms) · 2024-06-07T22:29:12.934Z · comments (12)

Comparing Anthropic's Dictionary Learning to Ours
Robert_AIZI · 2023-10-07T23:30:32.402Z · comments (8)

What good is G-factor if you're dumped in the woods? A field report from a camp counselor.
Hastings (hastings-greer) · 2024-01-12T13:17:23.829Z · comments (22)

Read the Roon
Zvi · 2024-03-05T13:50:04.967Z · comments (6)

EA orgs' legal structure inhibits risk taking and information sharing on the margin
Elizabeth (pktechgirl) · 2023-11-05T19:13:56.135Z · comments (17)

[Completed] The 2024 Petrov Day Scenario
Ben Pace (Benito) · 2024-09-26T08:08:32.495Z · comments (114)

The Worst Form Of Government (Except For Everything Else We've Tried)
johnswentworth · 2024-03-17T18:11:38.374Z · comments (46)

The Dark Arts
lsusr · 2023-12-19T04:41:13.356Z · comments (49)

When is a mind me?
Rob Bensinger (RobbBB) · 2024-04-17T05:56:38.482Z · comments (125)

How to (hopefully ethically) make money off of AGI
habryka (habryka4) · 2023-11-06T23:35:16.476Z · comments (81)

An Extremely Opinionated Annotated List of My Favourite Mechanistic Interpretability Papers v2
Neel Nanda (neel-nanda-1) · 2024-07-07T17:39:35.064Z · comments (15)

The 99% principle for personal problems
Kaj_Sotala · 2023-10-02T08:20:07.379Z · comments (20)

Integrity in AI Governance and Advocacy
habryka (habryka4) · 2023-11-03T19:52:33.180Z · comments (57)

How it All Went Down: The Puzzle Hunt that took us way, way Less Online
A* (agendra) · 2024-06-02T08:01:40.109Z · comments (5)

Recent comments

brambleboy on Matt Goldenberg's Short Form Feed

While the broader message might be good, the study the video is about didn't replicate.

arepo on What TMS is like

Did the ringing go away over time, or was it permanent?

towards_keeperhood on Could orcas be (trained to be) smarter than humans? 

In the Wikipedia list, the estimated number of neurons in the neocortex of a blue whale is 5 billion (compared to 43 billion in orcas), even though blue whales are much larger. (Unfortunately the blue whale figure is only an estimate and is not grounded in optical or isotropic fractionation measurements.)

(EDIT: Hm, interesting, the linked Reddit post mentions 15 billion for blue whales. Not sure which is correct.)

gunnar_zarncke on Could orcas be (trained to be) smarter than humans? 

As I commented on "Are big brains for processing sensory input?", I predict that the brain regions of a whale or orca responsible for spatiotemporal learning and memory are a big part of their encephalization.

tslarm on Update on the Mysterious Trump Buyers on Polymarket

Can't this only be judged in retrospect, and over a decent sample size? If all the markets did was reflect the public expert consensus, they wouldn't be very useful; the possibility that they're doing significantly better is still open.

(I'm assuming that by "every other prediction source" you mean everything other than prediction/betting markets, because it sounds like Polymarket is no longer out of line with the other markets. Betfair is the one I keep an eye on, and that's at 60/40 too.)

aysja on johnswentworth's Shortform

I think I probably agree, although I feel somewhat wary about it. My main hesitations are:

The lack of epistemic modifiers seems off to me, relative to the strength of the arguments they’re making. Such that while I agree with many claims, my imagined reader who is coming into this with zero context is like “why should I believe this?” E.g., “Without intervention, humanity will be summarily outcompeted and relegated to irrelevancy,” which like, yes, but also—on what grounds should I necessarily conclude this? They gave some argument along the lines of “intelligence is powerful,” and that seems probably true, but imo not enough to justify the claim that it will certainly lead to our irrelevancy. All of this would be fixed (according to me) if it were framed more as like “here are some reasons you might be pretty worried,” of which there are plenty, or "here's what I think," rather than “here is what will definitely happen if we continue on this path,” which feels less certain/obvious to me.
Along the same lines, I think it’s pretty hard to tell whether this piece is in good faith or not. E.g., in the intro Connor writes “The default path we are on now is one of ruthless, sociopathic corporations racing toward building the most intelligent, powerful AIs as fast as possible to compete with one another and vie for monopolization and control of both the market and geopolitics.” Which, again, I don’t necessarily disagree with, but my imagined reader with zero context is like “what, really? sociopaths? control over geopolitics?” I.e., I’m expecting readers to question the integrity of the piece, and to be more unsure of how to update on it (e.g. "how do I know this whole thing isn't just a strawman?" etc.).
There are many places where they kind of just state things without justifying them much. I think in the best case this might cause readers to think through whether such claims make sense (either on their own, or by reading the hyperlinked stuff—both of which put quite a lot of cognitive load on them), and in the worst case just causes readers to either bounce or kind of blindly swallow what they’re saying. E.g., “Black-Box Evaluations can only catch all relevant safety issues insofar as we have either an exhaustive list of all possible failure modes, or a mechanistic model of how concrete capabilities lead to safety risks.” They say this without argument and then move on. And although I agree with them (having spent a lot of time thinking this through myself), it’s really not obvious at first blush. Why do you need an exhaustive list? One might imagine, for instance, that a small number of tests would generalize well. And do you need mechanistic models? Sometimes medicines work safely without that, etc., etc. I haven’t read the entire Compendium closely, but my sense is that this is not an isolated incident. And I don't think this is a fatal flaw or anything—they're moving through a ton of material really fast and it's hard to give a thorough account of all claims—but it does make me more hesitant to use it as the default "here's what's happening" document.

All of that said, I do broadly agree with the set of arguments, and I think it’s a really cool activity for people to write up what they believe. I’m glad they did it. But I’m not sure how comfortable I feel about sending it to people who haven’t thought much about AI.

d0themath on Update on the Mysterious Trump Buyers on Polymarket

The promise of prediction markets was that they are either useful or allow you to take money from rich idiots. I’d say that was fulfilled.

Also, useful is very different from perfect. They are still very adequate for a large variety of questions.

kvmanthinking on Chapter 27: Empathy

Harry's brain tried to calculate the ramifications and implications of this and ran out of swap space.

This is very relatable.

ryankidd44 on Ryan Kidd's Shortform

I'm not sure!

yair-halberstadt on Could orcas be (trained to be) smarter than humans? 

Douglas Adams answered this long ago of course:

For instance, on the planet Earth, man had always assumed that he was more intelligent than dolphins because he had achieved so much—the wheel, New York, wars and so on—whilst all the dolphins had ever done was muck about in the water having a good time. But conversely, the dolphins had always believed that they were far more intelligent than man—for precisely the same reasons.