LessWrong 2.0 Reader

View: New · Old · Top

Restrict date range: Today · This week · This month · Last three months · This year · All time

← previous page (newer posts) · next page (older posts) →

[question] Which things were you surprised to learn are not metaphors?
Eric Neyman (UnexpectedValues) · 2024-11-21T18:56:18.025Z · answers+comments (78)

[link] OpenAI's CBRN tests seem unclear
LucaRighetti (Error404Dinosaur) · 2024-11-21T17:28:30.290Z · comments (6)

Subskills of "Listening to Wisdom"
Raemon · 2024-12-09T03:01:18.706Z · comments (14)

BIG-Bench Canary Contamination in GPT-4
Jozdien · 2024-10-22T15:40:48.166Z · comments (13)

"The Solomonoff Prior is Malign" is a special case of a simpler argument
David Matolcsi (matolcsid) · 2024-11-17T21:32:34.711Z · comments (44)

[link] Miles Brundage resigned from OpenAI, and his AGI readiness team was disbanded
garrison · 2024-10-23T23:40:57.180Z · comments (1)

Why Don't We Just... Shoggoth+Face+Paraphraser?
Daniel Kokotajlo (daniel-kokotajlo) · 2024-11-19T20:53:52.084Z · comments (49)

Passages I Highlighted in The Letters of J.R.R.Tolkien
Ivan Vendrov (ivan-vendrov) · 2024-11-25T01:47:59.071Z · comments (10)

A bird's eye view of ARC's research
Jacob_Hilton · 2024-10-23T15:50:06.123Z · comments (12)

Should CA, TX, OK, and LA merge into a giant swing state, just for elections?
Thomas Kwa (thomas-kwa) · 2024-11-06T23:01:48.992Z · comments (35)

Why I funded PIBBSS
Ryan Kidd (ryankidd44) · 2024-09-15T19:56:33.018Z · comments (21)

You should consider applying to PhDs (soon!)
bilalchughtai (beelal) · 2024-11-29T20:33:12.462Z · comments (19)

DeepSeek beats o1-preview on math, ties on coding; will release weights
Zach Stein-Perlman · 2024-11-20T23:50:26.597Z · comments (23)

Scissors Statements for President?
AnnaSalamon · 2024-11-06T10:38:21.230Z · comments (31)

Sorry for the downtime, looks like we got DDosd
habryka (habryka4) · 2024-12-02T04:14:30.209Z · comments (13)

The Big Nonprofits Post
Zvi · 2024-11-29T16:10:06.938Z · comments (10)

[link] Announcing turntrout.com, my new digital home
TurnTrout · 2024-11-17T17:42:08.164Z · comments (24)

The Dream Machine
sarahconstantin · 2024-12-05T00:00:05.796Z · comments (6)

I turned decision theory problems into memes about trolleys
Tapatakt · 2024-10-30T20:13:29.589Z · comments (20)

LLMs can learn about themselves by introspection
Felix J Binder (fjb) · 2024-10-18T16:12:51.231Z · comments (38)

Behavioral red-teaming is unlikely to produce clear, strong evidence that models aren't scheming
Buck · 2024-10-10T13:36:53.810Z · comments (4)

[link] Advice for journalists
Nathan Young · 2024-10-07T16:46:40.929Z · comments (53)

Why comparative advantage does not help horses
Sherrinford · 2024-09-30T22:27:57.450Z · comments (10)

Hierarchical Agency: A Missing Piece in AI Alignment
Jan_Kulveit · 2024-11-27T05:49:04.241Z · comments (19)

MIRI’s 2024 End-of-Year Update
Rob Bensinger (RobbBB) · 2024-12-03T04:33:47.499Z · comments (1)

[link] Seven lessons I didn't learn from election day
Eric Neyman (UnexpectedValues) · 2024-11-14T18:39:07.053Z · comments (33)

The case for unlearning that removes information from LLM weights
Fabien Roger (Fabien) · 2024-10-14T14:08:04.775Z · comments (14)

[link] Anthropic: Three Sketches of ASL-4 Safety Case Components
Zach Stein-Perlman · 2024-11-06T16:00:06.940Z · comments (33)

[question] What are the best arguments for/against AIs being "slightly 'nice'"?
Raemon · 2024-09-24T02:00:19.605Z · answers+comments (58)

You can, in fact, bamboozle an unaligned AI into sparing your life
David Matolcsi (matolcsid) · 2024-09-29T16:59:43.942Z · comments (171)

[link] Finishing The SB-1047 Documentary In 6 Weeks
Michaël Trazzi (mtrazzi) · 2024-10-28T20:17:47.465Z · comments (5)

2024 Petrov Day Retrospective
Ben Pace (Benito) · 2024-09-28T21:30:14.952Z · comments (25)

[link] Sabotage Evaluations for Frontier Models
David Duvenaud (david-duvenaud) · 2024-10-18T22:33:14.320Z · comments (55)

Science advances one funeral at a time
Cameron Berg (cameron-berg) · 2024-11-01T23:06:19.381Z · comments (9)

Catastrophic sabotage as a major threat model for human-level AI systems
evhub · 2024-10-22T20:57:11.395Z · comments (11)

2024 Unofficial LessWrong Census/Survey
Screwtape · 2024-12-02T05:30:53.019Z · comments (42)

Bigger Livers?
sarahconstantin · 2024-11-08T21:50:09.814Z · comments (12)

Zvi’s Thoughts on His 2nd Round of SFF
Zvi · 2024-11-20T13:40:08.092Z · comments (2)

LLMs Look Increasingly Like General Reasoners
eggsyntax · 2024-11-08T23:47:28.886Z · comments (45)

Anvil Problems
Screwtape · 2024-11-13T22:57:41.974Z · comments (13)

Three Notions of "Power"
johnswentworth · 2024-10-30T06:10:08.326Z · comments (43)

A very strange probability paradox
notfnofn · 2024-11-22T14:01:36.587Z · comments (26)

(Salt) Water Gargling as an Antiviral
Elizabeth (pktechgirl) · 2024-11-22T18:00:02.765Z · comments (6)

[Intuitive self-models] 1. Preliminaries
Steven Byrnes (steve2152) · 2024-09-19T13:45:27.976Z · comments (20)

Research update: Towards a Law of Iterated Expectations for Heuristic Estimators
Eric Neyman (UnexpectedValues) · 2024-10-07T19:29:29.033Z · comments (2)

[link] Self-Help Corner: Loop Detection
adamShimi · 2024-10-02T08:33:23.487Z · comments (6)

GPT-o1
Zvi · 2024-09-16T13:40:06.236Z · comments (34)

There is a globe in your LLM
jacob_drori (jacobcd52) · 2024-10-08T00:43:40.300Z · comments (4)

5 homegrown EA projects, seeking small donors
Austin Chen (austin-chen) · 2024-10-28T23:24:25.745Z · comments (4)

Newsom Vetoes SB 1047
Zvi · 2024-10-01T12:20:06.127Z · comments (6)

← previous page (newer posts) · next page (older posts) →

Archive

Recent comments

jkaufman on Why Isn't Tesla Level 3?

I don't think this makes much sense. In a regulated industry, you want to build up a positive reputation and working relationship with the regulators, where they know what to expect from you, are familiar with your work and approach, have a sense of where you're going, and generally like and trust you. Engaging with him early and then repeatedly over a long period seems like a way better strategy than waiting until you have something extremely ambitious to try to get them to approve.

bronson-schoen on A shortcoming of concrete demonstrations as AGI risk advocacy

I am more optimistic that we can get such empirical evidence for at least the most important parts of the AI risk case, like deceptive alignment, and here's one reason as comment on offer:

Can you elaborate on what you were pointing to in the linked example? The thread specifically I’ve seen a few people mention recently but I seem to be missing the conclusion they’re drawing from it.

mondsemmel on MondSemmel's Shortform

Media is bizarre. Here is an article drawing tenuous connections between the recent assassin of a healthcare CEO with rationalism and effective altruism, and here is one who does the same with rationalism and Scott Alexander. Why, tho?

programcrafter on Understanding Shapley Values with Venn Diagrams

Shapley values are the ONLY way to guarantee: <Efficiency, Symmetry, Linearity, Null player properties>

Well it doesn't end at that: it turns out Shapley values for more than 2 players are not nicely behaved and instead violate Maximin Dominance, as demonstrated in https://www.lesswrong.com/posts/vJ7ggyjuP4u2yHNcP/threat-resistant-bargaining-megapost-introducing-the-rose#ROSE_Value__N_Player_Case__ [LW · GW].

The article I link showed how this is fixed:

Shapley values are about adding everyone one-by-one to a team in a random order and everyone gets their marginal value they contributed to the team.
And that's kinda like giving everyone a random initiative ordering and giving everyone the surplus they can extract in the resulting initiative game.
If we're doing that, then maybe a player, regardless of their position, can ensure they get their maximin value? Maybe this sort of Random-Order Surplus Extraction can work. ROSE.

bronson-schoen on A shortcoming of concrete demonstrations as AGI risk advocacy

crisis-mongering about risk when there is no demonstration/empirical evidence to ruin the initially perfect world pretty immediately

I think the key point of this post is precisely the question of “is there any such demonstration, short of the actual real very bad thing happening in a real setting that people who discount these as serious risks would accept as empirical evidence worth updating on?”

g-1 on Post-Quantum Investing: Dump Crypto for Index Funds and Real Estate?

I meant extrapolating developments in the future.

dagon on daijin's Shortform

Paying the (ongoing, repeated) pirate-game blackmail ("pay us or we'll impose a wealth tax") IS a form of wealth tax. You probably need to be more specific about what kinds and levels of wealth tax could happen with various counterfactual assumptions (without those assumptions, there's no reason to believe anything is possible except what actually exists).

will_pearson on Will_Pearson's Shortform

What do you think avout the core concept of Explanatory Fog, that is secrecy leading to distrust leading to a viral mental breakdown? Possibly leading eventually to the end of civlisation. Happy to rework it if the core concept is good.

ete on Second-Time Free

Good call! I have a use for this idea :)

bhauth on shoes with springs

This was a quick and short post, but some people ended up liking it a lot. In retrospect I should've written a bit more, maybe gone into the design of recent running shoes. For example, this Nike Alphafly has a somewhat thick heel made of springy foam that sticks out behind the heel of the foot, and in the front, there's a "carbon plate" (a thin sheet of carbon fiber composite) which also acts like a spring. In the future, there might be gradual evolution towards more extreme versions of the same concept, as recent designs become accepted. Running shoes with a carbon plate have become significantly more common over the past few years. That review says:

The energy return is noticeably greater than that of a shoe without any plating, especially when you lay down some serious power. And that stiffness doesn’t always compromise as much comfort as you’d think.

So that's the running-optimized version of shoes with springs using modern materials, while I was writing more about high heels worn for fashion.

Biomechanics is a topic I could write a lot about, but that would be a separate post. On the general topic of "walking" I also wrote this post [LW · GW]. (japanese version here)