LessWrong 2.0 Reader

View: New · Old · Top

Restrict date range: Today · This week · This month · Last three months · This year · All time

← previous page (newer posts) · next page (older posts) →

Anthropic's Certificate of Incorporation
Zach Stein-Perlman · 2024-06-12T13:00:30.806Z · comments (4)

Talent Needs of Technical AI Safety Teams
yams (william-brewer) · 2024-05-24T00:36:40.486Z · comments (64)

The LessWrong 2022 Review
habryka (habryka4) · 2023-12-05T04:00:00.000Z · comments (43)

Why I funded PIBBSS
Ryan Kidd (ryankidd44) · 2024-09-15T19:56:33.018Z · comments (21)

Current AIs Provide Nearly No Data Relevant to AGI Alignment
Thane Ruthenis · 2023-12-15T20:16:09.723Z · comments (155)

Mapping the semantic void: Strange goings-on in GPT embedding spaces
mwatkins · 2023-12-14T13:10:22.691Z · comments (31)

The Pareto Best and the Curse of Doom
Screwtape · 2024-02-21T23:10:01.359Z · comments (21)

Rationality Research Report: Towards 10x OODA Looping?
Raemon · 2024-02-24T21:06:38.703Z · comments (21)

[link] Gender Exploration
sapphire (deluks917) · 2024-01-14T18:57:32.893Z · comments (25)

[link] introduction to cancer vaccines
bhauth · 2024-05-05T01:06:16.972Z · comments (19)

Should CA, TX, OK, and LA merge into a giant swing state, just for elections?
Thomas Kwa (thomas-kwa) · 2024-11-06T23:01:48.992Z · comments (35)

Four visions of Transformative AI success
Steven Byrnes (steve2152) · 2024-01-17T20:45:46.976Z · comments (22)

How much to update on recent AI governance moves?
habryka (habryka4) · 2023-11-16T23:46:01.601Z · comments (5)

The Parable Of The Fallen Pendulum - Part 1
johnswentworth · 2024-03-01T00:25:00.111Z · comments (32)

[link] Practically A Book Review: Appendix to "Nonlinear's Evidence: Debunking False and Misleading Claims" (ThingOfThings)
tailcalled · 2024-01-03T17:07:13.990Z · comments (25)

Social status part 1/2: negotiations over object-level preferences
Steven Byrnes (steve2152) · 2024-03-05T16:29:07.143Z · comments (15)

The Pearly Gates
lsusr · 2024-05-30T04:01:14.198Z · comments (6)

Simple versus Short: Higher-order degeneracy and error-correction
Daniel Murfet (dmurfet) · 2024-03-11T07:52:46.307Z · comments (6)

Ten arguments that AI is an existential risk
KatjaGrace · 2024-08-13T17:00:03.397Z · comments (41)

[link] Please support this blog (with money)
Elizabeth (pktechgirl) · 2024-08-17T15:30:05.641Z · comments (2)

Introduction to French AI Policy
Lucie Philippon (lucie-philippon) · 2024-07-04T03:39:45.273Z · comments (12)

The case for more ambitious language model evals
Jozdien · 2024-01-30T00:01:13.876Z · comments (30)

Please stop using mediocre AI art in your posts
Raemon · 2024-08-25T00:13:52.890Z · comments (24)

[link] A primer on the current state of longevity research
Abhishaike Mahajan (abhishaike-mahajan) · 2024-08-22T17:14:57.990Z · comments (6)

You should go to ML conferences
Jan_Kulveit · 2024-07-24T11:47:52.214Z · comments (13)

Being nicer than Clippy
Joe Carlsmith (joekc) · 2024-01-16T19:44:23.893Z · comments (32)

What I Would Do If I Were Working On AI Governance
johnswentworth · 2023-12-08T06:43:42.565Z · comments (32)

' petertodd'’s last stand: The final days of open GPT-3 research
mwatkins · 2024-01-22T18:47:00.710Z · comments (16)

A Selection of Randomly Selected SAE Features
CallumMcDougall (TheMcDouglas) · 2024-04-01T09:09:49.235Z · comments (2)

Experiences and learnings from both sides of the AI safety job market
Marius Hobbhahn (marius-hobbhahn) · 2023-11-15T15:40:32.196Z · comments (4)

Clarifying METR's Auditing Role
Beth Barnes (beth-barnes) · 2024-05-30T18:41:56.029Z · comments (1)

The Leopold Model: Analysis and Reactions
Zvi · 2024-06-14T15:10:03.480Z · comments (19)

OthelloGPT learned a bag of heuristics
jylin04 · 2024-07-02T09:12:56.377Z · comments (10)

"AI Alignment" is a Dangerously Overloaded Term
Roko · 2023-12-15T14:34:29.850Z · comments (100)

Attitudes about Applied Rationality
Camille Berger (Camille Berger) · 2024-02-03T14:42:22.770Z · comments (18)

Fact Finding: Attempting to Reverse-Engineer Factual Recall on the Neuron Level (Post 1)
Neel Nanda (neel-nanda-1) · 2023-12-23T02:44:24.270Z · comments (8)

[question] How do you feel about LessWrong these days? [Open feedback thread]
jacobjacob · 2023-12-05T20:54:42.317Z · answers+comments (281)

2023 in AI predictions
jessicata (jessica.liu.taylor) · 2024-01-01T05:23:42.514Z · comments (35)

Discriminating Behaviorally Identical Classifiers: a model problem for applying interpretability to scalable oversight
Sam Marks (samuel-marks) · 2024-04-18T16:17:39.136Z · comments (10)

[link] Most smart and skilled people are outside of the EA/rationalist community: an analysis
titotal (lombertini) · 2024-07-12T12:13:56.215Z · comments (36)

[link] Perplexity wins my AI race
Elizabeth (pktechgirl) · 2024-08-24T19:20:10.859Z · comments (12)

The first future and the best future
KatjaGrace · 2024-04-25T06:40:04.510Z · comments (12)

Demystifying "Alignment" through a Comic
milanrosko · 2024-06-09T08:24:22.454Z · comments (19)

Skills I'd like my collaborators to have
Raemon · 2024-02-09T08:20:37.686Z · comments (9)

Danger, AI Scientist, Danger
Zvi · 2024-08-15T22:40:06.715Z · comments (9)

Why I'm doing PauseAI
Joseph Miller (Josephm) · 2024-04-30T16:21:54.156Z · comments (16)

New LessWrong feature: Dialogue Matching
jacobjacob · 2023-11-16T21:27:16.763Z · comments (22)

[question] What convincing warning shot could help prevent extinction from AI?
Charbel-Raphaël (charbel-raphael-segerie) · 2024-04-13T18:09:29.096Z · answers+comments (18)

[link] A case for AI alignment being difficult
jessicata (jessica.liu.taylor) · 2023-12-31T19:55:26.130Z · comments (56)

SAE reconstruction errors are (empirically) pathological
wesg (wes-gurnee) · 2024-03-29T16:37:29.608Z · comments (16)

← previous page (newer posts) · next page (older posts) →

The argument here is that there are two ways of proving ZFC + not Consistent(ZFC) is inconsistent. Either you prove not Consistent(ZFC) from axioms in ZFC or you contradict an axiom of ZFC from not Consistent(ZFC). The former is impossible by Godel's second incompleteness theorem. The ladder is equivalent to proving Consistent(ZFC) from an axiom of ZFC (its contrapositive), which is also impossible by Godel. ↩︎

LessWrong 2.0 Reader

Archive

Recent comments