LessWrong 2.0 Reader

[link] A new process for mapping discussions
Nathan Young · 2024-09-30T08:57:20.029Z · comments (7)

[link] Arithmetic Models: Better Than You Think
kqr · 2024-10-26T09:42:07.185Z · comments (5)

[link] AI Safety at the Frontier: Paper Highlights, August '24
gasteigerjo · 2024-09-03T19:17:24.850Z · comments (0)

[link] Our Digital and Biological Children
Eneasz · 2024-10-24T18:36:38.719Z · comments (0)

Trading Candy
jefftk (jkaufman) · 2024-11-01T01:10:08.024Z · comments (4)

[link] Generic advice caveats
Saul Munn (saul-munn) · 2024-10-30T21:03:07.185Z · comments (1)

Why is there Nothing rather than Something?
Logan Zoellner (logan-zoellner) · 2024-10-26T12:37:50.204Z · comments (3)

[link] If-Then Commitments for AI Risk Reduction [by Holden Karnofsky]
habryka (habryka4) · 2024-09-13T19:38:53.194Z · comments (0)

An AI crash is our best bet for restricting AI
Remmelt (remmelt-ellen) · 2024-10-11T02:12:03.491Z · comments (3)

[question] What prevents SB-1047 from triggering on deep fake porn/voice cloning fraud?
ChristianKl · 2024-09-26T09:17:39.088Z · answers+comments (21)

the Daydication technique
chaosmage · 2024-10-18T21:47:46.448Z · comments (0)

[link] Evaluating Synthetic Activations composed of SAE Latents in GPT-2
Giorgi Giglemiani (Rakh) · 2024-09-25T20:37:48.227Z · comments (0)

Bay Winter Solstice 2024: song leading auditions
tcheasdfjkl · 2024-11-10T23:59:08.199Z · comments (0)

Investigating Sensitive Directions in GPT-2: An Improved Baseline and Comparative Analysis of SAEs
Daniel Lee (daniel-lee) · 2024-09-06T02:28:41.954Z · comments (0)

Interpretability of SAE Features Representing Check in ChessGPT
Jonathan Kutasov (jonathan-kutasov) · 2024-10-05T20:43:36.679Z · comments (2)

[link] Predicting Influenza Abundance in Wastewater Metagenomic Sequencing Data
jefftk (jkaufman) · 2024-09-23T17:25:58.380Z · comments (0)

European Progress Conference
Martin Sustrik (sustrik) · 2024-10-06T11:10:03.819Z · comments (11)

Distinguishing ways AI can be "concentrated"
Matthew Barnett (matthew-barnett) · 2024-10-21T22:21:13.666Z · comments (2)

Domain-specific SAEs
jacob_drori (jacobcd52) · 2024-10-07T20:15:38.584Z · comments (0)

[question] Any real toeholds for making practical decisions regarding AI safety?
lukehmiles (lcmgcd) · 2024-09-29T12:03:08.084Z · answers+comments (6)

There aren't enough smart people in biology doing something boring
Abhishaike Mahajan (abhishaike-mahajan) · 2024-10-21T15:52:04.482Z · comments (13)

Superintelligence Can't Solve the Problem of Deciding What You'll Do
Vladimir_Nesov · 2024-09-15T21:03:28.077Z · comments (11)

SAEs you can See: Applying Sparse Autoencoders to Clustering
Robert_AIZI · 2024-10-28T14:48:16.744Z · comments (0)

Standard SAEs Might Be Incoherent: A Choosing Problem & A “Concise” Solution
Kola Ayonrinde (kola-ayonrinde) · 2024-10-30T22:50:45.642Z · comments (0)

Sleeping on Stage
jefftk (jkaufman) · 2024-10-22T00:50:07.994Z · comments (3)

[question] Seeking AI Alignment Tutor/Advisor: $100–150/hr
MrThink (ViktorThink) · 2024-10-05T21:28:16.491Z · answers+comments (3)

Option control
Joe Carlsmith (joekc) · 2024-11-04T17:54:03.073Z · comments (0)

[link] Can a Bayesian Oracle Prevent Harm from an Agent? (Bengio et al. 2024)
mattmacdermott · 2024-09-01T07:46:26.647Z · comments (0)

[link] Care Doesn't Scale
stavros · 2024-10-28T11:57:38.742Z · comments (1)

The causal backbone conjecture
tailcalled · 2024-08-17T18:50:14.577Z · comments (0)

Thinking in 2D
sarahconstantin · 2024-10-20T19:30:05.842Z · comments (0)

SAE features for refusal and sycophancy steering vectors
neverix · 2024-10-12T14:54:48.022Z · comments (4)

[link] A brief history of the automated corporation
owencb · 2024-11-04T14:35:04.906Z · comments (1)

LessWrong email subscriptions?
Raemon · 2024-08-27T21:59:56.855Z · comments (6)

Just because an LLM said it doesn't mean it's true: an illustrative example
dirk (abandon) · 2024-08-21T21:05:59.691Z · comments (12)

Do Sparse Autoencoders (SAEs) transfer across base and finetuned language models?
Taras Kutsyk · 2024-09-29T19:37:30.465Z · comments (8)

[link] Death notes - 7 thoughts on death
Nathan Young · 2024-10-28T15:01:13.532Z · comments (1)

[link] SB 1047 gets vetoed
ryan_b · 2024-09-30T15:49:38.609Z · comments (1)

[question] When engaging with a large amount of resources during a literature review, how do you prevent yourself from becoming overwhelmed?
corruptedCatapillar · 2024-11-01T07:29:49.262Z · answers+comments (2)

[question] When can I be numerate?
FinalFormal2 · 2024-09-12T04:05:27.710Z · answers+comments (3)

Trying to be rational for the wrong reasons
Viliam · 2024-08-20T16:18:06.385Z · comments (8)

[link] Fictional parasites very different from our own
Abhishaike Mahajan (abhishaike-mahajan) · 2024-09-08T14:59:39.080Z · comments (0)

Abstractions are not Natural
Alfred Harwood · 2024-11-04T11:10:09.023Z · comments (21)

Linkpost: "Imagining and building wise machines: The centrality of AI metacognition" by Johnson, Karimi, Bengio, et al.
Chris_Leong · 2024-11-11T16:13:26.504Z · comments (6)

You're Playing a Rough Game
jefftk (jkaufman) · 2024-10-17T19:20:06.251Z · comments (2)

[link] Beware the science fiction bias in predictions of the future
Nikita Sokolsky (nikita-sokolsky) · 2024-08-19T05:32:47.372Z · comments (20)

How to put California and Texas on the campaign trail!
Yair Halberstadt (yair-halberstadt) · 2024-11-06T06:08:25.673Z · comments (4)

[link] UK AISI: Early lessons from evaluating frontier AI systems
Zach Stein-Perlman · 2024-10-25T19:00:21.689Z · comments (0)

[link] A primer on the next generation of antibodies
Abhishaike Mahajan (abhishaike-mahajan) · 2024-09-01T22:37:59.207Z · comments (0)

A suite of Vision Sparse Autoencoders
Louka Ewington-Pitsos (louka-ewington-pitsos) · 2024-10-27T04:05:20.377Z · comments (0)

The argument here is that there are two ways of proving that ZFC + not Consistent(ZFC) is inconsistent. Either you prove not Consistent(ZFC) from the axioms of ZFC, or you contradict an axiom of ZFC starting from not Consistent(ZFC). The former is impossible by Gödel's second incompleteness theorem. The latter is equivalent to proving Consistent(ZFC) from an axiom of ZFC (its contrapositive), which is also impossible by Gödel's second incompleteness theorem.
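
One way to make the core of this argument precise is as a short chain of equivalences. This is a sketch, not the original author's derivation: it assumes ZFC is consistent and classical first-order logic, and the symbol \mathrm{Con}(\mathrm{ZFC}) (standing for the arithmetized statement "Consistent(ZFC)") is notation introduced here for illustration.

```latex
\begin{align*}
\mathrm{ZFC} + \neg\mathrm{Con}(\mathrm{ZFC}) \vdash \bot
  &\iff \mathrm{ZFC} \vdash \neg\neg\mathrm{Con}(\mathrm{ZFC})
  && \text{(deduction theorem)} \\
  &\iff \mathrm{ZFC} \vdash \mathrm{Con}(\mathrm{ZFC})
  && \text{(double negation, classical logic)}
\end{align*}
% Goedel's second incompleteness theorem: if ZFC is consistent,
% then ZFC does not prove Con(ZFC).
% Hence, under the assumption that ZFC is consistent,
% ZFC + not-Con(ZFC) cannot be shown inconsistent.
```

Read this way, any proof that ZFC + not Consistent(ZFC) is inconsistent would amount to a proof of Consistent(ZFC) inside ZFC, which is exactly what the second incompleteness theorem rules out.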
