LessWrong 2.0 Reader

next page (older posts) →

[link] The rise of AI in cybercrime
BobyResearcher · 2023-07-30T20:19:34.867Z · comments (1)
SSA vs. SIA: how future population may provide evidence for or against the foundations of political liberalism
[deleted] · 2023-07-30T20:18:59.444Z · comments (10)
[link] Rationalization Maximizes Expected Value
Kevin Dorst · 2023-07-30T20:11:26.377Z · comments (10)
Apollo Neuro Results
Elizabeth (pktechgirl) · 2023-07-30T18:40:05.213Z · comments (16)
Hilbert's Triumph, Church and Turing's failure, and what it means (Post #2)
Noosphere89 (sharmake-farah) · 2023-07-30T14:33:25.180Z · comments (16)
[question] Specific Arguments against open source LLMs?
Iknownothing · 2023-07-30T14:27:13.116Z · answers+comments (2)
Socialism in large organizations
Adam Zerner (adamzerner) · 2023-07-30T07:25:57.736Z · comments (16)
How to make real-money prediction markets on arbitrary topics (Outdated)
yutaka · 2023-07-30T02:11:47.050Z · comments (13)
[question] Does decidability of a theory imply completeness of the theory?
Noosphere89 (sharmake-farah) · 2023-07-29T23:53:08.166Z · answers+comments (12)
[question] If I showed the EQ-SQ theory's findings to be due to measurement bias, would anyone change their minds about it?
tailcalled · 2023-07-29T19:38:13.285Z · answers+comments (13)
Self-driving car bets
paulfchristiano · 2023-07-29T18:10:01.112Z · comments (41)
[link] The Parable of the Dagger - The Animation
Writer · 2023-07-29T14:03:12.023Z · comments (6)
Are Guitars Obsolete?
jefftk (jkaufman) · 2023-07-29T13:20:01.482Z · comments (8)
NAMSI: A promising approach to alignment
[deleted] · 2023-07-29T07:03:51.930Z · comments (6)
Understanding and Aligning a Human-like Inductive Bias with Cognitive Science: a Review of Related Literature
Claire Short (claire-short) · 2023-07-29T06:10:38.353Z · comments (0)
[link] Universal and Transferable Adversarial Attacks on Aligned Language Models [paper link]
Sodium · 2023-07-29T03:21:15.477Z · comments (0)
[link] Why You Should Never Update Your Beliefs
Arjun Panickssery (arjun-panickssery) · 2023-07-29T00:27:01.899Z · comments (17)
Thoughts about the Mechanistic Interpretability Challenge #2 (EIS VII #2)
RGRGRG · 2023-07-28T20:44:36.868Z · comments (5)
Because of LayerNorm, Directions in GPT-2 MLP Layers are Monosemantic
ojorgensen · 2023-07-28T19:43:12.235Z · comments (3)
When can we trust model evaluations?
evhub · 2023-07-28T19:42:21.799Z · comments (9)
Yes, It's Subjective, But Why All The Crabs?
johnswentworth · 2023-07-28T19:35:36.741Z · comments (15)
Semaglutide and Muscle
5hout · 2023-07-28T18:36:22.036Z · comments (14)
Double Crux in a Box
Screwtape · 2023-07-28T17:55:08.794Z · comments (3)
[link] AI Safety 101 : Introduction to Vision Interpretability
jeanne_ (jeanne_s) · 2023-07-28T17:32:11.545Z · comments (0)
Visible loss landscape basins don't correspond to distinct algorithms
Mikhail Samin (mikhail-samin) · 2023-07-28T16:19:05.279Z · comments (13)
[link] Progress links digest, 2023-07-28: The decadent opulence of modern capitalism
jasoncrawford · 2023-07-28T14:36:26.382Z · comments (3)
AI Awareness through Interaction with Blatantly Alien Models
VojtaKovarik · 2023-07-28T08:41:07.776Z · comments (5)
You don't get to have cool flaws
Neil (neil-warren) · 2023-07-28T05:37:31.414Z · comments (16)
Reducing sycophancy and improving honesty via activation steering
Nina Rimsky (NinaR) · 2023-07-28T02:46:23.122Z · comments (14)
Mech Interp Puzzle 2: Word2Vec Style Embeddings
Neel Nanda (neel-nanda-1) · 2023-07-28T00:50:00.297Z · comments (4)
[link] ETFE windows
bhauth · 2023-07-28T00:46:55.556Z · comments (4)
A Short Memo on AI Interpretability Rainbows
scasper · 2023-07-27T23:05:50.196Z · comments (0)
Pulling the Rope Sideways: Empirical Test Results
Daniel Kokotajlo (daniel-kokotajlo) · 2023-07-27T22:18:01.072Z · comments (18)
[link] A $10k retroactive grant for VaccinateCA
Austin Chen (austin-chen) · 2023-07-27T18:14:44.305Z · comments (0)
Preference Aggregation as Bayesian Inference
beren · 2023-07-27T17:59:36.270Z · comments (1)
AI #22: Into the Weeds
Zvi · 2023-07-27T17:40:02.184Z · comments (8)
[link] SSA rejects anthropic shadow, too
jessicata (jessica.liu.taylor) · 2023-07-27T17:25:17.728Z · comments (38)
[question] What are examples of someone doing a lot of work to find the best of something?
chanamessinger (cmessinger) · 2023-07-27T15:58:02.114Z · answers+comments (15)
[link] AI-Plans.com 10-day Critique-a-Thon
Iknownothing · 2023-07-27T11:44:01.660Z · comments (2)
Privacy in a Digital World
Faustify (nikolay-blagoev) · 2023-07-27T10:46:38.887Z · comments (0)
[link] Cultivating a state of mind where new ideas are born
Henrik Karlsson (henrik-karlsson) · 2023-07-27T09:16:42.566Z · comments (18)
[link] Partial Transcript of Recent Senate Hearing Discussing AI X-Risk
Daniel_Eth · 2023-07-27T09:16:01.168Z · comments (0)
AXRP Episode 24 - Superalignment with Jan Leike
DanielFilan · 2023-07-27T04:00:02.106Z · comments (3)
[question] Have you ever considered taking the 'Turing Test' yourself?
Super AGI (super-agi) · 2023-07-27T03:48:30.407Z · answers+comments (6)
AXRP Episode 23 - Mechanistic Anomaly Detection with Mark Xu
DanielFilan · 2023-07-27T01:50:02.808Z · comments (0)
GPT-4 can catch subtle cross-language translation mistakes
Michael Tontchev (michael-tontchev-1) · 2023-07-27T01:39:23.492Z · comments (1)
Social Balance through Embracing Social Credit
dhruvv · 2023-07-26T20:07:02.953Z · comments (9)
[link] Why no Roman Industrial Revolution?
jasoncrawford · 2023-07-26T19:34:41.682Z · comments (30)
Why you can't treat decidability and complexity as a constant (Post #1)
Noosphere89 (sharmake-farah) · 2023-07-26T17:54:33.294Z · comments (13)
A response to the Richards et al.'s "The Illusion of AI's Existential Risk"
Harrison Fell (harrison-fell) · 2023-07-26T17:34:20.409Z · comments (0)