LessWrong 2.0 Reader

Beyond ELO: Rethinking Chess Skill as a Multidimensional Random Variable
Oliver Oswald (oliver-oswald) · 2025-02-10T19:19:36.233Z · comments (6)
Bimodal AI Beliefs
Adam Train (aetrain) · 2025-02-14T06:45:53.933Z · comments (1)
[question] p(s-risks to contemporary humans)?
mhampton · 2025-02-08T21:19:53.821Z · answers+comments (5)
There are a lot of upcoming retreats/conferences between March and July (2025)
gergogaspar (gergo-gaspar) · 2025-02-18T09:30:30.258Z · comments (0)
[link] AI Safety at the Frontier: Paper Highlights, January '25
gasteigerjo · 2025-02-11T16:14:16.972Z · comments (0)
[question] Should I Divest from AI?
OKlogic · 2025-02-10T03:29:33.582Z · answers+comments (4)
Are current LLMs safe for psychotherapy?
PaperBike · 2025-02-12T19:16:34.452Z · comments (4)
[link] Teaching AI to reason: this year's most important story
Benjamin_Todd · 2025-02-13T17:40:02.869Z · comments (0)
Closed-ended questions aren't as hard as you think
electroswing · 2025-02-19T03:53:11.855Z · comments (0)
[link] AISN #48: Utility Engineering and EnigmaEval
Corin Katzke (corin-katzke) · 2025-02-18T19:15:16.751Z · comments (0)
Response to the US Govt's Request for Information Concerning Its AI Action Plan
Davey Morse (davey-morse) · 2025-02-14T06:14:08.673Z · comments (0)
What new x- or s-risk fieldbuilding organisations would you like to see? An EOI form. (FBB #3)
gergogaspar (gergo-gaspar) · 2025-02-17T12:39:09.196Z · comments (0)
OpenAI’s NSFW policy: user safety, harm reduction, and AI consent
8e9 · 2025-02-13T13:59:22.911Z · comments (3)
[link] How do you make a 250x better vaccine at 1/10 the cost? Develop it in India.
Abhishaike Mahajan (abhishaike-mahajan) · 2025-02-09T03:53:17.050Z · comments (5)
ML4Good Colombia - Applications Open to LatAm Participants
Alejandro Acelas (alejandro-acelas) · 2025-02-10T15:03:03.929Z · comments (0)
Claude 3.5 Sonnet (New)'s AGI scenario
Nathan Young · 2025-02-17T18:47:04.669Z · comments (2)
A fable on AI x-risk
bgaesop · 2025-02-18T20:15:24.933Z · comments (0)
Cross-Layer Feature Alignment and Steering in Large Language Models
dlaptev · 2025-02-08T20:18:20.331Z · comments (0)
Call for Applications: XLab Summer Research Fellowship
JoNeedsSleep (joanna-j-1) · 2025-02-18T19:19:20.155Z · comments (0)
Sparse Autoencoder Feature Ablation for Unlearning
aludert · 2025-02-13T19:13:48.388Z · comments (0)
Where Would Good Forecasts Most Help AI Governance Efforts?
Violet Hour · 2025-02-11T18:15:33.082Z · comments (0)
Rethinking AI Safety Approach in the Era of Open-Source AI
Weibing Wang (weibing-wang) · 2025-02-11T14:01:39.167Z · comments (0)
Intelligence Is Jagged
Adam Train (aetrain) · 2025-02-19T07:08:46.444Z · comments (0)
How identical twin sisters feel about nieces vs their own daughters
Dave Lindbergh (dave-lindbergh) · 2025-02-09T17:36:25.830Z · comments (19)
AI Safety Oversights
Davey Morse (davey-morse) · 2025-02-08T06:15:52.896Z · comments (0)
[link] Sparse Autoencoder Features for Classifications and Transferability
Shan23Chen (shan-chen) · 2025-02-18T22:14:12.994Z · comments (0)
[link] Probability of AI-Caused Disaster
Alvin Ånestrand (alvin-anestrand) · 2025-02-12T19:40:11.121Z · comments (2)
Artificial Static Place Intelligence: Guaranteed Alignment
ank · 2025-02-15T11:08:50.226Z · comments (2)
LW/ACX social meetup
Stefan (stefan-1) · 2025-02-10T21:12:39.092Z · comments (0)
Misaligned actions and what to do with them? - A proposed framework and open problems
Shivam · 2025-02-18T00:06:31.518Z · comments (0)
Intrinsic Dimension of Prompts in LLMs
Karthik Viswanathan (vkarthik095) · 2025-02-14T19:02:49.464Z · comments (0)
arch-anarchist reading list
Peter lawless · 2025-02-16T22:47:00.273Z · comments (1)
Arguing for the Truth? An Inference-Only Study into AI Debate
denisemester · 2025-02-11T03:04:58.852Z · comments (0)
Opinion Article Scoring System
ciaran · 2025-02-10T14:32:19.030Z · comments (0)
Quantifying the Qualitative: Towards a Bayesian Approach to Personal Insight
Pruthvi Kumar (pruthvi-kumar) · 2025-02-15T19:50:42.550Z · comments (0)
[link] Claude is More Anxious than GPT; Personality is an axis of interpretability in language models
future_detective · 2025-02-10T19:19:28.005Z · comments (2)
Preference for uncertainty and impact overestimation bias in altruistic systems.
Luck (luck-1) · 2025-02-15T12:27:05.474Z · comments (0)
Gradient Anatomy's - Hallucination Robustness in Medical Q&A
DieSab (diego-sabajo) · 2025-02-12T19:16:58.949Z · comments (0)
Places of Loving Grace
ank · 2025-02-18T23:49:18.580Z · comments (0)
[question] Programming Language Early Funding?
J Thomas Moros (J_Thomas_Moros) · 2025-02-16T17:34:06.058Z · answers+comments (3)
Positive Directions
G Wood (geoffrey-wood) · 2025-02-11T00:00:11.426Z · comments (0)
eHeaven 1st, eGod 2nd: Multiversal AI Alignment & Rational Utopia
ank · 2025-02-13T22:35:28.300Z · comments (0)
Permanent properties of things are a self-fulfilling prophecy
YanLyutnev (YanLutnev) · 2025-02-19T00:08:20.776Z · comments (0)
[link] Baumol effect vs Jevons paradox
Hzn · 2025-02-10T08:28:05.982Z · comments (0)
[link] LLMs can teach themselves to better predict the future
Ben Turtel (ben-turtel) · 2025-02-13T01:01:12.175Z · comments (1)
the dumbest theory of everything
lostinwilliamsburg · 2025-02-13T07:57:38.842Z · comments (0)
[link] Sea Change
Charlie Sanders (charlie-sanders) · 2025-02-18T06:03:06.961Z · comments (2)
CyberEconomy. The Limits to Growth
Timur Sadekov (timur-sadekov) · 2025-02-16T21:02:34.040Z · comments (0)
Preserving Epistemic Novelty in AI: Experiments, Insights, and the Case for Decentralized Collective Intelligence
Andy E Williams (andy-e-williams) · 2025-02-08T10:25:27.891Z · comments (8)
Paranoia, Cognitive Biases, and Catastrophic Thought Patterns.
Spiritus Dei (spiritus-dei) · 2025-02-14T00:13:56.300Z · comments (1)