LessWrong 2.0 Reader

How well do truth probes generalise?
mishajw · 2024-02-24T14:12:19.729Z · comments (11)
Rawls's Veil of Ignorance Doesn't Make Any Sense
Arjun Panickssery (arjun-panickssery) · 2024-02-24T13:18:46.802Z · comments (9)
[question] Can someone explain to me what went wrong with ChatGPT?
Valentin Baltadzhiev (valentin-baltadzhiev) · 2024-02-24T11:50:14.762Z · answers+comments (1)
The Sense Of Physical Necessity: A Naturalism Demo (Introduction)
LoganStrohl (BrienneYudkowsky) · 2024-02-24T02:56:31.458Z · comments (1)
Instrumental deception and manipulation in LLMs - a case study
Olli Järviniemi (jarviniemi) · 2024-02-24T02:07:01.769Z · comments (13)
A starting point for making sense of task structure (in machine learning)
Kaarel (kh) · 2024-02-24T01:51:49.227Z · comments (2)
[link] Why you, personally, should want a larger human population
jasoncrawford · 2024-02-23T19:48:10.526Z · comments (32)
Deliberative Cognitive Algorithms as Scaffolding
Cole Wyeth (Amyr) · 2024-02-23T17:15:26.424Z · comments (4)
The Shutdown Problem: Incomplete Preferences as a Solution
EJT (ElliottThornley) · 2024-02-23T16:01:16.378Z · comments (21)
In set theory, everything is a set
Jacob G-W (g-w1) · 2024-02-23T14:35:54.521Z · comments (9)
The role of philosophical thinking in understanding large language models: Calibrating and closing the gap between first-person experience and underlying mechanisms
Bill Benzon (bill-benzon) · 2024-02-23T12:19:34.851Z · comments (0)
Deep and obvious points in the gap between your thoughts and your pictures of thought
KatjaGrace · 2024-02-23T07:30:07.461Z · comments (6)
Parasocial relationship logic
KatjaGrace · 2024-02-23T07:30:05.475Z · comments (1)
Shaming with and without naming
KatjaGrace · 2024-02-23T07:30:03.862Z · comments (5)
Complexity of value but not disvalue implies more focus on s-risk. Moral uncertainty and preference utilitarianism also do.
Chi Nguyen · 2024-02-23T06:10:05.881Z · comments (18)
[question] Does increasing the power of a multimodal LLM get you an agentic AI?
yanni kyriacos (yanni) · 2024-02-23T04:14:56.464Z · answers+comments (3)
[link] The natural boundaries between people
Chipmonk · 2024-02-23T01:09:28.592Z · comments (2)
[link] Contra Ngo et al. “Every ‘Every Bay Area House Party’ Bay Area House Party”
Ricki Heicklen (bayesshammai) · 2024-02-22T23:56:02.318Z · comments (5)
AI #52: Oops
Zvi · 2024-02-22T21:50:07.393Z · comments (9)
[link] Embed your second brain in your first brain
dkl9 · 2024-02-22T21:46:51.091Z · comments (3)
The Gemini Incident
Zvi · 2024-02-22T21:00:04.594Z · comments (19)
[link] Some Thoughts On Using Auctions For Land Valuation
harsimony · 2024-02-22T19:54:55.357Z · comments (9)
The Binding of Isaac & Transparent Newcomb's Problem
suvjectibity · 2024-02-22T18:56:39.081Z · comments (0)
[link] Research Post: Tasks That Language Models Don’t Learn
Bruce W. Lee (bruce-lee) · 2024-02-22T18:52:32.237Z · comments (23)
Sora What
Zvi · 2024-02-22T18:10:05.397Z · comments (3)
Do sparse autoencoders find "true features"?
Demian Till · 2024-02-22T18:06:59.630Z · comments (33)
Everything Wrong with Roko's Claims about an Engineered Pandemic
EZ97 · 2024-02-22T15:59:08.439Z · comments (10)
The One and a Half Gemini
Zvi · 2024-02-22T13:10:04.725Z · comments (4)
[question] How do I make predictions about the future to make sense of what to do with my life?
Raj Thimmiah (raj-thimmiah) · 2024-02-22T11:22:50.583Z · answers+comments (1)
[link] How are voluntary commitments on vulnerability reporting going?
Adam Jones (domdomegg) · 2024-02-22T08:43:56.996Z · comments (1)
Notes on Internal Objectives in Toy Models of Agents
Paul Colognese (paul-colognese) · 2024-02-22T08:02:39.556Z · comments (0)
The Byronic Hero Always Loses
Cole Wyeth (Amyr) · 2024-02-22T01:31:59.652Z · comments (4)
Job Listing: Managing Editor / Writer
Gretta Duleba (gretta-duleba) · 2024-02-21T23:41:26.818Z · comments (2)
The Pareto Best and the Curse of Doom
Screwtape · 2024-02-21T23:10:01.359Z · comments (22)
[link] AISN #31: A New AI Policy Bill in California Plus, Precedents for AI Governance and The EU AI Office
aogara (Aidan O'Gara) · 2024-02-21T21:58:34.000Z · comments (0)
Analogies between scaling labs and misaligned superintelligent AI
scasper · 2024-02-21T19:29:39.033Z · comments (4)
[link] Extinction Risks from AI: Invisible to Science?
VojtaKovarik · 2024-02-21T18:07:33.986Z · comments (7)
Extinction-level Goodhart's Law as a Property of the Environment
VojtaKovarik · 2024-02-21T17:56:02.052Z · comments (0)
Dynamics Crucial to AI Risk Seem to Make for Complicated Models
VojtaKovarik · 2024-02-21T17:54:46.089Z · comments (0)
Which Model Properties are Necessary for Evaluating an Argument?
VojtaKovarik · 2024-02-21T17:52:58.083Z · comments (2)
Weak vs Quantitative Extinction-level Goodhart's Law
VojtaKovarik · 2024-02-21T17:38:15.375Z · comments (1)
Dual Wielding Kindle Scribes
mesaoptimizer · 2024-02-21T17:17:58.743Z · comments (18)
A Tale of Two Restaurant Types
Zvi · 2024-02-21T13:50:05.133Z · comments (0)
Less Wrong automated systems are inadvertently Censoring me
Roko · 2024-02-21T12:57:16.955Z · comments (52)
[question] What is the research speed multiplier of the most advanced current LLMs?
wunan · 2024-02-21T12:39:11.034Z · answers+comments (2)
Jailbreaking GPT-4 with the tool API
mishajw · 2024-02-21T11:16:02.484Z · comments (2)
Gut Renovating Another Bathroom
jefftk (jkaufman) · 2024-02-21T03:00:04.787Z · comments (0)
Thoughts for and against an ASI figuring out ethics for itself
sweenesm · 2024-02-20T23:40:56.770Z · comments (10)
AI #51: Altman’s Ambition
Zvi · 2024-02-20T19:50:07.439Z · comments (5)
The Third Gemini
Zvi · 2024-02-20T19:50:05.195Z · comments (2)