LessWrong 2.0 Reader

Towards a Clever Hans Test: Unmasking Sentience Biases in Chatbot Interactions
glykokalyx · 2024-11-10T22:34:58.956Z · comments (0)
[question] Noticing the World
EvolutionByDesign (bioluminescent-darkness) · 2024-11-04T16:41:44.696Z · answers+comments (1)
[question] What (if anything) made your p(doom) go down in 2024?
Satron · 2024-11-16T16:46:43.865Z · answers+comments (6)
A better “Statement on AI Risk?”
Knight Lee (Max Lee) · 2024-11-25T04:50:29.399Z · comments (4)
Visualizing small Attention-only Transformers
WCargo (Wcargo) · 2024-11-19T09:37:42.213Z · comments (0)
[question] Is OpenAI net negative for AI Safety?
Lysandre Terrisse · 2024-11-02T16:18:02.859Z · answers+comments (0)
Antonym Heads Predict Semantic Opposites in Language Models
Jake Ward (jake-ward) · 2024-11-15T15:32:14.102Z · comments (0)
notes on prioritizing tasks & cognition-threads
Emrik (Emrik North) · 2024-11-26T00:28:03.400Z · comments (1)
[question] How might language influence how an AI "thinks"?
bodry (plosique) · 2024-10-30T17:41:04.460Z · answers+comments (0)
[link] AI Safety at the Frontier: Paper Highlights, October '24
gasteigerjo · 2024-10-31T00:09:33.522Z · comments (0)
[link] Higher Order Signs, Hallucination and Schizophrenia
Nicolas Villarreal (nicolas-villarreal) · 2024-11-02T16:33:10.574Z · comments (0)
(draft) Cyborg software should be open (?)
AtillaYasar (atillayasar) · 2024-11-01T07:24:51.966Z · comments (5)
[link] Both-Sidesism—When Fair & Balanced Goes Wrong
James Stephen Brown (james-brown) · 2024-11-02T03:04:03.820Z · comments (15)
[link] Decorated pedestrian tunnels
dkl9 · 2024-11-24T22:16:03.794Z · comments (3)
Beyond Gaussian: Language Model Representations and Distributions
Matt Levinson · 2024-11-24T01:53:38.156Z · comments (0)
LDT (and everything else) can be irrational
Christopher King (christopher-king) · 2024-11-06T04:05:36.932Z · comments (6)
Distributed espionage
margetmagenta · 2024-11-04T19:43:33.316Z · comments (0)
[link] When the Scientific Method Doesn't Really Help...
casualphysicsenjoyer (hatta_afiq) · 2024-11-27T19:52:30.023Z · comments (0)
The boat
RomanS · 2024-11-22T12:56:45.050Z · comments (0)
Interview with Bill O’Rourke - Russian Corruption, Putin, Applied Ethics, and More
JohnGreer · 2024-10-27T17:11:28.891Z · comments (0)
Hope to live or fear to die?
Knight Lee (Max Lee) · 2024-11-27T10:42:37.070Z · comments (0)
San Francisco ACX Meetup “First Saturday”
Nate Sternberg (nate-sternberg) · 2024-10-28T05:05:36.757Z · comments (0)
Your memory eventually drives confidence in each hypothesis to 1 or 0
Crazy philosopher (commissar Yarrick) · 2024-10-28T09:00:27.084Z · comments (6)
Reducing x-risk might be actively harmful
MountainPath · 2024-11-18T14:25:07.127Z · comments (5)
Should you increase AI alignment funding, or increase AI regulation?
Knight Lee (Max Lee) · 2024-11-26T09:17:01.809Z · comments (1)
aspirational leadership
dhruvmethi · 2024-11-20T16:07:43.507Z · comments (0)
Agenda Manipulation
Pazzaz · 2024-11-09T14:13:33.729Z · comments (0)
Root node of my posts
AtillaYasar (atillayasar) · 2024-11-19T20:09:02.973Z · comments (0)
Don't want Goodhart? — Specify the variables more
YanLyutnev (YanLutnev) · 2024-11-21T22:43:48.362Z · comments (2)
[question] Poll: what’s your impression of altruism?
David Gross (David_Gross) · 2024-11-09T20:28:15.418Z · answers+comments (4)
[question] Have we seen any "ReLU instead of sigmoid-type improvements" recently
KvmanThinking (avery-liu) · 2024-11-23T03:51:52.984Z · answers+comments (4)
[link] Some Preliminary Notes on the Promise of a Wisdom Explosion
Chris_Leong · 2024-10-31T09:21:11.623Z · comments (0)
Gothenburg LW/ACX meetup
Stefan (stefan-1) · 2024-10-29T20:40:22.754Z · comments (0)
Workshop Report: Why current benchmarks approaches are not sufficient for safety?
Tom DAVID (tom-david) · 2024-11-26T17:20:47.453Z · comments (0)
Which AI Safety Benchmark Do We Need Most in 2025?
Loïc Cabannes (loic-cabannes) · 2024-11-17T23:50:56.337Z · comments (2)
[link] Sparks of Consciousness
Charlie Sanders (charlie-sanders) · 2024-11-13T04:58:27.222Z · comments (0)
MIT FutureTech are hiring a Product and Data Visualization Designer
peterslattery · 2024-11-13T14:48:06.167Z · comments (0)
Breaking beliefs about saving the world
Oxidize · 2024-11-15T00:46:03.693Z · comments (3)
A Meritocracy of Taste
Daniele De Nuntiis (daniele-de-nuntiis) · 2024-11-28T09:10:10.598Z · comments (0)
[question] A Coordination Cookbook?
azergante · 2024-11-10T23:20:34.843Z · answers+comments (0)
AI alignment via civilizational cognitive updates
AtillaYasar (atillayasar) · 2024-11-10T09:33:35.023Z · comments (10)
Gothenburg LW/ACX meetup
Stefan (stefan-1) · 2024-11-24T19:40:52.215Z · comments (0)
Composition Circuits in Vision Transformers (Hypothesis)
phenomanon (ekg) · 2024-11-01T22:16:11.191Z · comments (0)
Zaragoza ACX/LW Meetup
Fernand0 · 2024-11-25T06:56:12.321Z · comments (0)
[link] Paradigm Shifts—change everything... except almost everything
James Stephen Brown (james-brown) · 2024-11-23T18:34:13.088Z · comments (0)
[question] Will Orion/Gemini 2/Llama-4 outperform o1
LuigiPagani (luigipagani) · 2024-11-18T21:15:55.953Z · answers+comments (3)
'Meta', 'mesa', and mountains
Lorec · 2024-10-31T17:25:53.635Z · comments (0)
Automated monitoring systems
hiki_t · 2024-11-28T18:54:29.886Z · comments (0)
Launching a 5-day Intro to Transformative AI course
bluedotimpact · 2024-11-22T17:45:05.304Z · comments (0)
Jakarta ACX December 2024 Meetup
Aud (aud) · 2024-11-19T15:01:31.101Z · comments (0)