LessWrong 2.0 Reader


Scattered thoughts on what it means for an LLM to believe
TheManxLoiner · 2024-11-06T22:10:29.429Z · comments (4)
Apply to be a mentor in SPAR!
agucova · 2024-11-05T21:32:45.797Z · comments (0)
Using Narrative Prompting to Extract Policy Forecasts from LLMs
Max Ghenis (MaxGhenis) · 2024-11-05T04:37:52.004Z · comments (0)
If I care about measure, choices have additional burden (+AI generated LW-comments)
avturchin · 2024-11-15T10:27:15.212Z · comments (11)
Project Adequate: Seeking Cofounders/Funders
Lorec · 2024-11-17T03:12:12.995Z · comments (7)
Educational CAI: Aligning a Language Model with Pedagogical Theories
Bharath Puranam (bharath-puranam) · 2024-11-01T18:55:26.993Z · comments (1)
[link] Is P(Doom) Meaningful? Bayesian vs. Popperian Epistemology Debate
Liron · 2024-11-09T23:39:30.039Z · comments (0)
Bellevue Library Meetup - Nov 23
Cedar (xida-ren) · 2024-11-09T23:05:02.452Z · comments (3)
Effects of Non-Uniform Sparsity on Superposition in Toy Models
Shreyans Jain (shreyans-jain) · 2024-11-14T16:59:43.234Z · comments (3)
Towards a Clever Hans Test: Unmasking Sentience Biases in Chatbot Interactions
glykokalyx · 2024-11-10T22:34:58.956Z · comments (0)
[question] Is OpenAI net negative for AI Safety?
Lysandre Terrisse · 2024-11-02T16:18:02.859Z · answers+comments (0)
Some Comments on Recent AI Safety Developments
testingthewaters · 2024-11-09T16:44:58.936Z · comments (0)
Ways to think about alignment
Abhimanyu Pallavi Sudhir (abhimanyu-pallavi-sudhir) · 2024-10-27T01:40:50.762Z · comments (0)
Germany-wide ACX Meetup
Fernand0 · 2024-11-17T10:08:54.584Z · comments (0)
[link] Entropic strategy in Two Truths and a Lie
dkl9 · 2024-11-21T22:03:28.986Z · comments (2)
[question] What (if anything) made your p(doom) go down in 2024?
Satron · 2024-11-16T16:46:43.865Z · answers+comments (6)
Visualizing small Attention-only Transformers
WCargo (Wcargo) · 2024-11-19T09:37:42.213Z · comments (0)
[question] Noticing the World
EvolutionByDesign (bioluminescent-darkness) · 2024-11-04T16:41:44.696Z · answers+comments (1)
[question] What are the primary drivers that caused selection pressure for intelligence in humans?
Towards_Keeperhood (Simon Skade) · 2024-11-07T09:40:20.275Z · answers+comments (15)
What are Emotions?
Myles H (zarsou9) · 2024-11-15T04:20:27.388Z · comments (13)
[question] How might language influence how an AI "thinks"?
bodry (plosique) · 2024-10-30T17:41:04.460Z · answers+comments (0)
LDT (and everything else) can be irrational
Christopher King (christopher-king) · 2024-11-06T04:05:36.932Z · comments (6)
(draft) Cyborg software should be open (?)
AtillaYasar (atillayasar) · 2024-11-01T07:24:51.966Z · comments (5)
San Francisco ACX Meetup “First Saturday”
Nate Sternberg (nate-sternberg) · 2024-10-28T05:05:36.757Z · comments (0)
Antonym Heads Predict Semantic Opposites in Language Models
Jake Ward (jake-ward) · 2024-11-15T15:32:14.102Z · comments (0)
Distributed espionage
margetmagenta · 2024-11-04T19:43:33.316Z · comments (0)
Reducing x-risk might be actively harmful
MountainPath · 2024-11-18T14:25:07.127Z · comments (5)
[link] Higher Order Signs, Hallucination and Schizophrenia
Nicolas Villarreal (nicolas-villarreal) · 2024-11-02T16:33:10.574Z · comments (0)
Beyond Gaussian: Language Model Representations and Distributions
Matt Levinson · 2024-11-24T01:53:38.156Z · comments (0)
Enabling New Applications with Today's Mechanistic Interpretability Toolkit
ananya_joshi · 2024-10-25T17:53:23.960Z · comments (0)
[link] Paradigm Shifts—change everything... except almost everything
James Stephen Brown (james-brown) · 2024-11-23T18:34:13.088Z · comments (0)
[link] Both-Sidesism—When Fair & Balanced Goes Wrong
James Stephen Brown (james-brown) · 2024-11-02T03:04:03.820Z · comments (15)
Your memory eventually drives confidence in each hypothesis to 1 or 0
Crazy philosopher (commissar Yarrick) · 2024-10-28T09:00:27.084Z · comments (6)
[link] AI Safety at the Frontier: Paper Highlights, October '24
gasteigerjo · 2024-10-31T00:09:33.522Z · comments (0)
Interview with Bill O’Rourke - Russian Corruption, Putin, Applied Ethics, and More
JohnGreer · 2024-10-27T17:11:28.891Z · comments (0)
[link] Some Preliminary Notes on the Promise of a Wisdom Explosion
Chris_Leong · 2024-10-31T09:21:11.623Z · comments (0)
Which AI Safety Benchmark Do We Need Most in 2025?
Loïc Cabannes (loic-cabannes) · 2024-11-17T23:50:56.337Z · comments (2)
Root node of my posts
AtillaYasar (atillayasar) · 2024-11-19T20:09:02.973Z · comments (0)
Gothenburg LW/ACX meetup
Stefan (stefan-1) · 2024-10-29T20:40:22.754Z · comments (0)
aspirational leadership
dhruvmethi · 2024-11-20T16:07:43.507Z · comments (0)
Breaking beliefs about saving the world
Oxidize · 2024-11-15T00:46:03.693Z · comments (3)
MIT FutureTech are hiring ‍a Product and Data Visualization Designer
peterslattery · 2024-11-13T14:48:06.167Z · comments (0)
[link] Sparks of Consciousness
Charlie Sanders (charlie-sanders) · 2024-11-13T04:58:27.222Z · comments (0)
Don't want Goodhart? — Specify the variables more
YanLyutnev (YanLutnev) · 2024-11-21T22:43:48.362Z · comments (2)
The boat
RomanS · 2024-11-22T12:56:45.050Z · comments (0)
Agenda Manipulation
Pazzaz · 2024-11-09T14:13:33.729Z · comments (0)
[question] Poll: what’s your impression of altruism?
David Gross (David_Gross) · 2024-11-09T20:28:15.418Z · answers+comments (4)
Truth Terminal: A reconstruction of events
crvr.fr (crdevio) · 2024-11-17T23:51:21.279Z · comments (1)
Modeling AI-driven occupational change over the next 10 years and beyond
2120eth · 2024-11-12T04:58:26.741Z · comments (0)
'Meta', 'mesa', and mountains
Lorec · 2024-10-31T17:25:53.635Z · comments (0)