LessWrong 2.0 Reader

Towards a Clever Hans Test: Unmasking Sentience Biases in Chatbot Interactions
glykokalyx · 2024-11-10T22:34:58.956Z · comments (0)

[question] Noticing the World
EvolutionByDesign (bioluminescent-darkness) · 2024-11-04T16:41:44.696Z · answers+comments (1)

[question] What (if anything) made your p(doom) go down in 2024?
Satron · 2024-11-16T16:46:43.865Z · answers+comments (6)

A better “Statement on AI Risk?”
Knight Lee (Max Lee) · 2024-11-25T04:50:29.399Z · comments (4)

Visualizing small Attention-only Transformers
WCargo (Wcargo) · 2024-11-19T09:37:42.213Z · comments (0)

[question] Is OpenAI net negative for AI Safety?
Lysandre Terrisse · 2024-11-02T16:18:02.859Z · answers+comments (0)

Antonym Heads Predict Semantic Opposites in Language Models
Jake Ward (jake-ward) · 2024-11-15T15:32:14.102Z · comments (0)

notes on prioritizing tasks & cognition-threads
Emrik (Emrik North) · 2024-11-26T00:28:03.400Z · comments (1)

[question] How might language influence how an AI "thinks"?
bodry (plosique) · 2024-10-30T17:41:04.460Z · answers+comments (0)

[link] AI Safety at the Frontier: Paper Highlights, October '24
gasteigerjo · 2024-10-31T00:09:33.522Z · comments (0)

[link] Higher Order Signs, Hallucination and Schizophrenia
Nicolas Villarreal (nicolas-villarreal) · 2024-11-02T16:33:10.574Z · comments (0)

(draft) Cyborg software should be open (?)
AtillaYasar (atillayasar) · 2024-11-01T07:24:51.966Z · comments (5)

[link] Both-Sidesism—When Fair & Balanced Goes Wrong
James Stephen Brown (james-brown) · 2024-11-02T03:04:03.820Z · comments (15)

[link] Decorated pedestrian tunnels
dkl9 · 2024-11-24T22:16:03.794Z · comments (3)

Beyond Gaussian: Language Model Representations and Distributions
Matt Levinson · 2024-11-24T01:53:38.156Z · comments (0)

LDT (and everything else) can be irrational
Christopher King (christopher-king) · 2024-11-06T04:05:36.932Z · comments (6)

Distributed espionage
margetmagenta · 2024-11-04T19:43:33.316Z · comments (0)

[link] When the Scientific Method Doesn't Really Help...
casualphysicsenjoyer (hatta_afiq) · 2024-11-27T19:52:30.023Z · comments (0)

The boat
RomanS · 2024-11-22T12:56:45.050Z · comments (0)

Interview with Bill O’Rourke - Russian Corruption, Putin, Applied Ethics, and More
JohnGreer · 2024-10-27T17:11:28.891Z · comments (0)

Hope to live or fear to die?
Knight Lee (Max Lee) · 2024-11-27T10:42:37.070Z · comments (0)

San Francisco ACX Meetup “First Saturday”
Nate Sternberg (nate-sternberg) · 2024-10-28T05:05:36.757Z · comments (0)

Your memory eventually drives confidence in each hypothesis to 1 or 0
Crazy philosopher (commissar Yarrick) · 2024-10-28T09:00:27.084Z · comments (6)

Reducing x-risk might be actively harmful
MountainPath · 2024-11-18T14:25:07.127Z · comments (5)

Should you increase AI alignment funding, or increase AI regulation?
Knight Lee (Max Lee) · 2024-11-26T09:17:01.809Z · comments (1)

aspirational leadership
dhruvmethi · 2024-11-20T16:07:43.507Z · comments (0)

Agenda Manipulation
Pazzaz · 2024-11-09T14:13:33.729Z · comments (0)

Root node of my posts
AtillaYasar (atillayasar) · 2024-11-19T20:09:02.973Z · comments (0)

Don't want Goodhart? — Specify the variables more
YanLyutnev (YanLutnev) · 2024-11-21T22:43:48.362Z · comments (2)

[question] Poll: what’s your impression of altruism?
David Gross (David_Gross) · 2024-11-09T20:28:15.418Z · answers+comments (4)

[question] Have we seen any "ReLU instead of sigmoid-type improvements" recently
KvmanThinking (avery-liu) · 2024-11-23T03:51:52.984Z · answers+comments (4)

[link] Some Preliminary Notes on the Promise of a Wisdom Explosion
Chris_Leong · 2024-10-31T09:21:11.623Z · comments (0)

Gothenburg LW/ACX meetup
Stefan (stefan-1) · 2024-10-29T20:40:22.754Z · comments (0)

Workshop Report: Why current benchmarks approaches are not sufficient for safety?
Tom DAVID (tom-david) · 2024-11-26T17:20:47.453Z · comments (0)

Which AI Safety Benchmark Do We Need Most in 2025?
Loïc Cabannes (loic-cabannes) · 2024-11-17T23:50:56.337Z · comments (2)

[link] Sparks of Consciousness
Charlie Sanders (charlie-sanders) · 2024-11-13T04:58:27.222Z · comments (0)

MIT FutureTech are hiring a Product and Data Visualization Designer
peterslattery · 2024-11-13T14:48:06.167Z · comments (0)

Breaking beliefs about saving the world
Oxidize · 2024-11-15T00:46:03.693Z · comments (3)

A Meritocracy of Taste
Daniele De Nuntiis (daniele-de-nuntiis) · 2024-11-28T09:10:10.598Z · comments (0)

[question] A Coordination Cookbook?
azergante · 2024-11-10T23:20:34.843Z · answers+comments (0)

AI alignment via civilizational cognitive updates
AtillaYasar (atillayasar) · 2024-11-10T09:33:35.023Z · comments (10)

Gothenburg LW/ACX meetup
Stefan (stefan-1) · 2024-11-24T19:40:52.215Z · comments (0)

Composition Circuits in Vision Transformers (Hypothesis)
phenomanon (ekg) · 2024-11-01T22:16:11.191Z · comments (0)

Zaragoza ACX/LW Meetup
Fernand0 · 2024-11-25T06:56:12.321Z · comments (0)

[link] Paradigm Shifts—change everything... except almost everything
James Stephen Brown (james-brown) · 2024-11-23T18:34:13.088Z · comments (0)

[question] Will Orion/Gemini 2/Llama-4 outperform o1
LuigiPagani (luigipagani) · 2024-11-18T21:15:55.953Z · answers+comments (3)

'Meta', 'mesa', and mountains
Lorec · 2024-10-31T17:25:53.635Z · comments (0)

Automated monitoring systems
hiki_t · 2024-11-28T18:54:29.886Z · comments (0)

Launching a 5-day Intro to Transformative AI course
bluedotimpact · 2024-11-22T17:45:05.304Z · comments (0)

Jakarta ACX December 2024 Meetup
Aud (aud) · 2024-11-19T15:01:31.101Z · comments (0)


Recent comments

xpym on You are not too "irrational" to know your preferences.

Now it would certainly be tempting to define rationality as something like “only taking actions that you endorse in the long term”, but I’d be cautious of that.

Indeed, and there's another big reason for that - trying to always override your short-term "monkey brain" impulses just doesn't work that well for most people. That's the root of akrasia, which certainly isn't a problem that self-identified rationalists are immune to. What seems to be a better approach is to find compromises, where you develop workable long-term strategies which involve neither unlimited amounts of proverbial ice cream, nor total abstinence.

But I think that quite a few people who care about “health” actually care about not appearing low status by doing things that everyone knows are unhealthy.

Which is a good thing, in this particular case, yes? That's cultural evolution properly doing its job, as far as I'm concerned.

yanling-guo on How Universal Basic Income Could Help Us Build a Brighter Future

It doesn’t make sense to argue about definitions. If that is how you define UBI, then that is what UBI means for you. I’m actively pushing for a redefinition of UBI, or reshaping the policy as I said, because I think it’s the right thing to do.

Did I reply in such perfect English that it sounded like it was corrected by ChatGPT? Cheers to my English, which has improved so much! 🥂

the-gears-to-ascension on [bounty $100] Why are there no interesting (1D, 2-state) quantum cellular automata?

What is a concise intro that will teach me everything I need to know for understanding every expression here? I'm also asking Claude, interested in input from people with useful physics textbook taste

q-home on Making a conservative case for alignment

I think there should be more spaces where controversial ideas can be debated. I'm not against spaces without pronoun rules, just don't think every place should be like this. Also, if we create a space for political debate, we need to really make sure that the norms don't punish everyone who opposes centrism & the right. (Over-sensitive norms like "if you said that some opinion is transphobic you're uncivil/shaming/manipulative and should get banned" might do this.) Otherwise it's not free speech either. Will just produce another Grey or Red Tribe instead of Red/Blue/Grey debate platform.

I do think progressives underestimate free speech damage. To me it's the biggest issue with the Left. Though I don't think they're entirely wrong about free speech.

For example, imagine I have trans employees. Another employee (X) refuses to use pronouns, in principle (using pronouns is not the same as accepting progressive gender theories). Why? Maybe X thinks my trans employees live such a great lie that using pronouns is already an unacceptable concession. Or maybe X thinks that even trying to switch "he" & "she" is too much work, and I'm not justified in asking to do that work because of absolute free speech. Those opinions seem unnecessarily strong and they're at odds with the well-being of my employees, my work environment. So what now? Also, if pronouns are an unacceptable concession, why isn't calling a trans woman by her female name an unacceptable concession?

Imagine I don't believe something about a minority, so I start avoiding words which might suggest otherwise. If I don't believe that gay love can be as true as straight love, I avoid the word "love" (in reference to gay people or to anybody) at work. If I don't believe that women are as smart as men, I avoid the word "master" / "genius" (in reference to women or anybody) at work. It can get pretty silly. Will predictably cost me certain jobs.

sinclair-chen on Sinclair Chen's Shortform

we completely dominate dogs. society treats them well because enough humans love dogs.

I do think that cooperation between people is the origin of religion, and its moral rulesets which create tiny little societies that can hunt stags.

sinclair-chen on Sinclair Chen's Shortform

I definitely think that if I was not conscious then I would not coherently want things. But the fact that conscious minds are the only things that can truly care does not mean that conscious minds are the only things we should terminally care about.

The close circle composition isn't enough to justify Singerian altruism from egoist assumptions, because of the value falloff. With each degree of connection, I love the stranger less.

sinclair-chen on Sinclair Chen's Shortform

I didn't use the word "ethics" in my comment, so are you making a definitional statement, to distinguish between [universal value system] and [subjective value system] or just authoritatively saying that I'm wrong?

Are you claiming moral realism? I don't really believe that. If "ethics" is global, why should I care about "ethics"? Sorry if that sounds callous, I do actually care about the world, just trying to pin down what you mean.

shankar-sivarajan on Why Don't We Just... Shoggoth+Face+Paraphraser?

I suspect the real reason is stopping competitors fine-tuning on o1's CoT, which they also come right out and say:

Therefore, after weighing multiple factors including user experience, competitive advantage, and the option to pursue the chain of thought monitoring

anaguma on keltan's Shortform

What signal do we get from DeepSeek continuing to publish?

johnswentworth on leogao's Shortform

the number one spontaneous conversation is "what are you working on" or "what have you done so far", which forces you to re-explain what you're doing & the reasons for doing it to a skeptical & ignorant audience

I'm very curious if others also find this to be the biggest value-contributor amongst spontaneous conversations. (Also, more generally, I'm curious what kinds of spontaneous conversations people are getting so much value out of.)