LessWrong 2.0 Reader

Automating Mechanistic Interpretability via Program Synthesis
Edy Nastase (edy-nastase) · 2025-04-17T10:58:46.748Z · comments (1)

Memory Decoding Journal Club
Devin Ward (Carboncopies Foundation) · 2025-04-17T16:19:25.992Z · comments (0)

AI, Alignment & the Art of Relationship Design
Priyanka Bharadwaj (priyanka-bharadwaj) · 2025-04-19T00:47:02.591Z · comments (0)

Machines of Stolen Grace
Riley Tavassoli (riley-tavassoli) · 2025-03-27T18:15:23.736Z · comments (0)

I’m headed to DC this week. any tips?
Wes R · 2025-04-19T02:33:18.584Z · comments (0)

Could LLMs Learn to Detect Bias Autonomously, Like Tesla’s Self-Driving Cars?
Omnipheasant · 2025-04-18T18:45:36.242Z · comments (0)

Hierarchical Cognitive Anchoring: A Sketch Toward Scalable Structural Alignment
sparckix · 2025-04-18T19:03:51.115Z · comments (0)

Alignment Does Not Need to Be Opaque! An Introduction to Feature Steering with Reinforcement Learning
Jeremias Ferrao (jeremias-ferrao) · 2025-04-18T19:34:49.357Z · comments (0)

[question] How familiar is the Lesswrong community as a whole with the concept of Reward-modelling?
Oxidize · 2025-04-09T23:33:18.044Z · answers+comments (8)

Routine Novelty
BazingaBoy (martin-nenov) · 2025-03-31T15:47:05.217Z · comments (0)

A Fraction of Global Market Capitalization as the Best Currency
Greenless Mirror (mikhail-2) · 2025-03-31T13:30:03.970Z · comments (25)

Does the universe's recognition of measurement provide stronger evidence for being in a simulation than universal fine-tuning?
amelia (314159) · 2025-04-09T08:20:10.561Z · comments (0)

AI Needs Us? Information Theory and Humans as data
tomdekan (tomd@hey.com) · 2025-03-29T15:51:16.070Z · comments (6)

LLM-based Fact Checking for Popular Posts?
azergante · 2025-04-18T21:26:25.230Z · comments (0)

[link] Six reasons why objective morality is nonsense
Zero Contradictions · 2025-04-11T02:11:04.775Z · comments (10)

[question] How many times faster can the AGI advance the science than humans do?
StanislavKrym · 2025-03-28T15:16:52.320Z · answers+comments (0)

[link] Rethinking Friction: Equity and Motivation Across Domains
eltimbalino · 2025-04-08T03:58:02.839Z · comments (0)

Do we want too much from a potentially godlike AGI?
StanislavKrym · 2025-04-11T23:33:06.710Z · comments (0)

[link] find_purpose.exe
heatdeathandtaxes · 2025-04-12T19:31:38.951Z · comments (0)

Alignment through atomic agents
micseydel · 2025-03-27T18:43:14.569Z · comments (0)

On Downvotes, Cultural Fit, and Why I Won’t Be Posting Again
funnyfranco · 2025-03-31T19:26:27.090Z · comments (32)

Would this solve the (outer) alignment problem, or at least help?
Wes R · 2025-04-06T18:49:14.145Z · comments (1)

An argument for asexuality
filthy_hedonist (sid-kolichala) · 2025-03-27T18:08:48.624Z · comments (10)

What If Galaxies Are Alive and Atoms Have Minds? A Thought Experiment on Life Across Scales
Saif Khan (saif-khan) · 2025-04-18T10:01:18.783Z · comments (4)

A Solution to Sandbagging and other Self-Provable Misalignment: Constitutional AI Detectives
Knight Lee (Max Lee) · 2025-04-14T10:27:24.903Z · comments (2)

[link] The Cynic Wasps in the Beehive
mempko · 2025-04-12T19:30:44.227Z · comments (0)

Why Does It Feel Like Something? An Evolutionary Path to Subjectivity
gmax (maxim-gurevich) · 2025-04-15T08:38:50.637Z · comments (11)

[question] Is the ethics of interaction with primitive peoples already solved?
StanislavKrym · 2025-04-11T14:56:21.306Z · answers+comments (0)

Will the AGIs be able to run the civilisation?
StanislavKrym · 2025-03-28T04:50:07.568Z · comments (2)

A New Challenge to all Bayesians!
milanrosko · 2025-04-02T02:38:35.562Z · comments (0)

8 PRIME SKILLS An analisis
P. João (gabriel-brito) · 2025-04-17T11:36:54.678Z · comments (0)

Reframing AI Safety Through the Lens of Identity Maintenance Framework
Hiroshi Yamakawa (hiroshi-yamakawa) · 2025-04-01T06:16:45.228Z · comments (1)

How to defeat superintelligence, the Sta-Hi way
kilgoar (william-walshe) · 2025-04-09T13:58:59.541Z · comments (0)

Karel Čapek’s 'War with the Newts' 1936 review
Petr 'Margot' Andreev (petr-andreev) · 2025-04-04T23:12:39.572Z · comments (1)

Ai Cone of Probabilties - what aren't we talking about?
Marzipan · 2025-04-05T05:51:27.859Z · comments (5)

Null Rationalism
kilgoar (william-walshe) · 2025-04-05T03:26:06.034Z · comments (0)

Insect Suffering Is The Biggest Issue: What To Do About It
omnizoid · 2025-04-01T12:51:08.115Z · comments (9)

Not The End of All Value
Ben Ihrig (eternal/ephemera) · 2025-04-10T20:53:36.671Z · comments (0)

[question] How far are Western welfare states from coddling the population into becoming useless?
StanislavKrym · 2025-04-13T17:08:01.834Z · answers+comments (5)

An Unbiased Evaluation of My Debate with Thane Ruthenis - Run It Yourself
funnyfranco · 2025-04-07T18:56:47.831Z · comments (14)


Recent comments

romeostevensit on Three Months In, Evaluating Three Rationalist Cases for Trump

I think the major impacts that matter are on war, pandemic risk, and x-risk. I rarely see anyone try to figure those out; perhaps the sign is too uncertain due to complexity.

jkaufman on Risers for Foot Percussion

I did see your comment on FB! I'm still thinking about what I want to try next. I'm worried that silicone with your method would tear, though.

hpcfung on Rationalist Should Win. Not Dying with Dignity and Funding WBE.

I'm also interested. Have you made any progress since your comment?

lc on Three Months In, Evaluating Three Rationalist Cases for Trump

The doubling down is delusional but I think you're simplifying the failure of projection a bit. The inability of markets and forecasters to predict Trump's second term is quite interesting. A lot of different models of politics failed.

gjm on o3 Will Use Its Tools For You

Pedantic note: there are many instances of "syncopathy" that I am fairly sure should be "sycophancy".

(It's an understandable mistake -- "syncopathy" is composed of familiar components, which could plausibly be put together to mean something like "the disease of agreeing too much" which is, at least in the context of AI, not far off what sycophancy in fact means. Whereas if you can parse "sycophancy" at all you might work out that it means "fig-showing" which obviously has nothing to do with anything. So far as I can tell, no one actually knows how "fig-showing" came to be the term for servile flattery.)

michaeldickens on Planning for Extreme AI Risks

I think the right way to self-destruct isn't to shut down entirely. It's to spend all your remaining assets on safety (whether that be lobbying for regulations, or research, or whatever). This would greatly increase the total amount of money spent on safety efforts so it might help quite a lot.

I do believe shutting down does have a decent chance, although not a comfortingly large one, of scaring government and/or other AI companies into taking the risks seriously.

anthonyc on What Makes an AI Startup "Net Positive" for Safety?

I won't comment on your specific startup, but I wonder in general how an AI Safety startup becomes a successful business. What's the business model? Who is the target customer? Why do they buy? Unless the goal is to get acquired by one of the big labs, in which case, sure, but again, why or when do they buy, and at what price? Especially since they already don't seem to be putting much effort into solving the problem themselves despite having better tools and more money to do so than any new entrant startup.

anthonyc on Three Months In, Evaluating Three Rationalist Cases for Trump

I really, really hope at some point the Democrats will acknowledge the reason they lost is that they failed to persuade the median voter of their ideas, and/or adopt ideas that appeal to said voters. At least among those I interact with, there seems to be a denial of the idea that this is how you win elections, which is a prerequisite for governing.

saidachmiz on A Dissent on Honesty

The hard cases are much more interesting. What about lying to my landlord about renting a room on airbnb? What about saying your class will make people millionaires for the low low price of $1,000 (hey, it could happen)? What about hiding the rats from the health inspector?

None of these seem like hard cases to me. Lying is wrong (and pretty obviously so) in all three of these cases.

anthonyc on Why Does It Feel Like Something? An Evolutionary Path to Subjectivity

That seems very possible to me, and if and when we can show whether something like that is the case, I do think it would represent significant progress. If nothing else, it would help tell us what the thing we need to be examining actually is, in a way we don't currently have an easy way to specify.