LessWrong 2.0 Reader

Chat Bankman-Fried: an Exploration of LLM Alignment in Finance
claudia.biancotti · 2024-11-18T09:38:35.723Z · comments (2)

[link] Internal music player: phenomenology of earworms
dkl9 · 2024-11-14T23:29:48.383Z · comments (2)

[link] Against AI As An Existential Risk
Noah Birnbaum (daniel-birnbaum) · 2024-07-30T19:10:41.156Z · comments (13)

Retrieval Augmented Genesis
João Ribeiro Medeiros (joao-ribeiro-medeiros) · 2024-10-01T20:18:01.836Z · comments (0)

[link] SCP Foundation - Anti memetic Division Hub
landscape_kiwi · 2024-09-15T13:40:52.691Z · comments (1)

'Chat with impactful research & evaluations' (Unjournal NotebookLMs)
david reinstein (david-reinstein) · 2024-09-28T00:32:16.845Z · comments (0)

Does “Ultimate Neartermism” via Eternal Inflation dominate Longtermism in expectation?
Jordan Arel · 2024-08-17T22:28:21.849Z · comments (1)

Thoughts on Evo-Bio Math and Mesa-Optimization: Maybe We Need To Think Harder About "Relative" Fitness?
Lorec · 2024-09-28T14:07:42.412Z · comments (6)

Inquisitive vs. adversarial rationality
gb (ghb) · 2024-09-18T13:50:09.198Z · comments (9)

The Pragmatic Side of Cryptographically Boxing AI
Bart Jaworski (bart-jaworski) · 2024-08-06T17:46:21.754Z · comments (0)

A small improvement to Wikipedia page on Pareto Efficiency
ektimo · 2024-11-18T02:13:49.151Z · comments (0)

Another UFO Bet
codyz · 2024-11-01T01:55:27.301Z · comments (11)

[question] Practical advice for secure virtual communication post easy AI voice-cloning?
hmys (the-cactus) · 2024-08-09T17:32:33.458Z · answers+comments (5)

[question] why won't this alignment plan work?
KvmanThinking (avery-liu) · 2024-10-10T15:44:59.450Z · answers+comments (7)

Modelling Social Exchange: A Systematised Method to Judge Friendship Quality
Wynn Walker · 2024-08-04T18:49:30.892Z · comments (0)

[Research log] The board of Alphabet would stop DeepMind to save the world
Lucie Philippon (lucie-philippon) · 2024-07-16T04:59:14.874Z · comments (0)

[question] Is School of Thought related to the Rationality Community?
Shoshannah Tekofsky (DarkSym) · 2024-10-15T12:41:33.224Z · answers+comments (6)

Limitations on the Interpretability of Learned Features from Sparse Dictionary Learning
Tom Angsten (tom-angsten) · 2024-07-30T16:36:06.518Z · comments (0)

[question] Can agents coordinate on randomness without outside sources?
Mikhail Samin (mikhail-samin) · 2024-07-06T13:43:44.633Z · answers+comments (16)

[link] Optimising under arbitrarily many constraint equations
dkl9 · 2024-09-12T14:59:28.475Z · comments (0)

[question] Request for AI risk quotes, especially around speed, large impacts and black boxes
Nathan Young · 2024-08-02T17:49:48.898Z · answers+comments (0)

GPT4o is still sensitive to user-induced bias when writing code
Reed (ThomasReed) · 2024-09-22T21:04:54.717Z · comments (0)

Thirty random thoughts about AI alignment
Lysandre Terrisse · 2024-09-15T16:24:10.572Z · comments (1)

Exploring Shard-like Behavior: Empirical Insights into Contextual Decision-Making in RL Agents
Alejandro Aristizabal (alejandro-aristizabal) · 2024-09-29T00:32:42.161Z · comments (0)

[link] Memorising molecular structures
dkl9 · 2024-07-12T22:40:42.307Z · comments (0)

[question] how to truly feel my beliefs?
KvmanThinking (avery-liu) · 2024-11-11T00:04:30.994Z · answers+comments (6)

[question] Can subjunctive dependence emerge from a simplicity prior?
Daniel C (harper-owen) · 2024-09-16T12:39:35.543Z · answers+comments (0)

[link] AI Safety Newsletter #43: White House Issues First National Security Memo on AI Plus, AI and Job Displacement, and AI Takes Over the Nobels
Corin Katzke (corin-katzke) · 2024-10-28T16:03:39.258Z · comments (0)

How can I get over my fear of becoming an emulated consciousness?
James Dowdell (james-dowdell) · 2024-07-07T22:02:43.520Z · comments (8)

[link] Garrison Lovely: China Hawks are Manufacturing an AI Arms Race
Bogdan Ionut Cirstea (bogdan-ionut-cirstea) · 2024-11-20T10:13:37.070Z · comments (0)

LLMs stifle creativity, eliminate opportunities for serendipitous discovery and disrupt intergenerational transfer of wisdom
Ghdz (gal-hadad) · 2024-08-05T18:27:20.709Z · comments (2)

Notes on Tuning Metacognition
JoNeedsSleep (joanna-j-1) · 2024-07-03T19:54:59.732Z · comments (0)

Valence Need Not Be Bounded; Utility Need Not Synthesize
Lorec · 2024-11-20T01:37:20.911Z · comments (0)

A Taxonomy Of AI System Evaluations
Maxime Riché (maxime-riche) · 2024-08-19T09:07:45.224Z · comments (0)

Against Job Boards: Human Capital and the Legibility Trap
vaishnav92 · 2024-10-24T20:50:50.266Z · comments (1)

[link] Yet Another Critique of "Luxury Beliefs"
ymeskhout · 2024-07-18T18:37:28.703Z · comments (10)

The Existential Dread of Being a Powerful AI System
testingthewaters · 2024-09-26T10:56:32.904Z · comments (1)

[link] A (paraconsistent) logic to deal with inconsistent preferences
B Jacobs (Bob Jacobs) · 2024-07-14T11:17:45.426Z · comments (2)

[link] Metaculus's 'Minitaculus' Experiments — Collaborate With Us
ChristianWilliams · 2024-08-26T20:44:32.125Z · comments (0)

Activation Engineering Theories of Impact
kubanetics (jakub-nowak) · 2024-07-18T16:44:33.656Z · comments (1)

Spark in the Dark Guest Spots
jefftk (jkaufman) · 2024-07-14T01:40:05.311Z · comments (0)

[question] Why would ASI share any resources with us?
Satron · 2024-11-13T23:38:36.535Z · answers+comments (8)

Avoiding jailbreaks by discouraging their representation in activation space
Guido Bergman · 2024-09-27T17:49:20.785Z · comments (2)

[question] Opinions on Eureka Labs
jmh · 2024-07-17T00:16:02.959Z · answers+comments (2)

Halifax Canada - ACX Meetups Everywhere Fall 2024
interstice · 2024-08-29T18:39:12.490Z · comments (0)

Understanding Hidden Computations in Chain-of-Thought Reasoning
rokosbasilisk · 2024-08-24T16:35:03.907Z · comments (1)

[link] Solutions to problems with Bayesianism
B Jacobs (Bob Jacobs) · 2024-07-31T14:18:27.910Z · comments (0)

[question] How to cite LessWrong as an academic source?
PhilosophicalSoul (LiamLaw) · 2024-11-06T08:28:26.309Z · answers+comments (6)

Budapest Hungary - ACX Meetups Everywhere Fall 2024
Timothy Underwood (timothy-underwood-1) · 2024-08-29T18:37:41.313Z · comments (0)

[Aspiration-based designs] A. Damages from misaligned optimization – two more models
Jobst Heitzig · 2024-07-15T14:08:15.716Z · comments (0)

Recent comments

xpym on Making a conservative case for alignment

I'd say that atheism had already set the "conservatives not welcome" baseline way back when, and this resulted in the community norms evolving accordingly. Granted, these days the trans stuff is more salient, but the reason it flourished here even more than in other tech-adjacent spaces has much to do with that early baseline.

Ben Shapiro and Jordan Peterson have both said that the intellectual case for atheism is strong, and both remain very popular on the right.

Sure, but somebody admitting that certainly isn't the modal conservative.

gesild-muka on What are the good rationality films?

Children of Men (2006) comes to mind: a movie about a small group of people in a dying world who have the means to benefit humanity and provide hope for the future but can't agree on next steps. (The story is more nuanced but these bits seem relevant to rationality).

milan-w on U.S.-China Economic and Security Review Commission pushes Manhattan Project-style AI initiative

Upon further reflection: that big 3 lab soft nationalization scenario I speculated about will happen only if the recommendations end up being implemented with a minimum degree of competence. That is far from guaranteed to happen. Another possible implementation (which at this point I would not be all that surprised if it ended up happening) is "the Executive picks just one lab for some dumb political reason, hands them a ton of money under a vague contract, and then fails to provide any significant oversight".

sharmake-farah on If we solve alignment, do we die anyway?

The answer to this is that we'd rely on instrumental convergence to help us out, combined with adding more data/creating error-correcting mechanisms to prevent value drift from being a problem.

chaosmage on Twelve Virtues of Rationality

I love this very much, so I turned it into a poem that I think could be lyrics for a song (using tunes like "Amazing Grace" or "House of the Rising Sun") for people like the Bayesian Choir or occasions like the Secular Solstice.

https://sevensecularsermons.org/the-twelve-virtues-of-rationality

Maybe the part about the nameless virtue should be a chorus repeated after each of the first eleven, instead of a final stanza, as a reminder that this one comes before the others, and because songs with choruses are good?

ricraz on Anthropic: Three Sketches of ASL-4 Safety Case Components

Cool, ty for (characteristically) thoughtful engagement.

I am still intuitively skeptical about a bunch of your numbers but now it's the sort of feeling which I would also have if you were just reasoning more clearly than me about this stuff (that is, people who reason more clearly tend to be able to notice ways that interventions could be surprisingly high-leverage in confusing domains).

dakara on Thoughts on “AI is easy to control” by Pope & Belrose

Have you had any p(doom) updates since then or is it still around 5%?

irenictruth on What changes should happen in the HHS?

[I] suspect [vaccines] (or antibiotics) account for the majority of the value provided by the medical system

Though I agree that vaccines and antibiotics are extraordinarily beneficial and cost-effective interventions, I suspect you're missing essential value fountains in our medical system. Two that come to mind are surgery and emergency medicine.

I've spoken to several surgeons about their work, and they all said that one of the great things about their job is seeing the immediate and obvious benefits to patients. (Of course, surgery wouldn't be nearly as effective without antibiotics, so potentially, this smuggles something in.)

Emergency medicine also provides a lot of benefits. Someone was going to die from bleeding, and we sewed them up. Boom! We avoid a $2.5 million loss. Accidental deaths would be much higher in the US without emergency medicine personnel.

Another one to look into would be perinatal care. I haven't examined it, but I suspect it adds billions or trillions to the US economy by producing humans with higher baseline health and capacity.

viliam on Proposal to increase fertility: University parent clubs

I agree. The best advertisement for having kids is to see other people having kids. Not only because people instinctively copy others, but also because you can ask the parents the things you are curious about, or you can try to babysit their kids to get an idea what it would be like to have your own kids.

Also, the more places are parent-friendly, the less costly it is to become a parent. If your friends mostly socialize in loud places with lots of alcohol, starting a family will make you socially isolated, because you would not want to bring your kids to places like that. If instead your friends meet at a park, you can keep your social life and bring your kids along with you.

If many people meet at the same place, it can make sense to have a room specifically for kids, at least with some paper and crayons, so that the kids can play there and leave their parents alone for a moment. Also, one big box where people can bring toys they no longer need at home.

joseph-miller on What are the good rationality films?

Kinda a stretch, but Groundhog Day is about someone becoming stronger. Also just a great film.