LessWrong 2.0 Reader

View: New · Old · Top

Restrict date range: Today · This week · This month · Last three months · This year · All time

← previous page (newer posts) · next page (older posts) →

Some Rules for an Algebra of Bayes Nets
johnswentworth · 2023-11-16T23:53:11.650Z · comments (35)

Survey for alignment researchers!
Cameron Berg (cameron-berg) · 2024-02-02T20:41:44.323Z · comments (11)

Testbed evals: evaluating AI safety even when it can’t be directly measured
joshc (joshua-clymer) · 2023-11-15T19:00:41.908Z · comments (2)

FarmKind's Illusory Offer
jefftk (jkaufman) · 2024-08-09T11:30:07.082Z · comments (5)

LW Frontpage Experiments! (aka "Take the wheel, Shoggoth!")
Ruby · 2024-04-23T03:58:43.443Z · comments (27)

Do sparse autoencoders find "true features"?
Demian Till · 2024-02-22T18:06:59.630Z · comments (33)

Dumbing down
Martin Sustrik (sustrik) · 2024-06-09T06:50:47.469Z · comments (0)

Instruction-following AGI is easier and more likely than value aligned AGI
Seth Herd · 2024-05-15T19:38:03.185Z · comments (25)

If we solve alignment, do we die anyway?
Seth Herd · 2024-08-23T13:13:10.933Z · comments (68)

Linking Alt Accounts
jefftk (jkaufman) · 2023-10-06T17:00:09.802Z · comments (33)

Automation collapse
Geoffrey Irving · 2024-10-21T14:50:54.500Z · comments (10)

[link] Yoshua Bengio: Reasoning through arguments against taking AI safety seriously
Judd Rosenblatt (judd) · 2024-07-11T23:53:17.187Z · comments (3)

[link] [Repost] The Copenhagen Interpretation of Ethics
mesaoptimizer · 2024-01-25T15:20:08.162Z · comments (4)

Update on Chinese IQ-related gene panels
Lao Mein (derpherpize) · 2023-12-14T10:12:21.212Z · comments (7)

[link] If far-UV is so great, why isn't it everywhere?
Austin Chen (austin-chen) · 2024-10-19T18:56:58.910Z · comments (23)

Eliezer's example on Bayesian statistics is wr... oops!
Zane · 2023-10-17T18:38:18.327Z · comments (13)

[link] OpenAI: Preparedness framework
Zach Stein-Perlman · 2023-12-18T18:30:10.153Z · comments (23)

[link] The True Story of How GPT-2 Became Maximally Lewd
Writer · 2024-01-18T21:03:08.167Z · comments (7)

Adam Optimizer Causes Privileged Basis in Transformer LM Residual Stream
Diego Caples (diego-caples) · 2024-09-06T17:55:34.265Z · comments (7)

Epistemic Hell
rogersbacon · 2024-01-27T17:13:09.578Z · comments (20)

[link] Paper: Understanding and Controlling a Maze-Solving Policy Network
TurnTrout · 2023-10-13T01:38:09.147Z · comments (0)

“Artificial General Intelligence”: an extremely brief FAQ
Steven Byrnes (steve2152) · 2024-03-11T17:49:02.496Z · comments (6)

[link] The Inner Ring by C. S. Lewis
Saul Munn (saul-munn) · 2024-04-24T22:48:09.228Z · comments (6)

[link] Investigating an insurance-for-AI startup
L Rudolf L (LRudL) · 2024-09-21T15:29:10.083Z · comments (0)

[link] Motivation gaps: Why so much EA criticism is hostile and lazy
titotal (lombertini) · 2024-04-22T11:49:59.389Z · comments (5)

Transcoders enable fine-grained interpretable circuit analysis for language models
Jacob Dunefsky (jacob-dunefsky) · 2024-04-30T17:58:09.982Z · comments (14)

Text Posts from the Kids Group: 2020
jefftk (jkaufman) · 2024-04-13T22:30:05.326Z · comments (3)

[link] InterLab – a toolkit for experiments with multi-agent interactions
Tomáš Gavenčiak (tomas-gavenciak) · 2024-01-22T18:23:35.661Z · comments (0)

High-level interpretability: detecting an AI's objectives
Paul Colognese (paul-colognese) · 2023-09-28T19:30:16.753Z · comments (4)

AXRP Episode 27 - AI Control with Buck Shlegeris and Ryan Greenblatt
DanielFilan · 2024-04-11T21:30:04.244Z · comments (10)

The King and the Golem - The Animation
Writer · 2024-11-08T18:23:10.935Z · comments (0)

Game Theory without Argmax [Part 1]
Cleo Nardo (strawberry calm) · 2023-11-11T15:59:47.486Z · comments (18)

[link] [Paper] A is for Absorption: Studying Feature Splitting and Absorption in Sparse Autoencoders
chanind · 2024-09-25T09:31:03.296Z · comments (15)

Finding Sparse Linear Connections between Features in LLMs
Logan Riggs (elriggs) · 2023-12-09T02:27:42.456Z · comments (5)

Flagging Potentially Unfair Parenting
jefftk (jkaufman) · 2023-12-26T12:40:05.099Z · comments (1)

[link] Davidad's Provably Safe AI Architecture - ARIA's Programme Thesis
simeon_c (WayZ) · 2024-02-01T21:30:44.090Z · comments (17)

[link] Video lectures on the learning-theoretic agenda
Vanessa Kosoy (vanessa-kosoy) · 2024-10-27T12:01:32.777Z · comments (0)

How We Picture Bayesian Agents
johnswentworth · 2024-04-08T18:12:48.595Z · comments (14)

Analyzing DeepMind's Probabilistic Methods for Evaluating Agent Capabilities
Axel Højmark (hojmax) · 2024-07-22T16:17:07.665Z · comments (0)

[link] Former OpenAI Superalignment Researcher: Superintelligence by 2030
Julian Bradshaw · 2024-06-05T03:35:19.251Z · comments (30)

Multiplex Gene Editing: Where Are We Now?
sarahconstantin · 2024-07-16T20:50:04.590Z · comments (6)

[link] [Link Post] "Foundational Challenges in Assuring Alignment and Safety of Large Language Models"
David Scott Krueger (formerly: capybaralet) (capybaralet) · 2024-06-06T18:55:09.151Z · comments (2)

[link] We're all in this together
Tamsin Leake (carado-1) · 2023-12-05T13:57:46.270Z · comments (65)

Estimating Tail Risk in Neural Networks
Mark Xu (mark-xu) · 2024-09-13T20:00:06.921Z · comments (9)

[link] The 2nd Demographic Transition
Maxwell Tabarrok (maxwell-tabarrok) · 2024-04-06T14:10:13.095Z · comments (17)

What is it to solve the alignment problem?
Joe Carlsmith (joekc) · 2024-08-24T21:19:34.280Z · comments (17)

How useful is "AI Control" as a framing on AI X-Risk?
habryka (habryka4) · 2024-03-14T18:06:30.459Z · comments (4)

[Summary] Progress Update #1 from the GDM Mech Interp Team
Neel Nanda (neel-nanda-1) · 2024-04-19T19:06:17.755Z · comments (0)

Brief notes on the Wikipedia game
Olli Järviniemi (jarviniemi) · 2024-07-14T02:28:22.473Z · comments (9)

MATS AI Safety Strategy Curriculum
Ronny Fernandez (ronny-fernandez) · 2024-03-07T19:59:37.434Z · comments (2)

← previous page (newer posts) · next page (older posts) →

Archive

Recent comments

benito on More Dakka

This is the canonical More Dakka post, but it doesn't include the origin.

Quoting the TV Tropes page on More Dakka:

The trope namer is Warhammer 40,000, where it is the Ork onomatopoeia for machine gun firing, and their general term for rapid fire capacity: "dakka-dakka-dakka-dakka...".

Here's a color version of the image on the page, that honestly conveys the heuristic to me quite effectively. It's very stupid, but man it's good to just go more dakka when marginal effort is continuing to improve things.

rhollerith_dot_com on quila's Shortform

The CNS contains dozens of "feedback loops". Any intervention that drastically alters the equilibrium point of several of those loops is generally a bad idea unless you are doing it to get out of some dire situation, e.g., seizures. That's my recollection of Huberman's main objection put into my words (because I dont recall his words).

Also, melatonin is expensive to measure quantitatively, so the amount tends to vary a lot from what is on the label. In particular, there are reputational consequences and possible legal consequences to a brand's having been found to have less than the label says, so brands tend to err on the side of putting too much melatonin in per pill, which ends up often being manyfold more than the label says.

anders-lindstroem on The lying p value

yeah, that comic summarize it all!

As a side note, I wonder how many would get their PhD degree if the requirement was to publish 3-4 papers (2 single author and 1-2 co-author) were the main result (where it's applicable) needed to have p<0.01? Perhaps the paper publishing frenzy would slow down a little bit if the monograph came into fashion again?

atillayasar on AI alignment via civilizational cognitive updates

I agree that morality and consensus are in principle not the same -- Nazis or any evil society is an easy counterexample.
(One could argue Nazis did not have the consensus of the entire world, but you can then just imagine a fully evil population.)

But for one, simply rejecting civilization and consensus based on "you have no rigorous definition of them and also look at Genghis Khan/Nazis this proves that governments are evil" is like, basically, putting the burden of proof on the side that is arguing for civilization and common sense morality, which is suspicious.

I'm open to alternatives, but just saying "governments can be evil therefor I reject it, full stop" is not that helpful for discourse. Like what do you wanna do, just abolish civilization?

So consider a handwavy view of morality and of what a "good civilization" looks like. Let's assume common sense morality is correct, and that mostly everyone understands what this means: "don't steal, don't hurt people, leave people alone unless they're doing bad things, don't be a sex pervert, etc.". And assume most people agree with this and want to live by this. Then when you have consensus, meaning most people are observing and agreeing that civilization (or practically speaking, the area or community over which they have a decent amount of influence), is abiding by "common sense morality", then everything is basically moral and fine.

(I also want to point out that caring too much on pinpointing what morality means exactly and how to put it into words, distracts from solving practical problems where it's extremely obvious what is morally going wrong, but where you have to sort out the "implementation details".)

quila on quila's Shortform

recommend against supplementing melatonin

why?

i searched Andrew Huberman melatonin and found this, though it looks like it may be an AI generated summary.

atillayasar on AI alignment via civilizational cognitive updates

Despite being "into" AI safety for a while, I haven't picked a side. I do believe it's extremely important and deserves more attention and I believe that AI actually could kill everyone in less 5 years.

But any effort spent on pinning down one's "p(doom)" is not spent usefully on things like: how to actually make AI safe, how AI works, how to approach this problem as a civilization/community, how to think about this problem. And, as was my intention with this article, "how to think about things in general, and how to make philosophical progress".

rhollerith_dot_com on Thomas Kwa's Shortform

Are Eliezer and Nate right that continuing the AI program will almost certainly lead to extinction or something approximately as disastrous as extinction?

rhollerith_dot_com on quila's Shortform

A lot of people e.g. Andrew Huberman (who recommends many supplements for cognitive enhancement and other ends) recommend against supplementing melatonin except to treat insomnia that has failed to respond to many other interventions.

brendan-long on The Humanitarian Economy

I think you've re-invented Communism. The reason we don't implement it is that in practice it's much worse for everyone, including poor people.

rhollerith_dot_com on What are the primary drivers that caused selection pressure for intelligence in humans?

It's also important to consider the selection pressure keeping intelligence low, namely, the fact that most animals chronically have trouble getting enough calories, combined with the high caloric needs of neural tissue.

It is no coincidence that human intelligence didn't start rising much till humans were reliably getting meat in their diet and they started to routinely cook their food, which makes whatever calories are in food easier to digest, allowing the human gut to get smaller, which in turn reduced the caloric demands of the gut (which needs to be kept alive 24 hours a day even on a day when the person finds no food to eat).