LessWrong 2.0 Reader

Muddling Along Is More Likely Than Dystopia
Jeffrey Heninger (jeffrey-heninger) · 2023-10-20T21:25:15.459Z · comments (10)

Towards Multimodal Interpretability: Learning Sparse Interpretable Features in Vision Transformers
hugofry · 2024-04-29T20:57:35.127Z · comments (8)

Saying the quiet part out loud: trading off x-risk for personal immortality
disturbance · 2023-11-02T17:43:34.155Z · comments (89)

The Good Life in the face of the apocalypse
Elizabeth (pktechgirl) · 2023-10-16T22:40:15.200Z · comments (8)

[link] Essay competition on the Automation of Wisdom and Philosophy — $25k in prizes
owencb · 2024-04-16T10:10:13.338Z · comments (11)

We might be missing some key feature of AI takeoff; it'll probably seem like "we could've seen this coming"
Lukas_Gloor · 2024-05-09T15:43:11.490Z · comments (36)

Release: Optimal Weave (P1): A Prototype Cohabitive Game
mako yass (MakoYass) · 2024-08-17T14:08:18.947Z · comments (19)

[Paper] All's Fair In Love And Love: Copy Suppression in GPT-2 Small
CallumMcDougall (TheMcDouglas) · 2023-10-13T18:32:02.376Z · comments (4)

Coup probes: Catching catastrophes with probes trained off-policy
Fabien Roger (Fabien) · 2023-11-17T17:58:28.687Z · comments (7)

[link] Palworld development blog post
bhauth · 2024-01-28T05:56:19.984Z · comments (12)

Why you should be using a retinoid
GeneSmith · 2024-08-19T03:07:41.722Z · comments (53)

My Criticism of Singular Learning Theory
Joar Skalse (Logical_Lunatic) · 2023-11-19T15:19:16.874Z · comments (56)

Refusal mechanisms: initial experiments with Llama-2-7b-chat
Andy Arditi (andy-arditi) · 2023-12-08T17:08:01.250Z · comments (7)

[link] New voluntary commitments (AI Seoul Summit)
Zach Stein-Perlman · 2024-05-21T11:00:41.794Z · comments (17)

Bostrom Goes Unheard
Zvi · 2023-11-13T14:11:07.586Z · comments (9)

GPT-o1
Zvi · 2024-09-16T13:40:06.236Z · comments (32)

An Introduction To The Mandelbrot Set That Doesn't Mention Complex Numbers
Yitz (yitz) · 2024-01-17T09:48:07.930Z · comments (11)

[link] "The Heart of Gaming is the Power Fantasy", and Cohabitive Games
Raemon · 2023-10-08T21:02:33.526Z · comments (49)

AISafety.com – Resources for AI Safety
Søren Elverlin (soren-elverlin-1) · 2024-05-17T15:57:11.712Z · comments (3)

Decomposing the QK circuit with Bilinear Sparse Dictionary Learning
keith_wynroe · 2024-07-02T13:17:16.352Z · comments (7)

Constructability: Plainly-coded AGIs may be feasible in the near future
Épiphanie Gédéon (joy_void_joy) · 2024-04-27T16:04:45.894Z · comments (13)

Announcing Athena - Women in AI Alignment Research
Claire Short (claire-short) · 2023-11-07T21:46:41.741Z · comments (2)

Self-Referential Probabilistic Logic Admits the Payor's Lemma
Yudhister Kumar (randomwalks) · 2023-11-28T10:27:29.029Z · comments (14)

Survey of 2,778 AI authors: six parts in pictures
KatjaGrace · 2024-01-06T04:43:34.590Z · comments (1)

3C's: A Recipe For Mathing Concepts
johnswentworth · 2024-07-03T01:06:11.944Z · comments (5)

Fluent, Cruxy Predictions
Raemon · 2024-07-10T18:00:06.424Z · comments (10)

OpenAI o1, Llama 4, and AlphaZero of LLMs
Vladimir_Nesov · 2024-09-14T21:27:41.241Z · comments (23)

Studying The Alien Mind
Quentin FEUILLADE--MONTIXI (quentin-feuillade-montixi) · 2023-12-05T17:27:28.049Z · comments (10)

The Gemini Incident
Zvi · 2024-02-22T21:00:04.594Z · comments (19)

Thomas Kwa's research journal
Thomas Kwa (thomas-kwa) · 2023-11-23T05:11:08.907Z · comments (1)

[link] The Shutdown Problem: An AI Engineering Puzzle for Decision Theorists
EJT (ElliottThornley) · 2023-10-23T21:00:48.398Z · comments (22)

[question] How have you become more hard-working?
Chi Nguyen · 2023-09-25T12:37:39.860Z · answers+comments (40)

LessWrong Community Weekend 2024, open for applications
UnplannedCauliflower · 2024-05-01T10:18:21.992Z · comments (2)

Quick look: applications of chaos theory
Elizabeth (pktechgirl) · 2024-08-18T15:00:07.853Z · comments (45)

[link] My thesis (Algorithmic Bayesian Epistemology) explained in more depth
Eric Neyman (UnexpectedValues) · 2024-05-09T19:43:16.543Z · comments (4)

[link] MIRI's May 2024 Newsletter
Harlan · 2024-05-15T00:13:30.153Z · comments (1)

New report: "Scheming AIs: Will AIs fake alignment during training in order to get power?"
Joe Carlsmith (joekc) · 2023-11-15T17:16:42.088Z · comments (26)

Memory bandwidth constraints imply economies of scale in AI inference
Ege Erdil (ege-erdil) · 2023-09-17T14:01:34.701Z · comments (34)

Spaciousness In Partner Dance: A Naturalism Demo
LoganStrohl (BrienneYudkowsky) · 2023-11-19T07:00:19.555Z · comments (5)

Corrigibility = Tool-ness?
johnswentworth · 2024-06-28T01:19:48.883Z · comments (8)

A couple productivity tips for overthinkers
Steven Byrnes (steve2152) · 2024-04-20T16:05:50.332Z · comments (13)

[link] Is "superhuman" AI forecasting BS? Some experiments on the "539" bot from the Centre for AI Safety
titotal (lombertini) · 2024-09-18T13:07:40.754Z · comments (2)

EU policymakers reach an agreement on the AI Act
tlevin (trevor) · 2023-12-15T06:02:44.668Z · comments (7)

[Valence series] 2. Valence & Normativity
Steven Byrnes (steve2152) · 2023-12-07T16:43:49.919Z · comments (5)

Some Vacation Photos
johnswentworth · 2024-01-04T17:15:01.187Z · comments (0)

I'm a Former Israeli Officer. AMA
Yovel Rom · 2023-10-10T08:33:51.557Z · comments (70)

Send us example gnarly bugs
Beth Barnes (beth-barnes) · 2023-12-10T05:23:00.773Z · comments (10)

Secondary forces of debt
KatjaGrace · 2024-06-27T21:10:06.131Z · comments (18)

ACX Covid Origins Post convinced readers
ErnestScribbler · 2024-05-01T13:06:20.818Z · comments (7)

[link] [Linkpost] Practically-A-Book Review: Rootclaim $100,000 Lab Leak Debate
trevor (TrevorWiesinger) · 2024-03-28T16:03:36.452Z · comments (22)

Recent comments

rhollerith_dot_com on My simple AGI investment & insurance strategy

Our situation is analogous to someone who has been diagnosed with cancer and told he has a low probability of survival, but at least there's a nifty investment opportunity he can buy that pays off big if he does survive.

steve2152 on [Intuitive self-models] 1. Preliminaries

Thanks for the kind words!

The thing you quoted was supposed to be very silly and self-deprecating, but I wrote it very poorly, and it actually wound up sounding kinda judgmental. Oops, sorry. I just rewrote it. I agree with everything you wrote in this comment.

mark-xu on My AI Model Delta Compared To Christiano

I don’t think Paul thinks verification is generally easy or that delegation is fundamentally viable. He, for example, doesn’t suck at hiring because he thinks it’s in fact a hard problem to verify if someone is good at their job.

I liked Rohin's comment elsewhere on this general thread.

I’m happy to answer more specific questions, although I would generally feel more comfortable answering questions about my views than about Paul’s.

linda-linsefors on [Intuitive self-models] 1. Preliminaries

I tried it and it works for me too.

For me the dancer was spinning counterclockwise and would not change. With your screwing trick I could change the rotation, and was then stably stuck in the clockwise direction, until I screwed in the other direction. I've now done this back and forth a few times.

paradiddle on [Intuitive self-models] 1. Preliminaries

Section 1.6 is another appendix about how this series relates to Philosophy Of Mind. My opinion of Philosophy Of Mind is: I’m against it! Or rather, I’ll say plenty in this series that would be highly relevant to understanding the true nature of consciousness, free will, and so on, but the series itself is firmly restricted in scope to questions that can be resolved within the physical universe (including physics, neuroscience, algorithms, and so on). I’ll leave the philosophy to the philosophers.

At the risk of outing myself as a thin-skinned philosopher, I want to push back on this a bit. If we are taking "philosophy of mind" to mean, "the kind of work philosophers of mind do" (which I think we should), then your comment seems misplaced. Crucially, one need not be defending particular views on "big questions" about the true nature of consciousness, free will, and so on to be doing philosophy of mind. Rather, much of the work philosophers of mind do is continuous with scientific inquiry. Indeed, I would say some philosophy of mind is close to indistinguishable from what you do in this post! For example, lots of this work involves trying to carve up conceptual space in a way that coheres with empirical findings, suggests avenues for further research, and renders fruitful discussion easier. Your section 1.3 in this post features exactly the kind of conceptual work that is the bread-and-butter of philosophy. So, far from leaving philosophy to the philosophers, I actually think your work would fit comfortably into the more empirically informed end of contemporary philosophy of mind. To end on a positive note, I think it's really clearly written, fascinating, and fun to read. So thanks!

bokov-1 on My simple AGI investment & insurance strategy

I'm trying out this strategy on Investopedia's simulator (https://www.investopedia.com/simulator/trade/options)

The January 15 2027 call options on QQQ look like this as of posting (current price 481.48):

Strike	Black-Scholes	Ask
485	64.244	77.4
500	57.796	69.83
...	...	...
675	14.308	14
680	13.693	13.5
685	13.077	12.49
...	...	...
700	11.446	10.5
...	...	...
720	9.702	8.5

So, if you were following this strategy and buying today, would you buy 485 because it has the lowest out-of-the-money strike price? Would you buy 675 because it's the lowest strike price where the ask is lower than the theoretical Black-Scholes fair price? Would you go for 720 because it's the cheapest available? Would you look for the out-of-the-money option with the largest difference between Black-Scholes and the ask?

What would be your thought process? I'm definitely hoping to hear from @lc but am interested in hearing from anybody who found this line of reasoning worth investigating and has opinions about it.
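The four selection rules asked about above can be compared directly on the quoted rows. The following is a minimal sketch, assuming only the rows visible in the table (the elided "..." rows are left out, so results apply to the listed strikes only); the names `quotes`, `discount`, and the result variables are hypothetical and chosen for illustration.

```python
# Rows copied from the table above: (strike, Black-Scholes value, ask).
# Rows elided with "..." in the table are not included here.
quotes = [
    (485, 64.244, 77.4),
    (500, 57.796, 69.83),
    (675, 14.308, 14.0),
    (680, 13.693, 13.5),
    (685, 13.077, 12.49),
    (700, 11.446, 10.5),
    (720, 9.702, 8.5),
]


def discount(row):
    """Black-Scholes value minus ask; positive means the ask is below the model price."""
    _strike, model, ask = row
    return model - ask


# Rule 1: lowest listed strike (all listed strikes are above the 481.48 spot price).
lowest_strike = min(quotes, key=lambda r: r[0])

# Rule 2: lowest strike whose ask is below the Black-Scholes value.
below_model = [r for r in quotes if discount(r) > 0]
lowest_strike_below_model = min(below_model, key=lambda r: r[0])

# Rule 3: cheapest ask among the listed rows.
cheapest = min(quotes, key=lambda r: r[2])

# Rule 4: largest gap between the Black-Scholes value and the ask.
largest_discount = max(quotes, key=discount)

print(lowest_strike[0])              # 485
print(lowest_strike_below_model[0])  # 675
print(cheapest[0])                   # 720
print(largest_discount[0])           # 720
```

On these rows, the "cheapest" rule and the "largest gap" rule happen to pick the same strike, so the choice is really among three candidates. The sketch only ranks the quoted numbers; it does not say which rule the strategy's author would endorse.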

dagon on What you know when you know nothing

I think this is mixing up colloquial "know nothing" and literal "know nothing". It's impossible to identify a thing about which one knows nothing, as that identification is something about the thing. It can be wrong, and it can be very imprecise, but it's not nothing.

50/50 are the odds of A when we know nothing about A.

No. 50/50 is a reasonable universal prior, but that's very theoretical, and it's deeply unclear how to categorize quantum waveforms into things to which a probability even applies. In most real cases, 50/50 are the odds to start with when all you know is that it's common enough to come to your attention, and that it "feels" balanced whether or not it'll happen.

In other words, "undefined and inapplicable" is the probability for things you know nothing about. Almost all things you can apply probability to, you know SOMETHING about.

You add another layer of mixing literal and figurative "don't know anything" to the term "singularity". Also, don't forget to multiply by the probability that a singularity-on-relevant-factors might not have happened for the thing you're predicting.

dacyn on Pronouns are Annoying

No, because John could be speaking about himself administering the medication.

If it's about John administering the medication then you'd have to say "... he refused to let him".

It’s also possible to refuse to do something you’ve already acknowledged you should do, so the 3rd he could still be John regardless of who is being told what.

But the sentence did not claim John merely acknowledged that he should administer the medication; it claimed John was the originator of that statement. Is John supposed to be refusing his own requests?

cubefox on What you know when you know nothing

If we know nothing about them, the statements could equally be true or false, and positively or negatively dependent. The same argument which makes us assume 50% probability to individual statements would also make us assume independence between statements. The possibilities cancel out, so to speak.
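One way to make the "possibilities cancel out" argument concrete is to spread equal weight over every truth assignment of two statements and read off the resulting probabilities. This is a minimal sketch under that assumption (a uniform weighting over the four assignments, which is one reading of "we know nothing", not something the comment states outright); the names `worlds`, `weight`, and `prob` are hypothetical.

```python
from fractions import Fraction
from itertools import product

# Every truth assignment for two statements (A, B), weighted equally.
# Assumption: "knowing nothing" is modeled as a uniform weighting.
worlds = list(product([True, False], repeat=2))
weight = Fraction(1, len(worlds))


def prob(event):
    """Total weight of the assignments in which `event` holds."""
    return sum(weight for w in worlds if event(w))


p_a = prob(lambda w: w[0])
p_b = prob(lambda w: w[1])
p_a_and_b = prob(lambda w: w[0] and w[1])

print(p_a, p_b, p_a_and_b)     # 1/2 1/2 1/4
print(p_a_and_b == p_a * p_b)  # True
```

Under this weighting each statement gets probability 1/2, and the joint probability equals the product of the individual ones, which is what independence means.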

jessica-liu-taylor on The Obliqueness Thesis

"Computationally tractable" is Yudkowsky's framing and might be too limited. The kind of thing I believe is, for example, that an animal without a certain brain complexity will tend not to be a social animal and is therefore unlikely to have the sort of values social animals have. And animals that can't do math aren't going to value mathematical aesthetics the way human mathematicians do.