LessWrong 2.0 Reader

View: New · Old · Top

Restrict date range: Today · This week · This month · Last three months · This year · All time

← previous page (newer posts) · next page (older posts) →

Confusing the metric for the meaning: Perhaps correlated attributes are "natural"
NickyP (Nicky) · 2024-07-23T12:43:18.681Z · comments (3)

Mech Interp Lacks Good Paradigms
Daniel Tan (dtch1997) · 2024-07-16T15:47:32.171Z · comments (0)

[link] Twitter thread on open-source AI
Richard_Ngo (ricraz) · 2024-07-31T00:26:11.655Z · comments (6)

[link] patent process problems
bhauth · 2024-07-14T21:12:04.953Z · comments (13)

Musings on LLM Scale (Jul 2024)
Vladimir_Nesov · 2024-07-03T18:35:48.373Z · comments (0)

[question] How unusual is the fact that there is no AI monopoly?
Viliam · 2024-08-16T20:21:51.012Z · answers+comments (15)

In Defense of Lawyers Playing Their Part
Isaac King (KingSupernova) · 2024-07-01T01:32:58.695Z · comments (9)

[link] Romae Industriae
Maxwell Tabarrok (maxwell-tabarrok) · 2024-07-19T13:03:31.536Z · comments (2)

[link] End Single Family Zoning by Overturning Euclid V Ambler
Maxwell Tabarrok (maxwell-tabarrok) · 2024-07-26T14:08:45.046Z · comments (1)

[link] A computational complexity argument for many worlds
jessicata (jessica.liu.taylor) · 2024-08-13T19:35:10.116Z · comments (15)

An Introduction to Representation Engineering - an activation-based paradigm for controlling LLMs
Jan Wehner · 2024-07-14T10:37:21.544Z · comments (4)

[LDSL#1] Performance optimization as a metaphor for life
tailcalled · 2024-08-08T16:16:27.349Z · comments (4)

Games for AI Control
charlie_griffin (cjgriffin) · 2024-07-11T18:40:50.607Z · comments (0)

Extracting SAE task features for in-context learning
Dmitrii Kharlapenko (dmitrii-kharlapenko) · 2024-08-12T20:34:13.747Z · comments (1)

[link] The Cancer Resolution?
PeterMcCluskey · 2024-07-24T00:25:17.322Z · comments (24)

Book Review: What Even Is Gender?
Joey Marcellino · 2024-09-01T16:09:27.773Z · comments (14)

[LDSL#6] When is quantification needed, and when is it hard?
tailcalled · 2024-08-13T20:39:45.481Z · comments (0)

Comparing Quantized Performance in Llama Models
NickyP (Nicky) · 2024-07-15T16:01:24.960Z · comments (2)

Music in the AI World
Martin Sustrik (sustrik) · 2024-08-16T04:20:01.706Z · comments (8)

Some comments on intelligence
Viliam · 2024-08-01T15:17:07.215Z · comments (5)

Fun With CellxGene
sarahconstantin · 2024-09-06T22:00:03.461Z · comments (2)

AI #74: GPT-4o Mini Me and Llama 3
Zvi · 2024-07-25T13:50:06.528Z · comments (6)

Inference-Only Debate Experiments Using Math Problems
Arjun Panickssery (arjun-panickssery) · 2024-08-06T17:44:27.293Z · comments (0)

A more systematic case for inner misalignment
Richard_Ngo (ricraz) · 2024-07-20T05:03:03.500Z · comments (4)

[link] Baking vs Patissing vs Cooking, the HPS explanation
adamShimi · 2024-07-17T20:29:09.645Z · comments (16)

Investigating the Ability of LLMs to Recognize Their Own Writing
Christopher Ackerman (christopher-ackerman) · 2024-07-30T15:41:44.017Z · comments (0)

[link] Epistemic states as a potential benign prior
Tamsin Leake (carado-1) · 2024-08-31T18:26:14.093Z · comments (2)

But Where do the Variables of my Causal Model come from?
Dalcy (Darcy) · 2024-08-09T22:07:57.395Z · comments (1)

[link] AI Safety Memes Wiki
plex (ete) · 2024-07-24T18:53:04.977Z · comments (1)

[question] Where to find reliable reviews of AI products?
Elizabeth (pktechgirl) · 2024-09-17T23:48:25.899Z · answers+comments (4)

Proveably Safe Self Driving Cars
Davidmanheim · 2024-09-15T13:58:19.472Z · comments (20)

[LDSL#4] Root cause analysis versus effect size estimation
tailcalled · 2024-08-11T16:12:14.604Z · comments (0)

Paper Summary: Princes and Merchants: European City Growth Before the Industrial Revolution
Jeffrey Heninger (jeffrey-heninger) · 2024-07-15T21:30:04.043Z · comments (1)

AIS terminology proposal: standardize terms for probability ranges
eggsyntax · 2024-08-30T15:43:39.857Z · comments (12)

Representation Tuning
Christopher Ackerman (christopher-ackerman) · 2024-06-27T17:44:33.338Z · comments (4)

Childhood and Education Roundup #6: College Edition
Zvi · 2024-06-26T11:40:03.990Z · comments (8)

[question] What Other Lines of Work are Safe from AI Automation?
RogerDearnaley (roger-d-1) · 2024-07-11T10:01:12.616Z · answers+comments (35)

DIY RLHF: A simple implementation for hands on experience
Mike Vaiana (mike-vaiana) · 2024-07-10T12:07:03.047Z · comments (0)

RLHF is the worst possible thing done when facing the alignment problem
tailcalled · 2024-09-19T18:56:27.676Z · comments (6)

[link] New blog: Expedition to the Far Lands
Connor Leahy (NPCollapse) · 2024-08-17T11:07:48.537Z · comments (3)

Reading More Each Day: A Simple $35 Tool
aysajan · 2024-07-24T13:54:04.290Z · comments (2)

[link] AI forecasting bots incoming
Dan H (dan-hendrycks) · 2024-09-09T19:14:31.050Z · comments (44)

Monthly Roundup #19: June 2024
Zvi · 2024-06-25T12:00:03.333Z · comments (9)

AI Constitutions are a tool to reduce societal scale risk
Sammy Martin (SDM) · 2024-07-25T11:18:17.826Z · comments (2)

[link] Video Intro to Guaranteed Safe AI
Mike Vaiana (mike-vaiana) · 2024-07-11T17:53:47.630Z · comments (0)

Cheap Whiteboards!
Johannes C. Mayer (johannes-c-mayer) · 2024-08-08T13:52:59.627Z · comments (2)

[link] If-Then Commitments for AI Risk Reduction [by Holden Karnofsky]
habryka (habryka4) · 2024-09-13T19:38:53.194Z · comments (0)

Response to Dileep George: AGI safety warrants planning ahead
Steven Byrnes (steve2152) · 2024-07-08T15:27:07.402Z · comments (7)

Investigating Sensitive Directions in GPT-2: An Improved Baseline and Comparative Analysis of SAEs
Daniel Lee (daniel-lee) · 2024-09-06T02:28:41.954Z · comments (0)

[question] Me & My Clone
SimonBaars (simonbaars) · 2024-07-18T16:25:40.770Z · answers+comments (22)

← previous page (newer posts) · next page (older posts) →

Archive

Recent comments

steve2152 on [Intuitive self-models] 1. Preliminaries

Thanks for the kind words!

The thing you quoted was supposed to be very silly and self-deprecating, but I wrote it very poorly, and it actually wound up sounding kinda judgmental. Oops, sorry. I just rewrote it. I agree with everything you wrote in this comment.

mark-xu on My AI Model Delta Compared To Christiano

I don’t think Paul thinks verification is generally easy or that delegation is fundamentally viable. He, for example, doesn’t suck at hiring because he thinks it’s in fact a hard problem to verify if someone is good at their job.

I liked Rohins comment elsewhere on this general thread.

I’m happy to answer more specific questions, although provide would generally feel more comfortable answering questions about my views then about Paul’s.

linda-linsefors on [Intuitive self-models] 1. Preliminaries

I tried it and it works for me too.

For me the dancer was spinning contraclockwise and would not change. With your screwing trick I could change rotation, and where now stably stuck in the clockwise direction. Until I screwed in the other direction. I've now done this back and forth a few times.

paradiddle on [Intuitive self-models] 1. Preliminaries

Section 1.6 is another appendix about how this series relates to Philosophy Of Mind. My opinion of Philosophy Of Mind is: I’m against it! Or rather, I’ll say plenty in this series that would be highly relevant to understanding the true nature of consciousness, free will, and so on, but the series itself is firmly restricted in scope to questions that can be resolved within the physical universe (including physics, neuroscience, algorithms, and so on). I’ll leave the philosophy to the philosophers.

At the risk of outing myself as a thin-skinned philosopher, I want to push back on this a bit. If we are taking "philosophy of mind" to mean, "the kind of work philosophers of mind do" (which I think we should), then your comment seems misplaced. Crucially, one need not be defending particular views on "big questions" about the true nature of consciousness, free will, and so on to be doing philosophy of mind. Rather, much of the work philosophers of mind do is continuous with scientific inquiry. Indeed, I would say some philosophy of mind is close to indistinguishable from what you do in this post! For example, lots of this work involves trying to carve up conceptual space in a way that coheres with empirical findings, suggests avenues for further research, and renders fruitful discussion easier. Your section 1.3 in this post features exactly the kind of conceptual work that is the bread-and-butter of philosophy. So, far from leaving philosophy to the philosophers, I actually think your work would fit comfortably into the more empirically informed end of contemporary philosophy of mind. To end on a positive note, I think it's really clearly written, fascinating, and fun to read. So thanks!

bokov-1 on My simple AGI investment & insurance strategy

I'm trying out this strategy on Investopedia's simulator (https://www.investopedia.com/simulator/trade/options)

The January 15 2027 call options on QQQ look like this as of posting (current price 481.48):

Strike	Black-Scholes	Ask
485	64.244	77.4
500	57.796	69.83
...	...	...
675	14.308	14
680	13.693	13.5
685	13.077	12.49
...	...	...
700	11.446	10.5
...	...	...
720	9.702	8.5

So, if you were following this strategy and buying today, would you buy 485 because it has the lowest OOM strike price? Would you buy 675 because it's the lowest strike price where the ask is lower than the theoretical Black-Sholes fair price? Would you go for 720 because it's the cheapest available? Would you look for the out-of-money option with the largest difference between Black-Sholes and the ask?

What would be your thought process? I'm definitely hoping to hear from @lc but am interested in hearing from anybody who found this line of reasoning worth investigating and has opinions about it.

dagon on What you know when you know nothing

I think this is mixing up colloquial "know nothing" and literal "know nothing". It's impossible to identify a thing about which one knows nothing, as that identification is something about the thing. It can be wrong, and it can be very imprecise, but it's not nothing.

50/50 are the odds of A when we know nothing about A.

No. 50/50 is a reasonable universal prior, but that's both very theoretical and deeply unclear how to categorize quantum waveforms into things over which a probability is even applicable. In most real cases, 50/50 are the odds to start with when all you know is that it's common enough to come to your attention, and that it "feels" balanced whether or not it'll happen.

In other words, "undefined and inapplicable" is the probability for things you know nothing about. Almost all things you can apply probability to, you know SOMETHING about.

You add another layer of mixing literal and figurative "don't know anything" to the term "singularity". Also, don't forget to multiply by the probability that a singularity-on-relevant-factors might not have happened for the thing you're predicting.

dacyn on Pronouns are Annoying

No, because John could be speaking about himself administering the medication.

If it's about John administering the medication then you'd have to say "... he refused to let him".

It’s also possible to refuse to do something you’ve already acknowledged you should do, so the 3rd he could still be John regardless of who is being told what.

But the sentence did not claim John merely acknowledged that he should administer the medication, it claimed John was the originator of that statement. Is John supposed to be refusing his own requests?

cubefox on What you know when you know nothing

If we know nothing about them, the statements could equally be true or false, and positively or negatively dependent. The same argument which makes us assume 50% probability to individual statements would also make us assume independence between statements. The possibilities cancel out, so to speak.

jessica-liu-taylor on The Obliqueness Thesis

Computationally tractable is Yudkowsky's framing and might be too limited. The kind of thing I believe is for example, an animal without a certain brain complexity will tend not to be a social animal and is therefore unlikely to have the sort of values social animals have. And animals that can't do math aren't going to value mathematical aesthetics the way human mathematicians do.

jessica-liu-taylor on The Obliqueness Thesis

Relativity to Newtonian mechanics is a warp in a straightforward sense. If you believe the layout of a house consists of some rooms connected in a certain way, but there are actually more rooms connected in different ways, getting the maps to line up looks like a warp. Basically, the closer the mapping is to a true homomorphism (in the universal algebra sense), the less warping there is, otherwise there are deviations intuitively analogous to space warps.