LessWrong 2.0 Reader


Confusing the metric for the meaning: Perhaps correlated attributes are "natural"
NickyP (Nicky) · 2024-07-23T12:43:18.681Z · comments (3)

Proveably Safe Self Driving Cars [Modulo Assumptions]
Davidmanheim · 2024-09-15T13:58:19.472Z · comments (24)

How good are LLMs at doing ML on an unknown dataset?
Håvard Tveit Ihle (havard-tveit-ihle) · 2024-07-01T09:04:03.687Z · comments (4)

My disagreements with "AGI ruin: A List of Lethalities"
Noosphere89 (sharmake-farah) · 2024-09-15T17:22:18.367Z · comments (33)

[link] Romae Industriae
Maxwell Tabarrok (maxwell-tabarrok) · 2024-07-19T13:03:31.536Z · comments (2)

In Defense of Lawyers Playing Their Part
Isaac King (KingSupernova) · 2024-07-01T01:32:58.695Z · comments (9)

[link] A computational complexity argument for many worlds
jessicata (jessica.liu.taylor) · 2024-08-13T19:35:10.116Z · comments (15)

An Introduction to Representation Engineering - an activation-based paradigm for controlling LLMs
Jan Wehner · 2024-07-14T10:37:21.544Z · comments (4)

[link] End Single Family Zoning by Overturning Euclid V Ambler
Maxwell Tabarrok (maxwell-tabarrok) · 2024-07-26T14:08:45.046Z · comments (1)

[question] How unusual is the fact that there is no AI monopoly?
Viliam · 2024-08-16T20:21:51.012Z · answers+comments (15)

Apply to MATS 7.0!
Ryan Kidd (ryankidd44) · 2024-09-21T00:23:49.778Z · comments (0)

[question] Is cybercrime really costing trillions per year?
Fabien Roger (Fabien) · 2024-09-27T08:44:07.621Z · answers+comments (5)

Extracting SAE task features for in-context learning
Dmitrii Kharlapenko (dmitrii-kharlapenko) · 2024-08-12T20:34:13.747Z · comments (1)

[LDSL#6] When is quantification needed, and when is it hard?
tailcalled · 2024-08-13T20:39:45.481Z · comments (0)

Comparing Quantized Performance in Llama Models
NickyP (Nicky) · 2024-07-15T16:01:24.960Z · comments (2)

[LDSL#1] Performance optimization as a metaphor for life
tailcalled · 2024-08-08T16:16:27.349Z · comments (4)

Music in the AI World
Martin Sustrik (sustrik) · 2024-08-16T04:20:01.706Z · comments (8)

Book Review: What Even Is Gender?
Joey Marcellino · 2024-09-01T16:09:27.773Z · comments (14)

[link] The Cancer Resolution?
PeterMcCluskey · 2024-07-24T00:25:17.322Z · comments (24)

RLHF is the worst possible thing done when facing the alignment problem
tailcalled · 2024-09-19T18:56:27.676Z · comments (10)

Games for AI Control
charlie_griffin (cjgriffin) · 2024-07-11T18:40:50.607Z · comments (0)

A more systematic case for inner misalignment
Richard_Ngo (ricraz) · 2024-07-20T05:03:03.500Z · comments (4)

[link] Baking vs Patissing vs Cooking, the HPS explanation
adamShimi · 2024-07-17T20:29:09.645Z · comments (16)

Some comments on intelligence
Viliam · 2024-08-01T15:17:07.215Z · comments (5)

AIS terminology proposal: standardize terms for probability ranges
eggsyntax · 2024-08-30T15:43:39.857Z · comments (12)

Inference-Only Debate Experiments Using Math Problems
Arjun Panickssery (arjun-panickssery) · 2024-08-06T17:44:27.293Z · comments (0)

Investigating the Ability of LLMs to Recognize Their Own Writing
Christopher Ackerman (christopher-ackerman) · 2024-07-30T15:41:44.017Z · comments (0)

AI #74: GPT-4o Mini Me and Llama 3
Zvi · 2024-07-25T13:50:06.528Z · comments (6)

AI Constitutions are a tool to reduce societal scale risk
Sammy Martin (SDM) · 2024-07-25T11:18:17.826Z · comments (2)

Book Review: On the Edge: The Gamblers
Zvi · 2024-09-24T11:50:06.065Z · comments (1)

Fun With CellxGene
sarahconstantin · 2024-09-06T22:00:03.461Z · comments (2)

[link] Epistemic states as a potential benign prior
Tamsin Leake (carado-1) · 2024-08-31T18:26:14.093Z · comments (2)

Distinguish worst-case analysis from instrumental training-gaming
Olli Järviniemi (jarviniemi) · 2024-09-05T19:13:34.443Z · comments (0)

But Where do the Variables of my Causal Model come from?
Dalcy (Darcy) · 2024-08-09T22:07:57.395Z · comments (1)

Paper Summary: Princes and Merchants: European City Growth Before the Industrial Revolution
Jeffrey Heninger (jeffrey-heninger) · 2024-07-15T21:30:04.043Z · comments (1)

[link] AI Safety Memes Wiki
plex (ete) · 2024-07-24T18:53:04.977Z · comments (1)

[LDSL#4] Root cause analysis versus effect size estimation
tailcalled · 2024-08-11T16:12:14.604Z · comments (0)

[link] AI forecasting bots incoming
Dan H (dan-hendrycks) · 2024-09-09T19:14:31.050Z · comments (44)

[question] Where to find reliable reviews of AI products?
Elizabeth (pktechgirl) · 2024-09-17T23:48:25.899Z · answers+comments (4)

DIY RLHF: A simple implementation for hands on experience
Mike Vaiana (mike-vaiana) · 2024-07-10T12:07:03.047Z · comments (0)

[link] New blog: Expedition to the Far Lands
Connor Leahy (NPCollapse) · 2024-08-17T11:07:48.537Z · comments (3)

Reading More Each Day: A Simple $35 Tool
aysajan · 2024-07-24T13:54:04.290Z · comments (2)

[question] What Other Lines of Work are Safe from AI Automation?
RogerDearnaley (roger-d-1) · 2024-07-11T10:01:12.616Z · answers+comments (35)

Investigating Sensitive Directions in GPT-2: An Improved Baseline and Comparative Analysis of SAEs
Daniel Lee (daniel-lee) · 2024-09-06T02:28:41.954Z · comments (0)

[question] Me & My Clone
SimonBaars (simonbaars) · 2024-07-18T16:25:40.770Z · answers+comments (22)

The case for more Alignment Target Analysis (ATA)
Chi Nguyen · 2024-09-20T01:14:41.411Z · comments (13)

[link] Predicting Influenza Abundance in Wastewater Metagenomic Sequencing Data
jefftk (jkaufman) · 2024-09-23T17:25:58.380Z · comments (0)

[link] If-Then Commitments for AI Risk Reduction [by Holden Karnofsky]
habryka (habryka4) · 2024-09-13T19:38:53.194Z · comments (0)

[link] AI Safety at the Frontier: Paper Highlights, August '24
gasteigerjo · 2024-09-03T19:17:24.850Z · comments (0)

[Linkpost] Play with SAEs on Llama 3
Tom McGrath · 2024-09-25T22:35:44.824Z · comments (1)

Recent comments

sodium on Mira Murati leaves OpenAI/ OpenAI to remove non-profit control

Also from WSJ

abstractapplic on D&D.Sci: Whom Shall You Call? [Evaluation and Ruleset]

Thanks for a good one

I'm glad you feel that way about this scenario. I wish I did . . .

(For future reference, on the off-chance you haven't seen it: there's a compilation of all the past scenarios here, handily rated by quality and steamrollability.)

One thing that would perhaps make it easier is if the web interactive could tell you directly whether or not your selection was the optimal one, and possibly how much higher your expected price was than the optimal price (I first plugged mine in, then had to double-check against your table out here)

. . . huh. I feel conflicted about this on aesthetic grounds - like, Reality doesn't come with big flashing signs saying "EV-maxxing solution reached!" when you reach an EV-maxxing solution - but it does sound both convenient to have and easy to set up. Might try adding this functionality to the interactive for the next one; would be curious to hear what anyone else who happens to be reading this comment thinks.

Anyway, greetings, and looking forward to seeing the next one.

Good to have you on board!

abramdemski on Wei Dai's Shortform

yyyep

interstice on A Path out of Insufficient Views

Some beliefs can be worse or better at predicting what we observe, this is not the same thing as popularity.

anon-user on Superintelligence Can't Solve the Problem of Deciding What You'll Do

Ability to predict how the outcome depends on inputs + ability to compute the inverse of the prediction formula + ability to select certain inputs => ability to determine the output (within the limits of what influencing the inputs can accomplish). The rest is just an ontological difference over what language to use to describe this mechanism. I know that if I place a kettle on a gas stove and turn on the flame, I will get boiling water, and we colloquially describe this as boiling the water. I do not know all the intricacies of the processes inside the water, and I am not directly controlling individual heat exchange subprocesses inside the kettle, but it would be silly to argue that I am not controlling the outcome of the water getting boiled.
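As a rough illustration of the "predict + invert + choose inputs" recipe in the comment above, here is a minimal Python sketch of the kettle example. The linear heating model, the function names, and all numbers are hypothetical assumptions made for illustration; they are not taken from the comment.

```python
BOILING_C = 100.0  # assumed boiling point at sea level


def predict_temp(start_c: float, rate_c_per_s: float, seconds: float) -> float:
    """Forward model (assumed linear): predicted temperature, capped at boiling."""
    return min(BOILING_C, start_c + rate_c_per_s * seconds)


def seconds_to_reach(target_c: float, start_c: float, rate_c_per_s: float) -> float:
    """Inverse of the forward model: heating time needed to hit target_c."""
    if rate_c_per_s <= 0:
        raise ValueError("heating rate must be positive")
    if target_c > BOILING_C:
        # The limit of what influencing the input can accomplish.
        raise ValueError("target is above the boiling point")
    return max(0.0, (target_c - start_c) / rate_c_per_s)


# Select the input (heating time) that produces the desired output.
heating_time = seconds_to_reach(BOILING_C, start_c=20.0, rate_c_per_s=0.5)
print(heating_time, predict_temp(20.0, 0.5, heating_time))
```

The caller never models individual heat-exchange processes; it only needs a forward prediction it can invert and an input it can choose.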

anon-user on A Nonconstructive Existence Proof of Aligned Superintelligence

Perhaps you are missing the point of what I am saying here somewhat? The issue is not the scale of the side-effect of a computation, it's the fact that the side-effect exists, so any accurate mathematical abstraction of an actual real-world ASI must be prepared to deal with solving a self-referential equation.

anon-user on Four Levels of Voting Methods

I think it's important to further refine the accuracy criterion - I think another very important criterion (particularly given today's state of US politics) is how conducive the voting system is towards consensus-building vs polarization. In other words, not only pure accuracy matters, but the direction of the error as well. That is, an error towards a more extreme candidate is IMHO a lot more harmful than an equally sized error towards a more consensus candidate.

anthonyc on Is cybercrime really costing trillions per year?

In one sense, you're right, it is obviously correct. *Iff* you can actually do the calculation well, honestly, and convincingly, that is.

In practice, it's really hard to do that in a way that is consistent and principled. Most who try end up succumbing to various forms of motivated reasoning. And even when you do manage it, you have to make a lot of assumptions and extrapolations that get you really wide error bars, and a result that no one is going to believe unless they already want to believe your conclusion.

The other problem is that you can't assume the analysis still holds if any of those assumptions change. Two people, each with a credible proposal to reduce the risk and cost of cybercrime in that sense, can both make similar cost and benefit claims, but clearly the effects are not additive; your estimate defines a max, not a sum. This is always strictly the case, but if you use a narrower analysis then you can often treat them as approximately independent. If you want to make real-world decisions, you should include a sensitivity analysis as well.
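To make the "max, not a sum" point concrete, here is a small Python sketch. The total cost figure, the reduction fractions, and the assumption that each proposal removes a fixed fraction of whatever losses remain are all hypothetical, chosen only to illustrate why claimed savings cannot simply be added.

```python
from typing import Iterable


def combined_savings(total_cost: float, reductions: Iterable[float]) -> float:
    """Savings when each proposal removes a fraction of the *remaining* cost.

    Assumed model: the proposals act multiplicatively on what is left.
    """
    remaining = total_cost
    for fraction in reductions:
        remaining *= 1.0 - fraction
    return total_cost - remaining


total = 100.0           # hypothetical total cost estimate
claims = [0.6, 0.6]     # each proposal claims to remove 60% of the cost

naive_sum = sum(total * c for c in claims)   # adds the claims as if independent
realistic = combined_savings(total, claims)  # bounded above by the total

print(naive_sum, realistic)
```

Under this assumed model the summed claims exceed the total cost being reduced, while the combined savings stay below it, so the total estimate acts as a ceiling.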

sil-ver on [Intuitive self-models] 2. Conscious Awareness

Were you using this demo?

I’m skeptical of the hypothesis that the color phi phenomenon is just BS. It doesn’t seem like that kind of psych result. I think it’s more likely that this applet is terribly designed.

Yes -- and yeah, fair enough. Although-

I think I got some motion illusion?

-remember that the question isn't "did I get the vibe that something moves". We already know that a series of frames gives the vibe that something moves. The question is whether you remember having seen the red circle halfway across before seeing the blue circle.

zack_m_davis on The Sun is big, but superintelligences will not spare Earth a little sunlight

The claim is pretty clearly intended to be about relative material, not absolute number of pawns: in the end position of the second game, you have three pawns left and Stockfish has two; we usually don't describe this as Stockfish having given up six pawns. (But I agree that it's easier to obtain resources from an adversary that values them differently, like if Stockfish is trying to win and you're trying to capture pawns.)