LessWrong 2.0 Reader

(The) Lightcone is nothing without its people: LW + Lighthaven's big fundraiser
habryka (habryka4) · 2024-11-30T02:55:16.077Z · comments (211)

OpenAI Email Archives (from Musk v. Altman and OpenAI blog)
habryka (habryka4) · 2024-11-16T06:38:03.937Z · comments (80)

Alignment Faking in Large Language Models
ryan_greenblatt · 2024-12-18T17:19:06.665Z · comments (53)

The hostile telepaths problem
Valentine · 2024-10-27T15:26:53.610Z · comments (84)

[link] Survival without dignity
L Rudolf L (LRudL) · 2024-11-04T02:29:38.758Z · comments (28)

[link] I got dysentery so you don’t have to
eukaryote · 2024-10-22T04:55:58.422Z · comments (4)

[link] Biological risk from the mirror world
jasoncrawford · 2024-12-12T19:07:06.305Z · comments (29)

The Online Sports Gambling Experiment Has Failed
Zvi · 2024-11-11T14:30:04.371Z · comments (28)

You are not too "irrational" to know your preferences.
DaystarEld · 2024-11-26T15:01:42.996Z · comments (50)

Ayn Rand’s model of “living money”; and an upside of burnout
AnnaSalamon · 2024-11-16T02:59:07.368Z · comments (58)

[link] What TMS is like
Sable · 2024-10-31T00:44:22.612Z · comments (23)

Making a conservative case for alignment
Cameron Berg (cameron-berg) · 2024-11-15T18:55:40.864Z · comments (68)

Frontier Models are Capable of In-context Scheming
Marius Hobbhahn (marius-hobbhahn) · 2024-12-05T22:11:17.320Z · comments (24)

[link] Understanding Shapley Values with Venn Diagrams
Carson L · 2024-12-06T21:56:43.960Z · comments (32)

[link] The Compendium, A full argument about extinction risk from AGI
adamShimi · 2024-10-31T12:01:51.714Z · comments (52)

Communications in Hard Mode (My new job at MIRI)
tanagrabeast · 2024-12-13T20:13:44.825Z · comments (23)

What Goes Without Saying
sarahconstantin · 2024-12-20T18:00:06.363Z · comments (8)

Information vs Assurance
johnswentworth · 2024-10-20T23:16:25.762Z · comments (17)

Orienting to 3 year AGI timelines
Nikola Jurkovic (nikolaisalreadytaken) · 2024-12-22T01:15:11.401Z · comments (17)

[link] Overcoming Bias Anthology
Arjun Panickssery (arjun-panickssery) · 2024-10-20T02:01:23.463Z · comments (14)

The Median Researcher Problem
johnswentworth · 2024-11-02T20:16:11.341Z · comments (69)

o1 is a bad idea
abramdemski · 2024-11-11T21:20:24.892Z · comments (38)

The Summoned Heroine's Prediction Markets Keep Providing Financial Services To The Demon King!
abstractapplic · 2024-10-26T12:34:51.059Z · comments (16)

Neutrality
sarahconstantin · 2024-11-13T23:10:05.469Z · comments (27)

Current safety training techniques do not fully transfer to the agent setting
Simon Lermen (dalasnoin) · 2024-11-03T19:24:51.537Z · comments (8)

[link] Gradient Routing: Masking Gradients to Localize Computation in Neural Networks
cloud · 2024-12-06T22:19:26.717Z · comments (11)

"It's a 10% chance which I did 10 times, so it should be 100%"
egor.timatkov · 2024-11-18T01:14:27.738Z · comments (57)

A Rocket–Interpretability Analogy
plex (ete) · 2024-10-21T13:55:18.184Z · comments (31)

o3
Zach Stein-Perlman · 2024-12-20T18:30:29.448Z · comments (140)

“Alignment Faking” frame is somewhat fake
Jan_Kulveit · 2024-12-20T09:51:04.664Z · comments (13)

[link] Arithmetic is an underrated world-modeling technology
dynomight · 2024-10-17T14:00:22.475Z · comments (32)

Repeal the Jones Act of 1920
Zvi · 2024-11-27T15:00:06.801Z · comments (23)

[link] o1: A Technical Primer
Jesse Hoogland (jhoogland) · 2024-12-09T19:09:12.413Z · comments (17)

[link] China Hawks are Manufacturing an AI Arms Race
garrison · 2024-11-20T18:17:51.958Z · comments (42)

Subskills of "Listening to Wisdom"
Raemon · 2024-12-09T03:01:18.706Z · comments (16)

[link] When Is Insurance Worth It?
kqr · 2024-12-19T19:07:32.573Z · comments (28)

[question] Which things were you surprised to learn are not metaphors?
Eric Neyman (UnexpectedValues) · 2024-11-21T18:56:18.025Z · answers+comments (79)

[link] OpenAI's CBRN tests seem unclear
LucaRighetti (Error404Dinosaur) · 2024-11-21T17:28:30.290Z · comments (6)

"The Solomonoff Prior is Malign" is a special case of a simpler argument
David Matolcsi (matolcsid) · 2024-11-17T21:32:34.711Z · comments (44)

BIG-Bench Canary Contamination in GPT-4
Jozdien · 2024-10-22T15:40:48.166Z · comments (13)

Why Don't We Just... Shoggoth+Face+Paraphraser?
Daniel Kokotajlo (daniel-kokotajlo) · 2024-11-19T20:53:52.084Z · comments (51)

A bird's eye view of ARC's research
Jacob_Hilton · 2024-10-23T15:50:06.123Z · comments (12)

[link] The Dangers of Mirrored Life
Niko_McCarty (niko-2) · 2024-12-12T20:58:32.750Z · comments (7)

[link] Miles Brundage resigned from OpenAI, and his AGI readiness team was disbanded
garrison · 2024-10-23T23:40:57.180Z · comments (1)

Passages I Highlighted in The Letters of J.R.R.Tolkien
Ivan Vendrov (ivan-vendrov) · 2024-11-25T01:47:59.071Z · comments (10)

The Dream Machine
sarahconstantin · 2024-12-05T00:00:05.796Z · comments (6)

The o1 System Card Is Not About o1
Zvi · 2024-12-13T20:30:08.048Z · comments (5)

Scissors Statements for President?
AnnaSalamon · 2024-11-06T10:38:21.230Z · comments (31)

Should CA, TX, OK, and LA merge into a giant swing state, just for elections?
Thomas Kwa (thomas-kwa) · 2024-11-06T23:01:48.992Z · comments (35)

You should consider applying to PhDs (soon!)
bilalchughtai (beelal) · 2024-11-29T20:33:12.462Z · comments (19)

Recent comments

cronodas on Hire (or become) a Thinking Assistant / Body Double

I live in New Jersey and have no job and lots of free time. How can I do this for someone without moving to the Bay Area?

nathan-helm-burger on What are the strongest arguments for very short timelines?

Hmm, yes. I agree that there's something about self-guiding/self-correcting on complex, lengthy, open-ended tasks where current AIs seem to be at near-zero performance.

I do expect this to improve dramatically in the next 12 months. I think this current lack is more about limitations in the training regimes so far, rather than limitations in algorithms/architectures.

Contrast this with the challengingness of ARC-AGI, which seems like maybe an architecture weakness?

romeostevensit on Hire (or become) a Thinking Assistant / Body Double

no

xpostah on What are the strongest arguments for very short timelines?

I think it depends on some factors actually.

For instance if we don’t get AGI by 2030 but lots of people still believe it could happen by 2040, we as a species might be better equipped to form good beliefs on it, figure out who to defer to, etc.

I already think this has happened btw. AI beliefs in 2024 are more sane on average than beliefs in say 2010 IMO.

P.S. I’m not talking about what you personally should do with your time and energy, maybe there’s other projects that appeal to you more. But I think it is worthwhile for someone to be doing the thing I ask. It won’t take much effort.

mitchell_porter on Panology

What's the difference between "panology" and "science"?

wassname on What o3 Becomes by 2028

Peak Data

We don't know how o3 works, but we can speculate. If it's like the open source huggingface kinda-replication then it uses all kinds of expensive methods to make the next level of reward model, and this model teaches a simpler student model. That means that the expensive methods are only needed once, during the training.

In other words, you use all kinds of expensive methods (process supervision, test time compute, MCTS) to bootstrap the next level of labels/supervision, which teaches a cheaper student model. This is essentially bootstrapping superhuman synthetic data/supervision.

o3 seems to have shown that this bootstrapping process can be repeated beyond the limits of human training data.

If this is true, we've reached peak cheap data. Not peak data.

jmh on When Is Insurance Worth It?

Nice write-up. It sheds some light on something I think I have been doing intuitively without quite realizing it, particularly the impact on the growth of wealth.

I was thinking that a big challenge for a lot of people is the estimated distribution - which is likely why so many non-technical rationales are given by many people. Trying to assess that is hard and requires a lot of information about a lot of things -- something the insurance companies can do (as suggested by another comment) but probably overwhelms most people who buy insurance.

With that thought, I was wondering if anyone has thought of shifting the equations a bit. Rather than working up an estimate of the probability space, why not put together an equation that churns out the implied probabilities for the break-even case, given W, P, d_i and c_i? I think most people would be able to digest that: event x_i has implied probability p_i, event x_j has implied probability p_j. The person can then consider whether those probabilities actually make sense for them and their situation.

Clearly, it could not be an exhaustive listing of events but I would think a table of three or four of the main events that carry the greatest losses would be a good starting point for most people.
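A minimal sketch of the break-even calculation suggested above, under assumptions that are not stated in the comment: utility is log-wealth, W is current wealth, P is the premium, d_i is the deductible and c_i is the uninsured cost of event i, and each event is treated on its own as a single yes/no loss. The function name and the example events are hypothetical.

```python
import math


def break_even_probability(wealth, premium, deductible, cost):
    """Event probability at which insured and uninsured expected log-wealth are equal.

    Assumed model (single event, log utility):
      insured:   p*log(W - P - d) + (1 - p)*log(W - P)
      uninsured: p*log(W - c)     + (1 - p)*log(W)
    Setting the two equal and solving for p gives the ratio below.
    """
    if not (0 < premium and 0 <= deductible and premium + deductible < wealth):
        raise ValueError("premium plus deductible must be positive and below wealth")
    if not (0 < cost < wealth):
        raise ValueError("cost must be positive and below wealth")

    numerator = math.log(wealth) - math.log(wealth - premium)
    denominator = (
        math.log(wealth - premium - deductible)
        - math.log(wealth - premium)
        - math.log(wealth - cost)
        + math.log(wealth)
    )
    if denominator <= 0:
        raise ValueError("insurance never breaks even for these inputs")
    return numerator / denominator


if __name__ == "__main__":
    # Hypothetical events: (label, deductible d_i, uninsured cost c_i).
    events = [("minor damage", 1_000, 20_000), ("major damage", 1_000, 60_000)]
    for label, d_i, c_i in events:
        p_i = break_even_probability(100_000, 500, d_i, c_i)
        print(f"{label}: implied break-even probability {p_i:.4f}")
```

A result above 1 means that, under these assumptions, the premium is too high for the policy to break even at any probability. If the reader believes the true probability of an event is higher than the implied value, the policy looks worthwhile under this model; if lower, it does not.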

quila on What Goes Without Saying

I don't understand your objection.

I believe that persuasion should happen on merits of arguments, and that trying to activate the social biases of the reader is defecting from that norm (even if it's normal writing practice elsewhere).

Looking at the points of view espoused, they seem to be quite positive for their adherents.

There's no way to ensure this would only be done with positive views, because many authors think their beliefs would be positive to spread.

localdeity on StartAtTheEnd's Shortform

Some related scenarios are discussed in my post here, e.g. when popularity ≈ beauty + substance, and if popularity and beauty are readily apparent then you can estimate substance.

gordon-seidoh-worley on Vegans need to eat just enough Meat - emperically evaluate the minimum ammount of meat that maximizes utility

Just to verify, you were also eating rice with those lentils? I'd expect to be differently protein deficient if you only eat lentils. The right combo is beans and rice (or another grain).