LessWrong 2.0 Reader

(The) Lightcone is nothing without its people: LW + Lighthaven's big fundraiser
habryka (habryka4) · 2024-11-30T02:55:16.077Z · comments (214)

OpenAI Email Archives (from Musk v. Altman and OpenAI blog)
habryka (habryka4) · 2024-11-16T06:38:03.937Z · comments (80)

Alignment Faking in Large Language Models
ryan_greenblatt · 2024-12-18T17:19:06.665Z · comments (53)

The hostile telepaths problem
Valentine · 2024-10-27T15:26:53.610Z · comments (84)

[link] Survival without dignity
L Rudolf L (LRudL) · 2024-11-04T02:29:38.758Z · comments (28)

[link] I got dysentery so you don’t have to
eukaryote · 2024-10-22T04:55:58.422Z · comments (4)

[link] Biological risk from the mirror world
jasoncrawford · 2024-12-12T19:07:06.305Z · comments (30)

Overview of strong human intelligence amplification methods
TsviBT · 2024-10-08T08:37:18.896Z · comments (141)

The Great Data Integration Schlep
sarahconstantin · 2024-09-13T15:40:02.298Z · comments (16)

The Best Lay Argument is not a Simple English Yud Essay
J Bostock (Jemist) · 2024-09-10T17:34:28.422Z · comments (15)

The Online Sports Gambling Experiment Has Failed
Zvi · 2024-11-11T14:30:04.371Z · comments (30)

Laziness death spirals
PatrickDFarley · 2024-09-19T15:58:30.252Z · comments (36)

the case for CoT unfaithfulness is overstated
nostalgebraist · 2024-09-29T22:07:54.053Z · comments (40)

The Field of AI Alignment: A Postmortem, and What To Do About It
johnswentworth · 2024-12-26T18:48:07.614Z · comments (51)

[link] Explore More: A Bag of Tricks to Keep Your Life on the Rails
Shoshannah Tekofsky (DarkSym) · 2024-09-28T21:38:52.256Z · comments (15)

You are not too "irrational" to know your preferences.
DaystarEld · 2024-11-26T15:01:42.996Z · comments (50)

"Slow" takeoff is a terrible term for "maybe even faster takeoff, actually"
Raemon · 2024-09-28T23:38:25.512Z · comments (69)

Ayn Rand’s model of “living money”; and an upside of burnout
AnnaSalamon · 2024-11-16T02:59:07.368Z · comments (58)

What Goes Without Saying
sarahconstantin · 2024-12-20T18:00:06.363Z · comments (11)

[link] What TMS is like
Sable · 2024-10-31T00:44:22.612Z · comments (23)

The Sun is big, but superintelligences will not spare Earth a little sunlight
Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2024-09-23T03:39:16.243Z · comments (141)

Frontier Models are Capable of In-context Scheming
Marius Hobbhahn (marius-hobbhahn) · 2024-12-05T22:11:17.320Z · comments (24)

Making a conservative case for alignment
Cameron Berg (cameron-berg) · 2024-11-15T18:55:40.864Z · comments (68)

The Hopium Wars: the AGI Entente Delusion
Max Tegmark (MaxTegmark) · 2024-10-13T17:00:29.033Z · comments (55)

[link] Understanding Shapley Values with Venn Diagrams
Carson L · 2024-12-06T21:56:43.960Z · comments (32)

Orienting to 3 year AGI timelines
Nikola Jurkovic (nikolaisalreadytaken) · 2024-12-22T01:15:11.401Z · comments (34)

[link] The Compendium, A full argument about extinction risk from AGI
adamShimi · 2024-10-31T12:01:51.714Z · comments (52)

Cryonics is free
Mati_Roy (MathieuRoy) · 2024-09-29T17:58:17.108Z · comments (42)

Communications in Hard Mode (My new job at MIRI)
tanagrabeast · 2024-12-13T20:13:44.825Z · comments (24)

A basic systems architecture for AI agents that do autonomous research
Buck · 2024-09-23T13:58:27.185Z · comments (15)

[link] Why I’m not a Bayesian
Richard_Ngo (ricraz) · 2024-10-06T15:22:45.644Z · comments (92)

Skills from a year of Purposeful Rationality Practice
Raemon · 2024-09-18T02:05:58.726Z · comments (18)

Information vs Assurance
johnswentworth · 2024-10-20T23:16:25.762Z · comments (17)

Contra papers claiming superhuman AI forecasting
nikos (followtheargument) · 2024-09-12T18:10:50.582Z · comments (16)

[question] Why is o1 so deceptive?
abramdemski · 2024-09-27T17:27:35.439Z · answers+comments (24)

Struggling like a Shadowmoth
Raemon · 2024-09-24T00:47:05.030Z · comments (38)

Did Christopher Hitchens change his mind about waterboarding?
Isaac King (KingSupernova) · 2024-09-15T08:28:09.451Z · comments (22)

Three Subtle Examples of Data Leakage
abstractapplic · 2024-10-01T20:45:27.731Z · comments (16)

My motivation and theory of change for working in AI healthtech
Andrew_Critch · 2024-10-12T00:36:30.925Z · comments (37)

[link] Overcoming Bias Anthology
Arjun Panickssery (arjun-panickssery) · 2024-10-20T02:01:23.463Z · comments (14)

The Median Researcher Problem
johnswentworth · 2024-11-02T20:16:11.341Z · comments (69)

o1 is a bad idea
abramdemski · 2024-11-11T21:20:24.892Z · comments (38)

The Summoned Heroine's Prediction Markets Keep Providing Financial Services To The Demon King!
abstractapplic · 2024-10-26T12:34:51.059Z · comments (16)

Neutrality
sarahconstantin · 2024-11-13T23:10:05.469Z · comments (27)

Current safety training techniques do not fully transfer to the agent setting
Simon Lermen (dalasnoin) · 2024-11-03T19:24:51.537Z · comments (8)

[link] When Is Insurance Worth It?
kqr · 2024-12-19T19:07:32.573Z · comments (57)

My takes on SB-1047
leogao · 2024-09-09T18:38:37.799Z · comments (8)

[link] Gradient Routing: Masking Gradients to Localize Computation in Neural Networks
cloud · 2024-12-06T22:19:26.717Z · comments (12)

"It's a 10% chance which I did 10 times, so it should be 100%"
egor.timatkov · 2024-11-18T01:14:27.738Z · comments (57)

o3
Zach Stein-Perlman · 2024-12-20T18:30:29.448Z · comments (148)

Recent comments

viliam on If all trade is voluntary, then what is "exploitation?"

After reading your and FlorianH's comments, it seems to me that Econ 101 leaves it underspecified what it means to be an economical agent, and that those parts missing from the specification are the ones that matter here.

Naively, an economical agent is someone who accepts deals that increase the value they get. There seems to be nothing wrong with that; if we all become economical agents, our values will increase, which is a good thing.

But this is not the entire story. No agent accepts all hypothetical deals that would increase their value. Our attention and time are limited. We pick the seemingly best deals we are aware of. And there are probably other heuristics that successful agents follow, such as increasing their power, even if it does not increase the value in the short term, because it will allow them to take more value in the future.

People who insist on taking the Econ 101 perspective of "if the deal is not good for you, then simply don't take it, duh" seem willfully blind to how power is strategically gained and used.

This reminds me of Ayn Rand's novels. Both the heroes and the villains could be called "economical agents" from a certain perspective, but clearly they used different strategies. There is a difference between someone trying to get good deals without simultaneously crippling their trade partners, and someone for whom crippling their trade partners is an important component of how they get the good deals. Both of them are agents participating in the economy.

christopher-king on The Field of AI Alignment: A Postmortem, and What To Do About It

I think there is an obvious signal that could be used: a forecast of how much MIRI will like the research when asked in 5 years. (Note that I don't mean just asking MIRI now, but rather something like prediction markets or super-forecasters to predict what MIRI will say 5 years from now.)

Basically, if the forecast is above average, anyone who trusts MIRI should fund them.

sharmake-farah on The Field of AI Alignment: A Postmortem, and What To Do About It

Re the OpenAI o-series and search, my initial prediction is that Q*/MCTS search will work well on problems that are easy to verify and easy to get training data for, and not work if either of these 2 conditions is violated, and secondarily will be reliant on the model having good error correction capabilities to use the search effectively, which is why I expect we can make RL capable of superhuman performance on mathematics/programming with some rather moderate schlep/drudge work, and I also expect cost reductions such that it can actually be practical, but I'm only giving a 50/50 chance by 2028 for superhuman performance as measured by benchmarks in these domains.

I think my main difference from you, Thane Ruthenis, is that I expect costs to fall surprisingly rapidly, though this is admittedly untested.

This will accelerate AI progress, but not immediately cause an AI explosion, though in the more extreme paces this could create something like a scenario where programming companies are founded by a few people smartly managing a lot of programming AIs, and programming/mathematics experiencing something like what happened to the news industry from the rise of the internet, where there was a lot of bankruptcy of the middle end, the top end won big, and most people are in the bottom end.

Also, correct point on how a lot of people's conceptions of search are babble-and-prune, not top down search like MCTS/Q*/BFS/DFS/A* (not specifically targeted at sunwillrisee

By contrast, my understanding is that the sort of search John is talking about retargeting isn't the brute-force babble-and-prune algorithms, but a top-down heuristical-constraint-based search.

cbiddulph on The Field of AI Alignment: A Postmortem, and What To Do About It

Maybe someone else could moderate it?

dagon on What's the best metric for measuring quality of life?

There's no good candidate for a simple, legible, easily-obtained, and agreeable-to-most metric. Before-and-after polling of patients is probably the closest we can get.

That said, the dimensions of quality that the FDA concerns itself with (including physical functioning, self-reported pain, and other easily- and not-easily-measured things) are likely close enough to "improves quality of life" that it's not necessary to have a new direction.

Perhaps you could identify some drugs that you think would improve quality of life, and work backwards to the metrics that prove to you that they do so.

archimedes on Letter from an Alien Mind

this concern sounds like someone walking down a straight road and then closing their eyes cause they know where they want to go anyway

This doesn't sound like a good analogy at all. A better analogy might be a stylized subway map compared to a geographically accurate one. Sometimes removing detail can make it easier to process.

logan-zoellner on The Field of AI Alignment: A Postmortem, and What To Do About It

A policeman sees a drunk man searching for something under a streetlight and asks what the drunk has lost. He says he lost his keys and they both look under the streetlight together. After a few minutes the policeman asks if he is sure he lost them here, and the drunk replies, no, and that he lost them in the park. The policeman asks why he is searching here, and the drunk replies, "this is where the light is".

I've always been sympathetic to the drunk in this story. If the key is in the light, there is a chance of finding it. If it is in the dark, he's not going to find it anyway so there isn't much point in looking there.

Given the current state of alignment research, I think it's fair to say that we don't know where the answer will come from. I support The Plan and I hope research continues on it. But if I had to guess, alignment will not be solved via getting a bunch of physicists thinking about agent foundations. It will be solved by someone who doesn't know better making a discovery that "wasn't supposed to work".

On an interesting side note, here is a fun story about experts repeatedly failing to make an obvious-in-hindsight discovery because they "knew better".

viliam on If all trade is voluntary, then what is "exploitation?"

Seems to me that there is always some friction, some lack of information, etc., so "this requires market failure by definition" basically means "this happens in the real world".

chris_leong on The Field of AI Alignment: A Postmortem, and What To Do About It

Agreed. Simply focusing on physics post-docs feels too narrow to me.

Then again, just as John has a particular idea of what good alignment research looks like, I have my own idea: I would lean towards recruiting folk with both a technical and a philosophical background. It's possible that my own idea is just as narrow.

johnburidan on JohnBuridan's Shortform

You can buy bulk tamiflu powder here: https://www.selleckchem.com/products/oseltamivir-phosphate-Tamiflu.html And instructions for preparation are here: https://dph.illinois.gov/content/dam/soi/en/web/idph/files/publications/tami-flu-flyer-050316.pdf $300 for 13 doses, which would last you about 6 days, dosing the recommended twice daily during a pandemic. Shelf life of the powder is 2 years. Would any medicine nerd sanity check that I am not missing something essential?