LessWrong 2.0 Reader

Chris Olah’s views on AGI safety
evhub · 2019-11-01T20:13:35.210Z · comments (38)
Worlds Where Iterative Design Fails
johnswentworth · 2022-08-30T20:48:29.025Z · comments (30)
The Treacherous Path to Rationality
Jacob Falkovich (Jacobian) · 2020-10-09T15:34:17.490Z · comments (116)
How To Go From Interpretability To Alignment: Just Retarget The Search
johnswentworth · 2022-08-10T16:08:11.402Z · comments (34)
Giant (In)scrutable Matrices: (Maybe) the Best of All Possible Worlds
1a3orn · 2023-04-04T17:39:39.720Z · comments (38)
[link] What TMS is like
Sable · 2024-10-31T00:44:22.612Z · comments (23)
Mechanistically Eliciting Latent Behaviors in Language Models
Andrew Mack (andrew-mack) · 2024-04-30T18:51:13.493Z · comments (42)
Hiring engineers and researchers to help align GPT-3
paulfchristiano · 2020-10-01T18:54:23.551Z · comments (13)
Common knowledge about Leverage Research 1.0
BayAreaHuman · 2021-09-24T06:56:14.729Z · comments (212)
Evolution provides no evidence for the sharp left turn
Quintin Pope (quintin-pope) · 2023-04-11T18:43:07.776Z · comments (65)
My current LK99 questions
Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2023-08-01T22:48:00.733Z · comments (38)
UDT shows that decision theory is more puzzling than ever
Wei Dai (Wei_Dai) · 2023-09-13T12:26:09.739Z · comments (55)
Call For Distillers
johnswentworth · 2022-04-04T18:25:34.942Z · comments (43)
Yudkowsky and Christiano discuss "Takeoff Speeds"
Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2021-11-22T19:35:27.657Z · comments (176)
Funny Anecdote of Eliezer From His Sister
Noah Birnbaum (daniel-birnbaum) · 2024-04-22T22:05:31.886Z · comments (6)
Lightcone Infrastructure/LessWrong is looking for funding
habryka (habryka4) · 2023-06-14T04:45:53.425Z · comments (39)
Causal Scrubbing: a method for rigorously testing interpretability hypotheses [Redwood Research]
LawrenceC (LawChan) · 2022-12-03T00:58:36.973Z · comments (35)
Labs should be explicit about why they are building AGI
peterbarnett · 2023-10-17T21:09:20.711Z · comments (18)
Some AI research areas and their relevance to existential safety
Andrew_Critch · 2020-11-19T03:18:22.741Z · comments (37)
OpenAI: Fallout
Zvi · 2024-05-28T13:20:04.325Z · comments (25)
Frontier Models are Capable of In-context Scheming
Marius Hobbhahn (marius-hobbhahn) · 2024-12-05T22:11:17.320Z · comments (23)
[link] Jaan Tallinn's 2023 Philanthropy Overview
jaan · 2024-05-20T12:11:39.416Z · comments (5)
Toward A Mathematical Framework for Computation in Superposition
Dmitry Vaintrob (dmitry-vaintrob) · 2024-01-18T21:06:57.040Z · comments (18)
Benign Boundary Violations
Duncan Sabien (Deactivated) (Duncan_Sabien) · 2022-05-26T06:48:35.585Z · comments (84)
The Sun is big, but superintelligences will not spare Earth a little sunlight
Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2024-09-23T03:39:16.243Z · comments (141)
If interpretability research goes well, it may get dangerous
So8res · 2023-04-03T21:48:18.752Z · comments (11)
Pay Risk Evaluators in Cash, Not Equity
Adam Scholl (adam_scholl) · 2024-09-07T02:37:59.659Z · comments (19)
Long Covid Is Not Necessarily Your Biggest Problem
Elizabeth (pktechgirl) · 2021-09-01T07:20:05.374Z · comments (40)
Communications in Hard Mode (My new job at MIRI)
tanagrabeast · 2024-12-13T20:13:44.825Z · comments (25)
Consciousness as a conflationary alliance term for intrinsically valued internal experiences
Andrew_Critch · 2023-07-10T08:09:48.881Z · comments (53)
Making a conservative case for alignment
Cameron Berg (cameron-berg) · 2024-11-15T18:55:40.864Z · comments (68)
The Hopium Wars: the AGI Entente Delusion
Max Tegmark (MaxTegmark) · 2024-10-13T17:00:29.033Z · comments (55)
What does it take to defend the world against out-of-control AGIs?
Steven Byrnes (steve2152) · 2022-10-25T14:47:41.970Z · comments (47)
I Converted Book I of The Sequences Into A Zoomer-Readable Format
dkirmani · 2022-11-10T02:59:04.236Z · comments (32)
Simulacra Levels and their Interactions
Zvi · 2020-06-15T13:10:00.717Z · comments (51)
We're Not Ready: thoughts on "pausing" and responsible scaling policies
HoldenKarnofsky · 2023-10-27T15:19:33.757Z · comments (33)
What's So Bad About Ad-Hoc Mathematical Definitions?
johnswentworth · 2021-03-15T21:51:53.242Z · comments (58)
A concrete bet offer to those with short AGI timelines
Matthew Barnett (matthew-barnett) · 2022-04-09T21:41:45.106Z · comments (120)
The Singular Value Decompositions of Transformer Weight Matrices are Highly Interpretable
beren · 2022-11-28T12:54:52.399Z · comments (33)
Maybe Anthropic's Long-Term Benefit Trust is powerless
Zach Stein-Perlman · 2024-05-27T13:00:47.991Z · comments (21)
Optimistic Assumptions, Longterm Planning, and "Cope"
Raemon · 2024-07-17T22:14:24.090Z · comments (46)
[link] Why haven't we celebrated any major achievements lately?
jasoncrawford · 2020-08-17T20:34:20.084Z · comments (69)
Almost everyone should be less afraid of lawsuits
alyssavance · 2021-11-27T02:06:52.176Z · comments (18)
[link] Sam Altman’s Chip Ambitions Undercut OpenAI’s Safety Strategy
garrison · 2024-02-10T19:52:55.191Z · comments (52)
Shoulder Advisors 101
Duncan Sabien (Deactivated) (Duncan_Sabien) · 2021-10-09T05:30:57.372Z · comments (124)
Do a cost-benefit analysis of your technology usage
TurnTrout · 2022-03-27T23:09:26.753Z · comments (53)
GPT-4 Plugs In
Zvi · 2023-03-27T12:10:00.926Z · comments (47)
Brain Efficiency: Much More than You Wanted to Know
jacob_cannell · 2022-01-06T03:38:00.320Z · comments (102)
Attempted Gears Analysis of AGI Intervention Discussion With Eliezer
Zvi · 2021-11-15T03:50:01.141Z · comments (49)
Seeing the Smoke
Jacob Falkovich (Jacobian) · 2020-02-28T18:26:58.839Z · comments (29)