LessWrong 2.0 Reader

Fifteen Lawsuits against OpenAI
Remmelt (remmelt-ellen) · 2024-03-09T12:22:09.715Z · comments (4)

[link] Goodhart's Law Example: Training Verifiers to Solve Math Word Problems
Chris_Leong · 2023-11-25T00:53:26.841Z · comments (2)

A short dialogue on comparability of values
cousin_it · 2023-12-20T14:08:29.650Z · comments (7)

An Affordable CO2 Monitor
Pretentious Penguin (dylan-mahoney) · 2024-03-21T03:06:53.255Z · comments (1)

Am I going insane or is the quality of education at top universities shockingly low?
ChrisRumanov (pseudonymous-ai) · 2023-11-20T03:53:30.056Z · comments (30)

[link] Attention on AI X-Risk Likely Hasn't Distracted from Current Harms from AI
Erich_Grunewald · 2023-12-21T17:24:16.713Z · comments (2)

Ideas for Next-Generation Writing Platforms, using LLMs
ozziegooen · 2024-06-04T18:40:24.636Z · comments (4)

Smartphone Etiquette: Suggestions for Social Interactions
Declan Molony (declan-molony) · 2024-06-04T06:01:03.336Z · comments (4)

Why I think it's net harmful to do technical safety research at AGI labs
Remmelt (remmelt-ellen) · 2024-02-07T04:17:15.246Z · comments (24)

AI debate: test yourself against chess 'AIs'
Richard Willis · 2023-11-22T14:58:10.847Z · comments (35)

Links and brief musings for June
Kaj_Sotala · 2024-07-06T10:10:03.344Z · comments (0)

[link] Emotional issues often have an immediate payoff
Chipmonk · 2024-06-10T23:39:40.697Z · comments (2)

[link] my favourite Scott Sumner blog posts
DMMF · 2024-06-11T14:40:43.093Z · comments (0)

Essaying Other Plans
Screwtape · 2024-03-06T22:59:06.240Z · comments (4)

Exploring OpenAI's Latent Directions: Tests, Observations, and Poking Around
Johnny Lin (hijohnnylin) · 2024-01-31T06:01:27.969Z · comments (4)

Facebook is Paying Me to Post
jefftk (jkaufman) · 2023-11-14T19:10:07.303Z · comments (5)

Singular learning theory and bridging from ML to brain emulations
kave · 2023-11-01T21:31:54.789Z · comments (16)

The Limitations of GPT-4
p.b. · 2023-11-24T15:30:30.933Z · comments (12)

A list of all the deadlines in Biden's Executive Order on AI
Valentin Baltadzhiev (valentin-baltadzhiev) · 2023-11-01T17:14:31.074Z · comments (2)

[link] How to Upload a Mind (In Three Not-So-Easy Steps)
aggliu · 2023-11-13T18:13:32.893Z · comments (0)

The Sequences on YouTube
Neil (neil-warren) · 2024-01-07T01:44:39.663Z · comments (9)

[link] Agreeing With Stalin in Ways That Exhibit Generally Rationalist Principles
Zack_M_Davis · 2024-03-02T22:05:49.553Z · comments (22)

Evaluating Solar
jefftk (jkaufman) · 2024-02-17T21:50:04.783Z · comments (5)

Losing Metaphors: Zip and Paste
jefftk (jkaufman) · 2023-11-29T20:31:07.464Z · comments (6)

Three Types of Constraints in the Space of Agents
Nora_Ammann · 2024-01-15T17:27:27.560Z · comments (3)

Taking Into Account Sentient Non-Humans in AI Ambitious Value Learning: Sentientist Coherent Extrapolated Volition
Adrià Moret (Adrià R. Moret) · 2023-12-02T14:07:29.992Z · comments (31)

AI #57: All the AI News That’s Fit to Print
Zvi · 2024-03-28T11:40:05.435Z · comments (14)

How do LLMs give truthful answers? A discussion of LLM vs. human reasoning, ensembles & parrots
Owain_Evans · 2024-03-28T02:34:21.799Z · comments (0)

Geometric Utilitarianism (And Why It Matters)
StrivingForLegibility · 2024-05-12T03:41:21.342Z · comments (2)

Quick takes on "AI is easy to control"
So8res · 2023-12-02T22:31:45.683Z · comments (49)

Sleeping on Stage
jefftk (jkaufman) · 2024-10-22T00:50:07.994Z · comments (3)

Do Sparse Autoencoders (SAEs) transfer across base and finetuned language models?
Taras Kutsyk · 2024-09-29T19:37:30.465Z · comments (7)

SAE features for refusal and sycophancy steering vectors
neverix · 2024-10-12T14:54:48.022Z · comments (4)

[link] Can a Bayesian Oracle Prevent Harm from an Agent? (Bengio et al. 2024)
mattmacdermott · 2024-09-01T07:46:26.647Z · comments (0)

[question] Seeking AI Alignment Tutor/Advisor: $100–150/hr
MrThink (ViktorThink) · 2024-10-05T21:28:16.491Z · answers+comments (3)

[link] Arithmetic Models: Better Than You Think
kqr · 2024-10-26T09:42:07.185Z · comments (4)

Why is there Nothing rather than Something?
Logan Zoellner (logan-zoellner) · 2024-10-26T12:37:50.204Z · comments (3)

The causal backbone conjecture
tailcalled · 2024-08-17T18:50:14.577Z · comments (0)

[link] Positive visions for AI
L Rudolf L (LRudL) · 2024-07-23T20:15:26.064Z · comments (4)

Talk: AI safety fieldbuilding at MATS
Ryan Kidd (ryankidd44) · 2024-06-23T23:06:37.623Z · comments (2)

Optimizing Repeated Correlations
SatvikBeri · 2024-08-01T17:33:23.823Z · comments (1)

Agent membranes/boundaries and formalizing “safety”
Chipmonk · 2024-01-03T17:55:21.018Z · comments (46)

Causality is Everywhere
silentbob · 2024-02-13T13:44:49.952Z · comments (12)

[question] How are you preparing for the possibility of an AI bust?
Nate Showell · 2024-06-23T19:13:45.247Z · answers+comments (16)

[question] Thoughts on Francois Chollet's belief that LLMs are far away from AGI?
O O (o-o) · 2024-06-14T06:32:48.170Z · answers+comments (17)

[link] Let's Design A School, Part 2.1 School as Education - Structure
Sable · 2024-05-02T22:04:30.435Z · comments (2)

Meetup In a Box: Year In Review
Czynski (JacobKopczynski) · 2024-02-14T01:18:28.259Z · comments (0)

What is the best argument that LLMs are shoggoths?
JoshuaFox · 2024-03-17T11:36:23.636Z · comments (22)

D&D.Sci Hypersphere Analysis Part 3: Beat it with Linear Algebra
aphyer · 2024-01-16T22:44:52.424Z · comments (1)

Vote in the LessWrong review! (LW 2022 Review voting phase)
habryka (habryka4) · 2024-01-17T07:22:17.921Z · comments (9)


Recent comments

tropicalfruit on UFO Betting: Put Up or Shut Up

As someone who's gambled professionally, I believe the (Chesterton's) fence around betting for normies exists because most bets are essentially scams, which is why I'm entirely okay knocking it down for LWers. Let me elaborate.

Probability is complicated and abstract. Not only that, human intuition is really bad at it. Nearly all "bets" throughout our modern history have not been the kind of skin-in-the-game prediction competition we're praising on LessWrong; they've been predatory: one person who understands probability using emotional and logical manipulation to take money from someone who doesn't.

Society protects people with taboos. "Betting is icky" is a meme that can easily spread, and will quickly reproduce, because it's adaptive in this betting environment. [Dissertation about Bayesian reasoning, calibration, and the Kelly criterion] is NOT a meme that can easily spread, because it's far too complex and long, and thus it will not reproduce (even though it is also adaptive).

Or at least, it can't spread in the normie population, but it CAN on LessWrong, which is why, on LessWrong, most bets are not scams. They are, in fact, what the scammers falsely proclaimed their own bets to be: friendly competitions wherein two people who disagree about the future both put skin in the game.

The sportsbooks and casinos we have today are predators. From their celebrity endorsements, to the way they form their commercials, to their messaging around winning (and especially parlays), they effectively lie about what they're selling while trying to create addicts. I've engaged with many people across the betting experience spectrum (from other winners, to big losers, to smart people who were small losers and realized they needed to quit), and it's pretty clear to me that "betting = icky" is a reasonable idea, even today. The fence around it is not Chesterton's, though. It's there to help regular people avoid a certain species of predator gunning for their capital.

We can safely knock it down on here.

thedstrat on The U.S. is becoming less stable

It's interesting how nearly every point on your list is a direct or indirect result of something Donald Trump did.

tao-lin on The hostile telepaths problem

I'm often surprised how little people notice, adapt to, or even punish self-deception. It's not very hard to detect when someone's deceiving themselves; people should notice more and disincentivise that.

vladimir_nesov on A path to human autonomy

I do think that these things are relevant to 'compute it takes to get to a given capability level'.

In practice, there are no 2e23 FLOPs models that cost $300K to train that are anywhere close to Llama-3-405B smart. If there were such models at leading labs (based on unpublished experimental results and more algorithmic insights), they would be much smarter than Llama-3-405B when trained with 8e25 FLOPs they have to give, rather than the reference 2e23 FLOPs. Better choice of ways of answering questions doesn't get us far in the actual technical capabilities.

(Post-training like o1 is a kind of "better choice of ways of answering questions" that might help, but we don't know how much compute it saves. Noam Brown gestures at 100,000x from his earlier work, but we haven't seen Llama 4 yet, it might just spontaneously become capable of coherent long reasoning traces as a result of more scale, the bitter lesson making Strawberry Team's efforts moot.)

Many improvements observed at smaller scale disappear at greater scale, or don't stack with each other. Many papers have horrible methodologies, plausibly born of scarcity of research compute, that don't even try (or make it possible) to estimate the compute multiplier. Most of them will be eventually forgotten, for a good reason. So most papers that seem to demonstrate improvements are not strong evidence for the hypothesis of a 1000x cumulative compute efficiency improvement, while this hypothesis predicts observations about what's actually already possible in practice that we are not getting, strong evidence against it. There are multiple competent teams that don't have Microsoft compute, and they don't win over Llama-3-405B, which we know doesn't have all of these speculative algorithmic improvements and uses 4e25 FLOPs (2.5 months on 16K H100s rather than 1.5 months on 128 H100s for 2e23 FLOPs).

In other words, the importance of Llama-3-405B for the question about speculative algorithmic improvements is that the detailed report shows it has no secret sauce, it merely competently uses about as much compute as the leading labs in very conservative ways. And yet it's close in capabilities to all the other frontier models. Which means the leading labs don't have significantly effective secret sauce either, which means nobody does, since the leading labs would've already borrowed it if it was that effective.

There's clearly a case in principle for it being possible to learn with much less data, anchoring to humans blind from birth. But there's probably much more compute happening in a human brain per the proverbial external data token. And a human has the advantage of not learning everything about everything, with greater density of capability over encyclopedic knowledge, which should help save on compute.
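The compute figures quoted in this comment (about 4e25 FLOPs for 2.5 months on 16K H100s, versus about 2e23 FLOPs for 1.5 months on 128 H100s) can be sanity-checked with a back-of-the-envelope estimate. The sketch below is not from the comment: it assumes a dense BF16 peak of roughly 989 TFLOP/s per H100, 40% hardware utilization, 30-day months, and that "16K" means 16,384 GPUs. The function name `training_flops` is hypothetical.

```python
# Rough training-compute estimate: GPUs x time x peak throughput x utilization.
# All constants below are assumptions for illustration, not figures from the comment.

SECONDS_PER_MONTH = 30 * 24 * 3600   # assumed 30-day month
H100_PEAK_FLOPS = 989e12             # assumed dense BF16 peak per H100, FLOP/s
ASSUMED_UTILIZATION = 0.40           # assumed fraction of peak actually achieved


def training_flops(num_gpus: int, months: float,
                   peak_flops: float = H100_PEAK_FLOPS,
                   utilization: float = ASSUMED_UTILIZATION) -> float:
    """Estimate total training FLOPs for a run on `num_gpus` GPUs lasting `months`."""
    seconds = months * SECONDS_PER_MONTH
    return num_gpus * seconds * peak_flops * utilization


# The two runs compared in the comment ("16K" taken as 16,384 GPUs).
large_run = training_flops(num_gpus=16_384, months=2.5)
small_run = training_flops(num_gpus=128, months=1.5)

print(f"large run: {large_run:.1e} FLOPs")
print(f"small run: {small_run:.1e} FLOPs")
print(f"ratio:     {large_run / small_run:.0f}x")
```

Under these assumptions the two estimates land near the quoted 4e25 and 2e23 FLOPs, a gap of roughly 200x. Changing the utilization or peak-throughput assumption rescales both numbers equally, so the ratio depends only on GPU count and duration.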

_will_ on MIRI 2024 Communications Strategy

Thanks, that’s helpful!

(Fwiw, I don’t find the ‘caring a tiny bit’ story very reassuring, for the same reasons as Wei Dai, although I do find the acausal trade story for why humans might be left with Earth somewhat heartening. (I’m assuming that by ‘game-theoretic reasons’ you mean acausal trade.))

habryka4 on MIRI 2024 Communications Strategy

(My model of Daniel thinks the AI will likely take over, but probably will give humanity some very small fraction of the universe, for a mixture of "caring a tiny bit" and game-theoretic reasons)

_will_ on MIRI 2024 Communications Strategy

I don't think [AGI/ASI] literally killing everyone is the most likely outcome

Huh, I was surprised to read this. I’ve imbibed a non-trivial fraction of your posts and comments here on LessWrong, and my shoulder Daniel, as of before reading the above, definitely saw extinction as the most likely existential catastrophe.

If you have the time, I’d be very interested to hear what you do think is the most likely outcome. (It’s very possible that you have written about this before and I missed it—my bad, if so.)

habryka4 on Habryka's Shortform Feed

The "Recommended" tab filters out read posts by default. We never had much demand for showing recently-sorted posts while filtering out only ones you've read, but it wouldn't be very hard to build.

Not sure what you mean by "load more at once". We could add a whole user setting to allow users to change the number of posts on the frontpage, but done consistently that would produce a ginormous number of user settings for everything, which would be a pain to maintain (not like, overwhelmingly so, but I would be surprised if it was worth the cost).

nathan-helm-burger on Habryka's Shortform Feed

Just want to chime in with agreement about annoyance over the prioritization of post headlines. One thing in particular that annoys me is that I haven't figured out how to toggle off 'seen' posts showing up. What if I just want to see unread ones?

Also, why can't I load more at once instead of always having to click 'load more'?

nathan-helm-burger on A path to human autonomy

I thought you might say that some of these weren't relevant to the metric of compute efficiency you had in mind. I do think that these things are relevant to 'compute it takes to get to a given capability level'.

Of course, what's actually more important even than an improvement to training efficiency is an improvement to peak capability. I would argue that if Yi-Lightning, for example, had a better architecture than it does in terms of peak capability, then the gains from the additional training it was given would have been larger. There wouldn't have been so much decreasing return to overtraining.

If it were possible to just keep training an existing transformer and have it keep getting smarter at a decent rate, then we'd probably be at AGI already. Just train GPT-4 10x as long.

I think a lot of people are seeing ways in which something about the architecture and/or training regime isn't quite working for some key aspects of general intelligence, particularly reasoning and hyperpolation.

Some relevant things I have read:

reasoning limitations: https://arxiv.org/abs/2406.06489

hyperpolation: https://arxiv.org/abs/2409.05513

detailed analysis of logical errors made: https://www.youtube.com/watch?v=bpp6Dz8N2zY

Some relevant-seeming things I haven't yet read, where researchers are attempting to analyze or improve LLM reasoning:

https://arxiv.org/abs/2407.02678

https://arxiv.org/html/2406.11698v1

https://arxiv.org/abs/2402.11804

https://arxiv.org/abs/2401.14295

https://arxiv.org/abs/2405.15302

https://openreview.net/forum?id=wUU-7XTL5XO

https://arxiv.org/abs/2406.09308

https://arxiv.org/abs/2404.05221

https://arxiv.org/abs/2405.18512