Davidmanheim's Shortform

davidmanheim

Davidmanheim's Shortform

post by Davidmanheim · 2025-01-16T08:23:40.952Z · LW · GW · 7 comments

7 comments

7 comments

Comments sorted by top scores.

comment by Davidmanheim · 2025-03-31T19:12:40.261Z · LW(p) · GW(p)

Distinct views about LLM Intelligence

In discussing AI and existential risk, I’ve noticed that disagreements often hinge on views of AI capabilities—particularly whether large language models (LLMs) are "doing reasoning." This deserves clearer differentiation. Matt Yglesias once noted that “most optimists are not, fundamentally, AI optimists — they are superintelligence skeptics.” This feels deeply true, and it’s connected to the spectrum of beliefs about current AI systems’ reasoning abilities. To continue Ygelsias’ dichotomy, I’ll sketch a brief and uncharitable pastiche of each view.

The first, which I’ll (unfairly) call the stochastic parrot view, is that LLMs don’t actually understand. What models actually do is constructed based on a textual model of the world that has deeply confused relationships between different things. The successes based on various metrics like exams and reasoning tests are in large part artifacts of training on test data, i.e. leakage. Given that there is some leakage and data about all of the different types of evaluation which are performed on LLMs in the training data, along with substantive answers to similar or identical questions, statistics about the training data text, unrelated to understanding, leads to many correct answers. This success, however, is essentially entirely due to the cognitive labor of those whose input data was used, and not a successful world-model. The failures in doing slight variations of common tasks are largely an obvious result of this, since there isn’t an underlying coherent understanding.

The second, which I’ll (also unfairly) call the already AGI view, is that LLMs do have a largely coherent underlying model of the world. The textual data on which they are trained is rich enough to allow the model’s training to capture many true facts about the world, even though its implicit causal understanding is imperfect. The successes based on various metrics such as exams are similar to that of most humans who take exams, a combination of pattern matching based on training, and cached implicit models of what is expected. At least a large portion of LLM failures in doing slight variations of common tasks are because it is implicitly being asked to reproduce the typical class of answers to questions, and most humans pay only moderate attention and will make similar mistakes, especially in a single pass. The success of prompt design, such as “reason step-by-step,” is just telling the model to do what humans actually do. The abilities which are displayed are evidence that the first-pass failure is sloppiness, not inability.

I suspect that there are two stages of the crux between these views. The first is whether LLMs are doing “true” reasoning, and the second is whether humans are. I have mostly focused on the first, but to conclude, the second seems important as well - if there’s something called reasoning and humans don’t do it most of the time, and some humans don’t do at all - which some have claimed - then given how much these systems have memorized and their extant competencies, I’m not sure that this form of reasoning is needed for AGI, even if it’s needed for ASI.

Replies from: Mo Nastri, sharmake-farah

↑ comment by Mo Putera (Mo Nastri) · 2025-04-01T04:49:49.670Z · LW(p) · GW(p)

Just signal-boosting the obvious references to the second: Sarah Constantin's Humans Who Are Not Concentrating Are Not General Intelligences and Robin Hanson’s Better Babblers.

After eighteen years of being a professor, I’ve graded many student essays. And while I usually try to teach a deep structure of concepts, what the median student actually learns seems to mostly be a set of low order correlations. They know what words to use, which words tend to go together, which combinations tend to have positive associations, and so on. But if you ask an exam question where the deep structure answer differs from answer you’d guess looking at low order correlations, most students usually give the wrong answer.
Simple correlations also seem sufficient to capture most polite conversation talk, such as the weather is nice, how is your mother’s illness, and damn that other political party. Simple correlations are also most of what I see in inspirational TED talks, and when public intellectuals and talk show guests pontificate on topics they really don’t understand, such as quantum mechanics, consciousness, postmodernism, or the need always for more regulation everywhere. After all, media entertainers don’t need to understand deep structures any better than do their audiences.
Let me call styles of talking (or music, etc.) that rely mostly on low order correlations “babbling”. Babbling isn’t meaningless, but to ignorant audiences it often appears to be based on a deeper understanding than is actually the case. When done well, babbling can be entertaining, comforting, titillating, or exciting. It just isn’t usually a good place to learn deep insight.

It's unclear to me how much economically-relevant activity is generated by low order correlation-type reasoning, or whatever the right generalisation of "babbling" is here.

Replies from: Davidmanheim

↑ comment by Davidmanheim · 2025-04-01T17:44:56.237Z · LW(p) · GW(p)

Thank you, definitely agree about linking those as relevant.

It's unclear to me how much economically-relevant activity is generated by low order correlation-type reasoning

I think one useful question is whether babbling can work to prune, and it seems the answer from reasoning models is yes.

↑ comment by Noosphere89 (sharmake-farah) · 2025-04-01T19:13:26.363Z · LW(p) · GW(p)

My own take is that I'm fairly sympathetic to the "LLMs are already able to get to AGI" view, with the caveat that most of the difference between human and LLM learning where humans are superior than LLMs comes from being able to do meta-learning over long horizons, and we haven't yet been shown this is possible for LLMs to do purely by scaling compute.

Indeed, I think it's the entire crux of the scaling hypothesis debate, in whether scale enables meta-learning over longer and longer time periods:

https://www.lesswrong.com/posts/deesrjitvXM4xYGZd/metr-measuring-ai-ability-to-complete-long-tasks#hSkQG2N8rkKXosLEF [LW(p) · GW(p)]

comment by Davidmanheim · 2025-01-16T08:23:41.386Z · LW(p) · GW(p)

Toby Ord writes that “the required resources [for LLM training] grow polynomially with the desired level of accuracy [measured by log-loss].” He then concludes that this shows “very poor returns to scale,” and christens it the "Scaling Paradox." (He continues to point out that this doesn’t imply it can’t create superintelligence, but I agree with him about that.)

But what would it look like if this were untrue? That is, what would be the conceptual alternative, where required resources grow more slowly?I think the answer is that it’s conceptually impossible.

To start, there is a fundamental bound on loss at zero, since the best possible model perfectly predicts everything - it exactly learns the distribution. This can happen when overfitting a model, but it can also happen when there is a learnable ground truth; models that are trained to learn a polynomial function can learn them exactly.

But there is strong reason to expect the bound to be significantly above zero loss. The training data for LLMs contains lots of aleatory randomness, things that are fundamentally conceptually unpredictable. I think it’s likely that things like RAND’s random number book are in the training data, and it’s fundamentally impossible to predict randomness. I think something similar is generally true for many other things - predicting world choice for semantically equivalent words, predicting where typos occur, etc.

Aside from being bound well above zero, there's a strong reason to expect that scaling is required to reduce loss for some tasks. In fact, it’s mathematically guaranteed to require significant computation to get near that level for many tasks that are in the training data. Eliezer pointed out that GPTs are predictors [LW · GW], and gives the example of a list of numbers followed by their two prime factors. It’s easy to generate such a list by picking pairs of primes and multiplying them, the writing the answer first - but decreasing loss for generating the next token to predict the primes from the product is definitionally going to require exponentially more computation to perform better for larger primes.

And I don't think this is the exception, I think it's at least often the rule. The training data for LLMs contains lots of data where the order of the input doesn’t follow the computational order of building that input. When I write an essay, I sometimes arrive at conclusions and then edit the beginning to make sense. When I write code, the functions placed earlier often don’t make sense until you see how they get used later. Mathematical proofs are another example where this would often be true.

An obvious response is that we’ve been using exponentially more compute for better accomplishing tasks that aren’t impossible in this way - but I’m unsure if that is true. Benchmarks keep getting saturated, and there’s no natural scale for intelligence. So I’m left wondering whether there’s any actual content in the “Scaling Paradox.”

(Edit: now also posted to my substack.)

Replies from: abramdemski

↑ comment by abramdemski · 2025-01-17T18:04:31.959Z · LW(p) · GW(p)

One problem is that log-loss is not tied that closely to the types of intelligence that we care about. Extremely low log-loss necessarily implies extremely high ability to mimic a broad variety of patterns in the world, but that's sort of all you get. Moderate improvements in log-loss may or may not translate to capabilities of interest, and even when they do, the story connecting log-loss numbers to capabilities we care about is not obvious. (EG, what log-loss translates to the ability to do innovative research in neuroscience? How could you know before you got there?)

When there were rampant rumors about an AI slowdown in 2024, the speculation in the news articles often mentioned the "scaling laws" but never (in my haphazard reading) made a clear distinction between (a) frontier labs seeing that the scaling laws were violated, IE, improvements in loss are really slowing down, (b) there's a slowdown in the improvements to other metrics, (c) frontier labs are facing a qualitative slowdown, such as a feeling that GPT5 doesn't feel like as big of a jump as GPT4 did. Often these concepts were actively conflated.

Replies from: Davidmanheim

↑ comment by Davidmanheim · 2025-01-18T21:59:27.860Z · LW(p) · GW(p)

Strongly agree. I was making a narrower point, but the metric is clearly different than the goal - if anything it's more surprising that we see so much correlation as we do, given how much it has been optimized.

Davidmanheim's Shortform

Contents

7 comments