Why is the neuron count of the human brain relevant to AI timelines?
post by samuelshadrach (xpostah) · 2024-12-24T05:15:58.839Z · LW · GW · 2 comments
This is a question post.
Contents
  Answers
    Carl Feynman (6)
    Charlie Steiner (4)
  2 comments
My take is that it is irrelevant, so I want to hear opposing viewpoints.
The really simple argument for its irrelevance is that evolution used a lot more compute to produce human brains than the compute inside a single human brain. If you are making an argument about how much compute it takes to find an intelligent mind, you have to look at how much compute was used by all of evolution. (This includes the compute to simulate the environment, which Ajeya Cotra's bioanchors report wrongly ignores.)
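For concreteness, a minimal back-of-the-envelope sketch of the gap (assuming the commonly cited ballpark anchors of roughly 1e24 FLOP for a lifetime of brain compute and roughly 1e41 FLOP for all of evolution; these are loose assumptions, and neither includes environment-simulation compute):

```python
import math

# Ballpark anchors only (assumed figures in the spirit of the bioanchors report;
# the exact values are heavily debated, and neither number includes the compute
# needed to simulate the environment).
lifetime_anchor_flop = 1e24    # ~1e15 FLOP/s of brain-equivalent compute over ~1e9 s of childhood
evolution_anchor_flop = 1e41   # total neural computation across evolutionary history

ratio = evolution_anchor_flop / lifetime_anchor_flop
print(f"evolution / lifetime ~ 10^{math.log10(ratio):.0f}")   # ~10^17
```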
What am I missing?
Answers
This is a great question!
Point one:
The computational capacity of the brain used to matter much more than it matters now. The AIs we have now are near-human or superhuman at many skills, and we can measure how skill capacity varies with resources in the near-human range. We can debate and extrapolate and argue with real data.
But we spent decades where the only intelligent system we had was the human brain, so it was the only anchor we had for timelines. So even though it’s very hard to make good estimates from, we had to use it.
Point two:
Most information that gives rise to the human mind is learned, not evolved.
The information encoded by evolution is less than a hundred megabytes. It's limited by the size of the genome (about 1 gigabyte). Moreover, we know that much of the genome is unimportant for mental development. About 40% is parasitic (viruses and transposons). Much of the remaining DNA is not under evolutionary control, varying randomly between individuals. Of the genes that are expressed at all, only about a quarter appear to be expressed in the brain. And some of them encode things AI doesn't need, like the high-reliability plumbing of the circle of Willis, or the mysteries of love, or the biochemical pickiness of the blood-brain barrier, or wanting to pee when you hear running water. So the "program" contributed by evolution is no more than the size of a largish program like a compiler. (I would claim it's probably even less. I think the important instincts plus the learning algorithms are only a few thousand lines of code. But that's debatable.)
On the other hand, the amount learned in a lifetime is on the order of one or a few gigabytes.
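A minimal sketch of the arithmetic behind this estimate (the fractions are loose assumptions paraphrasing the figures above, not measurements):

```python
# Rough sketch of the "evolution contributes well under ~100 MB" estimate above.
# All fractions are loose assumptions echoing this answer, not measured values.
base_pairs = 3.2e9                  # human genome length
bits_per_base = 2                   # 4 possible bases -> 2 bits each
genome_bytes = base_pairs * bits_per_base / 8        # ~0.8 GB raw

non_parasitic = 0.6                 # ~40% is viruses and transposons
brain_expressed = 0.25              # ~a quarter of expressed genes show up in the brain

upper_bound_bytes = genome_bytes * non_parasitic * brain_expressed
print(f"raw genome: ~{genome_bytes / 1e9:.1f} GB")
print(f"loose upper bound on the 'brain program': ~{upper_bound_bytes / 1e6:.0f} MB")
# Further discounts (DNA not under selection, non-mental functions) push this
# down toward the "largish program like a compiler" scale claimed above.
```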
Point three:
Most of the information accumulated by evolution has been destroyed. All of the information accumulated in a species is lost when that species goes extinct. And most species have gone extinct, leaving no descendants. The world of the Permian period was (as far as we know) just as busy as today, with hundreds of thousands of animal species. Just one of those species, a little burrowing critter among many other types of little burrowing critters, was the ancestor of all mammals. All the other little burrowing critters lost out. All their evolutionary innovations have been lost.
This doesn’t apply to species with horizontal transmission of genes, like bacteria. But it applies to animals, who are the only creatures with brains.
In the strongest sense, neither the human brain analogy nor the evolution analogy really applies to AI. They only apply in a weaker sense where you are aware you're working with an analogy, and should hopefully be tracking some more detailed model behind the scenes.
The best argument to consider human development a stronger analogy than evolutionary history is that present-day AIs work more like human brains than they do like evolution. See e.g. papers finding that you can use a linear function to translate some concepts between brain scans and internal layers in an LLM, or the extremely close correspondence between ConvNet features and neurons in the visual cortex. In contrast, I predict it's extremely unlikely that you'll be able to find a nontrivial correspondence between the internals of AI and evolutionary history or the trajectory of ecosystems or similar.
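To make the first kind of result concrete, here is a hypothetical sketch of the linear-mapping methodology such papers use: fit a ridge regression from one LLM layer's activations to brain responses. The data below is random placeholder arrays and every variable name is made up for illustration; it shows the shape of the analysis, not any actual result:

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

# Placeholder data standing in for real recordings: in the actual studies,
# brain responses (fMRI/ECoG) and LLM hidden states are aligned to the same text.
rng = np.random.default_rng(0)
n_tokens, llm_dim, n_voxels = 500, 768, 100
llm_hidden_states = rng.normal(size=(n_tokens, llm_dim))    # one LLM layer's activations
true_map = rng.normal(size=(llm_dim, n_voxels)) * 0.05      # fake ground-truth linear relation
brain_responses = llm_hidden_states @ true_map + rng.normal(scale=0.5, size=(n_tokens, n_voxels))

# Fit a linear "encoding model" and check how well it generalizes to held-out data.
X_train, X_test, y_train, y_test = train_test_split(
    llm_hidden_states, brain_responses, test_size=0.2, random_state=0)
encoder = Ridge(alpha=10.0).fit(X_train, y_train)
print(f"held-out R^2 of the linear map: {encoder.score(X_test, y_test):.2f}")
```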
Of course, just because they work more like human brains after training doesn't necessarily mean they learn similarly - and they don't learn similarly! In some ways AI's better (backpropagation is great, but it's basically impossible to implement in a brain), in other ways AI's worse (biological neurons are way smarter than artificial 'neurons'). Don't take the analogy too literally. But most of the human brain (the neocortex) already learns its 'weights' from experience over a human lifetime, in a way that's not all that different from self-supervised learning if you squint.
↑ comment by samuelshadrach (xpostah) · 2024-12-25T10:56:02.917Z · LW(p) · GW(p)
> See e.g. papers finding that you can use a linear function to translate some concepts between brain scans and internal layers in an LLM, or the extremely close correspondence between ConvNet features and neurons in the visual cortex.
I would love links to these if you have time.
But also, let's say it's true that there is similarity in the internal structure of the end results: the adult human brain and the trained LLM. The adult human brain was produced by evolution plus learning after birth. The trained LLM was produced by gradient descent. This does not tell me that evolution doesn't matter and only learning after birth does.
> But most of the human brain (the neocortex) already learns its 'weights' from experience over a human lifetime, in a way that's not all that different from self-supervised learning if you squint.
The difference is that the weights are not initialised with random values at birth (or at the embryo stage, to be more precise).
> They only apply in a weaker sense where you are aware you're working with an analogy, and should hopefully be tracking some more detailed model behind the scenes.
What do you mean by weaker sense? I say irrelevant and you say weaker sense, so we're not yet in agreement then. How much predictive power does this analogy have for you personally?
Replies from: Charlie Steiner
↑ comment by Charlie Steiner · 2024-12-25T15:31:44.859Z · LW(p) · GW(p)
Some survey articles:
https://arxiv.org/abs/2306.05126
https://arxiv.org/pdf/2001.07092
> The difference is that the weights are not initialised with random values at birth (or at the embryo stage, to be more precise).
The human cortex (the part we have way more of than chimps) is initialized to be made of a bunch of cortical column units, with slowly varying properties over the surface of the brain. But there's decent evidence that there's not much more initialization than that, and that this huge fraction of the brain has to slowly pick up knowledge within the human lifetime before it starts being useful, e.g. https://pmc.ncbi.nlm.nih.gov/articles/PMC9957955/
Or you could think about it like this: our DNA has on the order of a megabyte to spend on the brain, and the adult brain has on the order of a terabyte of information. So 99.99[..]% of the information in the adult brain comes from the learning algorithm, not the initialization.
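As a trivial sketch of that ratio (using the order-of-magnitude figures above, not measurements):

```python
dna_brain_budget_bytes = 1e6     # ~a megabyte of genome "spent" on the brain (order of magnitude)
adult_brain_info_bytes = 1e12    # ~a terabyte of learned information (order of magnitude)

learned_fraction = 1 - dna_brain_budget_bytes / adult_brain_info_bytes
print(f"fraction from learning: {learned_fraction:.6%}")   # 99.999900%
```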
> How much predictive power does this analogy have for you personally?
Yeah, it's way more informative than the evolution analogy to me, because I expect human researchers + computers spending resources designing AI to be pretty hard to analogize to evolution, but learning within an AI to be within a few orders of magnitude of learning within a brain's lifetime on various resources.
Replies from: xpostah
↑ comment by samuelshadrach (xpostah) · 2024-12-25T16:39:16.401Z · LW(p) · GW(p)
Thanks for the links. I might go through them when I find time.
Even if the papers prove that there are similarities, I don't see how this proves anything about evolution versus within-lifetime learning.
> But there's decent evidence that there's not much more initialization than that, and that this huge fraction of the brain has to slowly pick up knowledge within the human lifetime before it starts being useful, e.g. https://pmc.ncbi.nlm.nih.gov/articles/PMC9957955/
This seems like your strongest argument. I will have to study more to understand this.
> our DNA has on the order of a megabyte to spend on the brain
That's it? Really? That is new information for me.
Tbh your arguments might end up being persuasive to me, so thank you for writing them.
The problem is that building a background in neuroscience, to the point where I'm confident I'm not being fooled, will take time. And I'm interested in neuroscience, but not that interested in studying it just for AI safety reasons. If you have a post that covers this argument well (around initialisation not storing a lot of information), it'll be nice. (But not necessary of course; that's up to you.)
2 comments
Comments sorted by top scores.
comment by notfnofn · 2024-12-24T12:04:20.075Z · LW(p) · GW(p)
> If you are making an argument about how much compute it takes to find an intelligent mind, you have to look at how much compute was used by all of evolution.
Just to make sure I fully understand your argument, is this paraphrase correct?
"Suppose we have the compute theoretically required to simulate the human brain down to an adequate granularity for obtaining its intelligence (which might be at the level of cells instead of, say, the atomic level). Even so, one has to consider the compute required to actually build such a simulation, which could be much larger as the human brain was built by the full universe."
(My personal view is that the opposite direction is true: it seems, with recent evidence, that we can Pareto-exceed human intelligence while being very far from the compute required to simulate a brain. An idea I've seen floating around here is that natural selection built our brain randomly with a reward function that valued producing offspring, so there is a lot of architecture that is irrelevant to intelligence.)
Replies from: xpostah
↑ comment by samuelshadrach (xpostah) · 2024-12-25T11:02:07.538Z · LW(p) · GW(p)
Yes, your paraphrase is not bad. I think we can assume things outside of Earth don't need to be simulated; it would be surprising to me if events outside of Earth made the difference between evolution producing Homo sapiens versus some other, less intelligent species. (Maybe a few basic things like the temperature of the Earth shifting slowly.) For the most part, the Earth is causally isolated from the rest of the universe.
Now, which parts of the Earth we can safely omit simulating is a harder question, as there are more causal interactions going on. I can make some guesses about parts of the Earth's environment that could be ignored by the simulation, but they'll be guesses only.
> An idea I've seen floating around here is that natural selection built our brain randomly with a reward function that valued producing offspring, so there is a lot of architecture that is irrelevant to intelligence
Yes, gradient descent is likely a faster search algorithm, but IMO you're still using it to search the big search space that evolution searched through, not the smaller one a human brain searches through after being born.