# Human instincts, symbol grounding, and the blank-slate neocortex

post by Steven Byrnes (steve2152) · 2019-10-02T12:06:35.361Z · LW · GW · 23 comments

## Contents

  Intro: What is Common Cortical Algorithm (CCA) theory, and why does it matter for AGI?
Should we believe CCA theory?
CCA theory vs human-universal traits and instincts
List of ways that human-universal instincts and behaviors can exist despite CCA theory
Mechanism 1: Simple hardcoded connections, not implemented in the neocortex
Mechanism 2: Subcortex-supervised learning
Mechanism 3: Same learning algorithm + same world = same internal model
Mechanism 4: Human-universal memes
Mechanism 5: "Two-process theory"
Mechanism 6: Time-windows
Mechanism 7 (speculative): Empathetic grounding of intuitive psychology
The special case of language
Conclusion


# Intro: What is Common Cortical Algorithm (CCA) theory, and why does it matter for AGI?

As I discussed at Jeff Hawkins on neuromorphic AGI within 20 years [LW · GW], and was earlier discussed on LessWrong at The brain as a universal learning machine [LW · GW], there is a theory, due originally to Vernon Mountcastle in the 1970s, that the neocortex[1] (75% of the human brain by weight) consists of ~150,000 interconnected copies of a little module, the "cortical column", each of which implements the same algorithm. Following Jeff Hawkins, I'll call this the "common cortical algorithm" (CCA) theory. (I don't think that terminology is standard.)

So instead of saying that the human brain has a vision processing algorithm, motor control algorithm, language algorithm, planning algorithm, and so on, in CCA theory we say that (to a first approximation) we have a massive amount of "general-purpose neocortical tissue", and if you dump visual information into that tissue, it does visual processing, and if you connect that tissue to motor control pathways, it does motor control, etc.

Whether and to what extent CCA theory is true is, I think, very important for AGI forecasting, strategy, and both technical and non-technical safety research directions; see my answer here [LW(p) · GW(p)] for more details.

## Should we believe CCA theory?

CCA theory, as I'm using the term, is a simplified model. There are almost definitely a couple caveats to it:

1. There are sorta "hyperparameters" on the generic learning algorithm which seem to be set differently in different parts of the neocortex. For example, some areas of the cortex have higher or lower density of particular neuron types. There are other examples too.[2] I don't think this significantly undermines the usefulness or correctness of CCA theory, as long as these changes really are akin to hyperparameters, as opposed to specifying fundamentally different algorithms. So my reading of the evidence is that if you put, say, motor nerves coming out of visual cortex tissue, the tissue could do motor control, but it wouldn't do it quite as well as the motor cortex does.[3]
2. There is almost definitely a gross wiring diagram hardcoded in the genome—i.e., a set of connections between different neocortical regions, and between the neocortex and other parts of the brain. These connections later get refined and edited during learning. Again, we can ask how much the existence of this innate gross wiring diagram undermines CCA theory. How complicated is the wiring diagram? Is it millions of connections among thousands of tiny regions, or just tens of connections among a few regions? Would the brain work at all if you started with a random wiring diagram? I don't know for sure, but for various reasons, my current belief is that this initial gross wiring diagram is not carrying much of the weight of human intelligence, and thus that this point is not a significant problem for the usefulness of CCA theory. (This is a loose statement; of course it depends on what questions you're asking.) I think of it more like: if it's biologically important to learn a concept space that's built out of associations between information sources X, Y, and Z, well, you just dump those three information streams into the same part of the cortex, and then the CCA will take it from there, and it will reliably build this concept space. So once you have the CCA nailed down, it kinda feels to me like you're most of the way there....[4]
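To make caveat 2 concrete, here's a toy sketch (entirely my own framing, with invented names): the genome specifies only a routing table saying which input streams feed which region, and the identical learning algorithm does the rest everywhere.

```python
# Toy illustration of "innate gross wiring diagram + common algorithm":
# the wiring is just a routing table; the intelligence lives in the
# (identical) per-region learning step.

class CorticalRegion:
    """Every region runs the same algorithm: model whatever arrives."""
    def __init__(self, input_streams):
        self.input_streams = input_streams  # set by the innate wiring diagram

    def learn(self):
        # Build a concept space out of associations between its inputs.
        return "concept space over {" + ", ".join(self.input_streams) + "}"

# The genetically-hardcoded part is only this routing table:
wiring = {
    "region_A": ["vision", "touch"],
    "region_B": ["audition", "motor feedback"],
}
regions = {name: CorticalRegion(streams) for name, streams in wiring.items()}
print(regions["region_A"].learn())  # concept space over {vision, touch}
```

On this picture, swapping which streams go where changes what a region ends up modeling, but not how it models it.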

Going beyond these caveats, I found pretty helpful literature reviews on both sides of the issue:

• The experimental evidence for CCA theory: see chapter 5 of Rethinking Innateness (1996)
• The experimental evidence against CCA theory: see chapter 5 of The Blank Slate by Steven Pinker (2002).

I won't go through the debate here, but after reading both of those I wound up feeling that CCA theory (with the caveats above) is probably right, though not 100% proven. Please comment if you've seen any other good references on this topic, especially more up-to-date ones.

(Update: I found another reference on CCA; see Gary Marcus vs Cortical Uniformity [LW · GW].)

CCA theory does not mean "no inductive biases"—of course there are inductive biases! It means that the inductive biases are sufficiently general and low-level that they work equally well for extremely diverse domains such as language, vision, motor control, planning, math homework, and so on. I typically think that the inductive biases are at a very low level, things like "we should model inputs using a certain type of data structure involving temporal sequences and spatial relations", and not higher-level semantic knowledge like intuitive biology or "when is it appropriate to feel guilty?" or tool use etc. (I don't even think object permanence or intuitive psychology are built into the neocortex; I think they're learned in early infancy. This is controversial and I won't try to justify it here. Well, intuitive psychology is a complicated case, see below.)

Anyway, that brings us to...

# CCA theory vs human-universal traits and instincts

The main topic for this post is:

If Common Cortical Algorithm theory is true, then how do we account for all the human-universal instincts and behaviors that evolutionary psychologists talk about?

Indeed, we know that there is a diverse set of remarkably specific human instincts and mental behaviors evolved by natural selection. Again, Steven Pinker's The Blank Slate is a popularization of this argument; it ends with Donald E. Brown's giant list of "human universals", i.e. behaviors that are observed in every human culture.

Now, 75% of the human brain (by weight) is the neocortex, but the other 25% consists of various subcortical ("old-brain") structures like the amygdala, and these structures are perfectly capable of implementing specific instincts. But these structures do not have access to an intelligent world-model—only the neocortex does! So how can the brain implement instincts that require intelligent understanding? For example, maybe the fact that "Alice got two cookies and I only got one!" is represented in the neocortex as the activation of neural firing pattern 7482943. There's no obvious mechanism to connect this arbitrary, learned pattern to the "That's so unfair!!!" section of the amygdala. The neocortex doesn't know about unfairness, and the amygdala doesn't know about cookies. Quite a conundrum!

(Update much later: Throughout this post, wherever I wrote "amygdala", I should have said "hypothalamus and brainstem". See here [LW · GW] for a better-informed discussion.)

This is really a symbol grounding problem, which is the other reason this post is relevant to AI alignment. When the human genome builds a human, it faces the same problem as a human programmer building an AI: how can one point a goal system at things in the world, when the internal representation of the world is a complicated, idiosyncratic, learned data structure? As we wrestle with the AI goal alignment problem, it's worth studying what human evolution did here.

# List of ways that human-universal instincts and behaviors can exist despite CCA theory

Finally, the main part of this post. I don't know a complete answer, but here are some of the categories I've read about or thought of, and please comment on things I've left out or gotten wrong!

## Mechanism 1: Simple hardcoded connections, not implemented in the neocortex

Example: Enjoying the taste of sweet things. This one is easy. I believe the nerve signals coming out of taste buds branch, with one branch going to the cortex to be integrated into the world model, and another branch going to subcortical regions. So the genes merely have to wire up the sweetness taste buds to the good-feelings subcortical regions.

## Mechanism 2: Subcortex-supervised learning

Example: Wanting to eat chocolate. This is different than the previous item because "sweet taste" refers to a specific innate physiological thing, whereas "chocolate" is a learned concept in the neocortex's world-model. So how do we learn to like chocolate? Because when we eat chocolate, we enjoy it (Mechanism 1 above). The neocortex learns to predict a sweet taste upon eating chocolate, and thus paints the world-model concept of chocolate with a "sweet taste" property. The supervisory signal is multidimensional, such that the neocortex can learn to paint concepts with various labels like "painful", "disgusting", "comfortable", etc., and generate appropriate behaviors in response. (Vaguely related: the DeepMind paper Prefrontal cortex as a meta-reinforcement learning system.)
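Here's a minimal sketch of how I'm picturing this (my gloss, not an established model; the signal names are invented): the subcortex emits innate, multidimensional "ground truth" labels when events occur, and the cortex learns to paint its learned concepts with those labels.

```python
# Toy sketch of subcortex-supervised learning: hardwired subcortical
# channels supervise the labels that the cortex attaches to its
# learned concepts.

INNATE_SIGNALS = ["sweet", "painful", "disgusting"]  # hardwired channels

class Cortex:
    def __init__(self):
        # learned concept -> predicted label strengths (all start at 0)
        self.labels = {}

    def experience(self, concept, subcortical_signal, lr=0.5):
        """Nudge the concept's labels toward the subcortical ground truth."""
        pred = self.labels.setdefault(concept, {s: 0.0 for s in INNATE_SIGNALS})
        for s in INNATE_SIGNALS:
            pred[s] += lr * (subcortical_signal.get(s, 0.0) - pred[s])

    def predict(self, concept):
        return self.labels.get(concept, {s: 0.0 for s in INNATE_SIGNALS})

cortex = Cortex()
# Eating chocolate repeatedly co-occurs with the innate "sweet" signal...
for _ in range(10):
    cortex.experience("chocolate", {"sweet": 1.0})
# ...so the learned concept "chocolate" ends up painted with "sweet".
print(cortex.predict("chocolate")["sweet"])  # close to 1.0
```

Note that "chocolate" never appears in the innate machinery; the genome only had to specify the supervisory channels.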

## Mechanism 3: Same learning algorithm + same world = same internal model

Possible example: Intuitive biology. In The Blank Slate you can find a discussion of intuitive biology / essentialism, which "begins with the concept of an invisible essence residing in living things, which gives them their form and powers." Thus preschoolers will say that a dog altered to look like a cat is still a dog, yet a wooden toy boat cut into the shape of a toy car has in fact become a toy car. I think we can account for this very well by saying that everyone's neocortex has the same learning algorithm, and when they look at plants and animals they observe the same kinds of things, so we shouldn't be surprised that they wind up forming similar internal models and representations. I found a paper that tries to spell out how this works in more detail; I don't know if it's right, but it's interesting: free link, official link.

## Mechanism 4: Human-universal memes

Example: Fire. I think this is pretty self-explanatory. People learn about fire from each other. No need to talk about neurons, beyond the more general issues of language and social learning discussed below.

## Mechanism 5: "Two-process theory"

Possible example: Innate interest in human faces.[5] The subcortex-supervised learning mechanism above (Mechanism 2) can be thought of more broadly as an interaction between a hardwired subcortical system that creates a "ground truth", and a cortical learning algorithm that then learns to relate that ground truth to its complex internal representations. Here, Johnson's "two-process theory" for faces fits this same mold, but with a more complicated subcortical system for ground truth. In this theory, a subcortical system (ETA: specifically, the superior colliculus[6]) gets direct access to a low-resolution version of the visual field, and looks for a pattern with three blobs in locations corresponding to the eyes and mouth of a blurry face. When it finds such a pattern, it passes information to the cortex that this is a very important thing to attend to, and over time the cortex learns what faces actually look like (and suppresses the original subcortical template circuitry). Anyway, Johnson came up with this theory partly based on the observation that newborns are equally entranced by pictures of three blobs versus actual faces (each of which were much more interesting than other patterns), but after a few months the babies were more interested in actual face pictures than the three-blob pictures. (Not sure what Johnson would make of this twitter account.)
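A hedged sketch of the two-process idea as I understand it (all the details here are my invention, not Johnson's): a fixed subcortical detector flags coarse face-like patterns and tells the cortex to attend, which supervises a cortical learner that gradually takes over.

```python
# Toy two-process model: hardwired three-blob detector (process 1)
# bootstraps a cortical face learner (process 2).

def subcortical_detector(stimulus):
    """Hardwired: fires on any coarse three-blob arrangement."""
    return 1.0 if stimulus["three_blobs"] else 0.0

class CorticalFaceLearner:
    def __init__(self):
        self.weight_real_face = 0.0   # learned preference for real faces

    def learn(self, stimulus, attention_signal, lr=0.1):
        # Attention from the subcortex gates learning; with experience,
        # the cortex keys in on fine detail that only real faces have.
        if attention_signal > 0 and stimulus["real_face"]:
            self.weight_real_face += lr * (1.0 - self.weight_real_face)

    def response(self, stimulus):
        return self.weight_real_face if stimulus["real_face"] else 0.0

blobs = {"three_blobs": True, "real_face": False}
face  = {"three_blobs": True, "real_face": True}

cortex = CorticalFaceLearner()
# Newborn: the subcortical template treats blobs and real faces alike.
print(subcortical_detector(blobs), subcortical_detector(face))  # 1.0 1.0
# Months of experience: the cortex learns from attended real faces...
for _ in range(50):
    cortex.learn(face, subcortical_detector(face))
# ...and now prefers real faces over the blob template.
print(cortex.response(face) > cortex.response(blobs))  # True
```

This matches the developmental pattern Johnson observed: blobs and faces start out equally captivating, and real faces win later.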

(Other possible examples of instincts formed by two-process theory: fear of snakes, interest in human speech sounds, sexual attraction.)

(Update: See my later post Inner alignment in the brain [LW · GW] for a more fleshed-out discussion of this mechanism.)

## Mechanism 6: Time-windows

Examples: Filial imprinting in animals, incest repulsion (Westermarck effect) in humans. Filial imprinting is a famous result where newborn chicks (and many other species) form a permanent attachment to the most conspicuous moving object that they see in a certain period shortly after hatching. In nature, they always imprint on their mother, but in lab experiments, chicks can be made to imprint on a person, or even a box. As with other mechanisms here, time-windows provide a nice solution to the symbol grounding problem, in that the genes don't need to know what precise collection of neurons corresponds to "mother"; they only need to set up a time window and a way to point to "conspicuous moving objects", which is presumably easier. The brain mechanism of filial imprinting has been studied in detail for chicks, and consists of the combination of time-windows plus the two-process model (mechanism 5 above). In fact, I think the two-process model was proven in chick brains before it was postulated in human brains.

There likewise seem to be various time-window effects in people, such as the Westermarck effect, a sexual repulsion between two people raised together as young children (an instinct which presumably evolved to reduce incest).
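The time-window trick above can be sketched in a few lines (my own illustration; the window length is arbitrary): plasticity toward "conspicuous moving objects" is only enabled during a critical period, and whatever attachment forms is then frozen.

```python
# Toy time-window mechanism: the genome specifies a window and a crude
# pointer ("conspicuous"), not the identity of the attachment target.

class Chick:
    WINDOW = 5  # plasticity only during the first 5 time steps (arbitrary)

    def __init__(self):
        self.t = 0
        self.imprinted_on = None

    def observe(self, obj, conspicuous):
        if self.t < self.WINDOW and conspicuous and self.imprinted_on is None:
            self.imprinted_on = obj  # attachment forms and is locked in
        self.t += 1

chick = Chick()
chick.observe("cardboard box", conspicuous=True)   # inside the window
for _ in range(10):
    chick.observe("mother hen", conspicuous=True)  # too late: already locked
print(chick.imprinted_on)  # cardboard box
```

In nature the first conspicuous moving object is reliably the mother, so the crude pointer plus the window is enough; the lab "imprint on a box" result falls out of the same logic.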

## Mechanism 7 (speculative): Empathetic grounding of intuitive psychology

Possible example: Social emotions (gratitude, sympathy, guilt,...) Again, the problem is that the neocortex is the only place with enough information to, say, decide when someone slighted you, so there's no "ground truth" to use for subcortex-supervised learning. At first I was thinking that the two-process model for human faces and speech could be playing a role, but as far as I know, deaf-blind people have the normal suite of social emotions, so that's not it either. I looked in the literature a bit and couldn't find anything helpful. So, I made up this possible mechanism (warning: wild speculation).

Step 1 is that a baby's neocortex builds a "predicting my own emotions" model using normal subcortex-supervised learning (Mechanism 2 above). Then a normal Hebbian learning mechanism makes two-way connections between the relevant subcortical structures (amygdala) and the cortical neurons involved in this predictive model.

Step 2 is that the neocortex's universal learning algorithm will, in the normal course of development, naturally discover that this same "predicting my own emotions" model from step 1 can be reused to predict other people's emotions (cf. Mechanism 3 above), forming the basis for intuitive psychology. Now, because of those connections-to-the-amygdala mentioned in step 1, the amygdala is incidentally getting signals from the neocortex when the latter predicts that someone else is angry, for example.

Step 3 is that the amygdala (and/or neocortex) somehow learns the difference between the intuitive psychology model running in first-person mode versus empathetic mode, and can thus generate appropriate reactions, with one pathway for "being angry" and a different pathway for "knowing that someone else is angry".

So let's now return to my cookie puzzle above. Alice gets two cookies and I only get one. How can I feel it's unfair, given that the neocortex doesn't have a built-in notion of unfairness, and the amygdala doesn't know what cookies are? The answer would be: thanks to subcortex-supervised learning, the amygdala gets a message that one yummy cookie is coming, but the neocortex also thinks "Alice is even happier", and that thought also recruits the amygdala, since intuitive psychology is built on empathetic modeling. Now the amygdala knows that I'm gonna get something good, but that Alice is gonna get something even better, and that combination (in the current emotional context) triggers the amygdala to send out waves of jealousy and indignation. This is then a new supervisory signal for the neocortex, which allows the neocortex to gradually develop a model of fairness, which in turn feeds back into the intuitive psychology module, and thereby back to the amygdala, allowing the amygdala to execute more complicated innate emotional responses in the future, and so on.
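The cookie story can be caricatured in code (wildly speculative, like the mechanism it illustrates; every name here is invented): one learned emotion-prediction model runs in first-person mode and in empathetic mode, and the amygdala reacts to the comparison between the two signals.

```python
# Toy version of empathetic grounding: the amygdala never sees
# "cookies"; it only sees emotion signals, tagged by mode.

def emotion_model(cookies):
    """Learned cortical model: predicted happiness from cookies."""
    return float(cookies)

def amygdala(my_outcome, their_outcome):
    # Innate response to the comparison of the two mode-tagged signals.
    if their_outcome > my_outcome:
        return "indignation"
    return "contentment"

me    = emotion_model(cookies=1)   # first-person mode
alice = emotion_model(cookies=2)   # same model, run in empathetic mode
print(amygdala(me, alice))  # indignation
```

The point of the sketch is just that the amygdala can trigger a "fairness" reaction without ever representing cookies, because the neocortex's empathetic simulation delivers the relevant comparison pre-digested.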

(Update: See my later post Inner alignment in the brain [LW · GW] for a slightly more fleshed-out discussion of this mechanism.)

## The special case of language

It's tempting to put language in the category of memes (mechanism 4 above)—we do generally learn language from each other—but it's not really a meme, because apparently groups of kids can invent grammatical languages from scratch (e.g. Nicaraguan Sign Language). My current guess is that it combines three things: (1) A two-process mechanism (Mechanism 5 above) that makes people highly attentive to human speech sounds. (2) Possibly "hyperparameter tuning" in the language-learning areas of the cortex, e.g. maybe to support taller compositional hierarchies than would be required elsewhere in the cortex. (3) The fact that language can sculpt itself to the common cortical algorithm rather than the other way around—i.e., maybe "grammatical language" is just another word for "a language that conforms to the types of representations and data structures that are natively supported by the common cortical algorithm".

By the way, lots of people (including Steven Pinker) seem to argue that language processing is a fundamentally different and harder task than, say, visual processing, because language requires symbolic representations, composition, recursion, etc. I don't understand this argument; I think vision processing needs the exact same things! I don't see a fundamental difference between the visual-processing system knowing that "this sheet of paper is part of my notebook", and the grammatical "this prepositional phrase is part of this noun phrase". Likewise, I don't see a difference between recognizing a background object interrupted by a foreground occlusion, versus recognizing a noun phrase interrupted by an interjection. It seems to me like a similar set of problems and solutions, which again strengthens my belief in CCA theory.

# Conclusion

When I initially read about CCA theory, I didn't take it too seriously because I didn't see how instincts could be compatible with it. But I now find it pretty likely that there's no fundamental incompatibility. So having removed that obstacle, and also read the literature a bit more, I'm much more inclined to believe that CCA theory is fundamentally correct.

Again, I'm learning as I go, and in some cases making things up as I go along. Please share any thoughts and pointers!

1. I'll be talking a lot about the neocortex in this article, but shout-out to the thalamus and hippocampus, the other two primary parts of the brain's predictive-world-model-building-system. I'm just leaving them out for simplicity; this doesn't have any important implications for this article. ↩︎

2. More examples of region-to-region variation in the neocortex that are (plausibly) genetically-coded: (1) Spindle neurons only exist in a couple specific parts of the neocortex. I don't really know what's the deal with those. Kurzweil claims they're important for social emotions and empathy, if I recall correctly. Hmmm. (2) "Sensitive windows" (see Dehaene): Low-level sensory processing areas more-or-less lock themselves down to prevent further learning very early in life, and certain language-processing areas lock themselves down somewhat later, and high-level conceptual areas don't ever lock themselves down at all (at least, not as completely). I bet that's genetically hardwired. I guess psychedelics can undermine this lock-down mechanism? ↩︎

3. I have heard that the primary motor cortex is not the only part of the neocortex that emits motor commands, but don't know the details. ↩︎

4. Also, people who lose various parts of the neocortex are often capable of full recovery, if it happens early enough in infancy, which suggests to me that the CCA's wiring-via-learning capability is doing most of the work, and maybe the innate wiring diagram is mostly just getting things set up more quickly and reliably. ↩︎

5. See Rethinking Innateness p116, or better yet Johnson's article. ↩︎

6. See, for example, Fast Detector/First Responder: Interactions between the Superior Colliculus-Pulvinar Pathway and Stimuli Relevant to Primates. Also, let us pause and reflect on the fact that humans have two different visual processing systems! Pretty cool! The most famous consequence is blindsight, a condition where the subconscious midbrain vision processing system (superior colliculus) is intact but the conscious neocortical visual processing system is not working. This study proves that blindsighted people can recognize not just faces but specific facial expressions. I strongly suspect blindsighted people would react to snakes and spiders too, but can't find any good studies (that study in the previous sentence regrettably used stationary pictures of spiders and snakes, not videos of them scampering and slithering). ↩︎

comment by Charlie Steiner · 2020-06-02T22:55:53.236Z · LW(p) · GW(p)

This was just on my front page for me, for some reason. So, it occurs to me that the example of the evolved FPGA is precisely the nightmare scenario for the CCA hypothesis.

If neurons behave according to simple rules during growth and development, and there are only smooth modulations of chemical signals during development, then nevertheless you might get regions of the cortex that look very similar, but whose cells are exploiting the hardly-noticeable FPGA-style quirks of physics in different ways. You'd have to detect the difference by luckily choosing the right sort of computational property to measure.

Replies from: steve2152
comment by Steven Byrnes (steve2152) · 2020-06-03T01:12:57.333Z · LW(p) · GW(p)

Thanks for the comment! When I think about it now (8 months later), I have three reasons for continuing to think CCA is broadly right:

1. Cytoarchitectural (quasi-) uniformity. I agree that this doesn't definitively prove anything by itself, but it's highly suggestive. If different parts of the cortex were doing systematically very different computations, well maybe they would start out looking similar when the differentiation first started to arise millions of years ago, but over evolutionary time you would expect them to gradually diverge into superficially-obviously-different endpoints that are more appropriate to their different functions.

2. Narrowness of the target, sorta. Let's say there's a module that takes specific categories of inputs (feedforward, feedback, reward, prediction-error flags) and has certain types of outputs, and it systematically learns to predict the feedforward input and control the outputs according to generative models following this kind of selection criterion [LW · GW] (or something like that). This is a very specific and very useful thing. Whatever the reward signal is, this module will construct a theory about what causes that reward signal and make plans to increase it. And this kind of module automatically tiles—you can connect multiple modules and they'll be able to work together to build more complex composite generative models integrating more inputs to make better reward predictions and better plans. I feel like you can't just shove some other computation into this system and have it work—it's either part of this coordinated prediction-and-action mechanism, or not (in which case the coordinated prediction-and-action mechanism will learn to predict it and/or control it, just like it does for the motor plant etc.). Anyway, it's possible that some part of the neocortex is doing a different sort of computation, and not part of the prediction-and-action mechanism. But if so, I would just shrug and say "maybe it's technically part of the neocortex, but when I say "neocortex", I'm using the term loosely and excluding that particular part." After all, I am not an anatomical purist; I am already including part of the thalamus when I say "neocortex" for example (I have a footnote in the article apologizing for that). Sorry if this description is a bit incoherent, I need to think about how to articulate this better.

3. Although it's probably just the Dunning-Kruger talking, I do think I at least vaguely understand what the algorithm is doing and how it works, and I feel like I can concretely see how it explains everything about human intelligence including causality, counterfactuals, hierarchical planning, task-switching, deliberation, analogies, concepts, etc. etc.

comment by Дмитрий Зеленский (dmitrii-zelenskii) · 2020-03-20T14:27:33.795Z · LW(p) · GW(p)

For me, your examples of why visual perception needs the same things as language, including the time window, are a standard, textbook-level (and often used!) proof that they're both (widely understood) Fodorian modules (in the case of visual processing, two distinct modules indeed, though the labels "conscious" and "subconscious" are strange; I'm used to calling those the "What-path" and "Where-path"), fine-tuned but not fully designed during the time-window, rather than, vice versa, both being handled by a general algorithm like a snowflake.

Now, I understand that Fodorian modules (even when you throw away the old requirement of there being a strictly limited part of the cortex responsible for it) are not that widely held nowadays. However, when I look at people, I cannot help seeing them. From prosopagnosia to specific language impairments, aka aphasias (only two of the six commonly discussed aphasias are really language-based, but the name stuck), to memory disruptions, we see individual modules breaking - including in-born breaking, before fine-tuning! - and just as well we see people whose general intelligence is reasonably low with unusually good performance of some of their modules.

Addendum: "visual" in "visual processing" is, of course, a red herring. It would be better to speak of two perception modules, with input variable (blindborn people fine-tune it to other things, for example - whereas those blinded in adulthood, AFAIK, do not).

Replies from: steve2152
comment by Steven Byrnes (steve2152) · 2020-03-20T19:18:12.999Z · LW(p) · GW(p)

your examples of why visual perception needs the same things as language, including time window, is a standard, textbook-level (and often used!) proof of the fact they're both (widely understood) Fodorian modules

Interesting! Can you point me to a textbook or other reference that makes this argument?

the labels "conscious" and "subconscious" are strange, I'm used to calling those "What-path" and "Where-path"

If you're talking about the dorsal and ventral streams, that's not what I'm talking about. The dorsal stream and ventral stream are both part of the neocortex. I was talking about "vision processing in the neocortex" versus "vision processing in the midbrain (especially superior colliculus)". Does that make sense?

Replies from: dmitrii-zelenskii
comment by Дмитрий Зеленский (dmitrii-zelenskii) · 2020-03-21T11:58:13.529Z · LW(p) · GW(p)

On the second point - I have misunderstood you, now I see what you're talking about. If Fodorian modules' view is right, the neocortex one(s) still isn't (aren't) "conscious". The received wisdom I have says that modules are:

1) Automatic (one cannot consciously change how they work - except by cutting off their input) - hence susceptible to illusions/wrong analyses/...;

2) Autonomous (consciousness only "sees" outputs; a module is a black box for its owner; these two properties are related but distinct - yet something that has both can barely be called "conscious");

3) Inherited with a critical period of fine-tuning (that's basically what you called the time window).

There were some more points but I (obviously) forgot them. And that brings me to your first point: I can't point to a textbook right away but that was part of several courses I was taught (Psychology of cognitive processes at Moscow State University (Fundamental and Applied Linguistics program); Language, Music, and Cognition in NYI 2016 - nyi.spb.ru).

Replies from: steve2152
comment by Steven Byrnes (steve2152) · 2020-03-21T13:09:58.079Z · LW(p) · GW(p)

Thank you! I agree that much of what happens in the neocortex is not conscious, and agree with your 1+2+3. I have edited the wording of that sentence to be less confusing. As for what is and isn't conscious, I like the Global Neuronal Workspace idea [LW · GW]. I use that a lot when thinking about the neocortex.

I think I used the term "sensitive windows" in footnote 2 for what you call "critical period of fine-tuning". I was thinking of things like low-level sensory processing, the fact that you can't learn a language in the native accent as an adult, etc. Then I also talked about "time windows" in the context of filial imprinting.

I used two different terms, "sensitive windows" and "time windows", but we may ask: are these two different things, or two examples of the same thing? I'm not sure. I would guess that they're the same mechanism, but the former is locking in "normal" information-processing connections (from neocortex to neocortex), and the latter is locking in specifically connections to dopamine neurons or connections to other hormones or some other type of connection to the limbic system. I'm still trying to understand how those latter connections work...

Replies from: dmitrii-zelenskii
comment by Дмитрий Зеленский (dmitrii-zelenskii) · 2020-03-22T14:01:15.802Z · LW(p) · GW(p)

I would think that the former are the _mechanism_ of the latter - though, as they say, "don't quote me on that".

There is an interesting question of whether, if many things are modules, there is also non-module part, the "general intelligence" part which does not share those properties. Perhaps unsurprisingly, there is no consensus (though my intuitions say there is the GI part).

Also, it seems that different modules might use the same (common) working memory - though this is not set in stone (and depends, in particular, on your analysis of language - if late Chomsky is right, only phonology (PF) and perhaps semantics (LF) are modular, whereas syntax uses our general recursive ability, and this is why it uses general working memory).

Replies from: steve2152
comment by Steven Byrnes (steve2152) · 2020-03-23T01:05:16.496Z · LW(p) · GW(p)

Hmm, interesting!

My thinking about general intelligence is the combination of

• Common Cortical Algorithm—every part of the neocortex is a similarly-constructed general-purpose generative model builder that basically does some combination of self-supervised learning, RL, and Model Predictive Control [LW · GW]

• Specialized Functionality Via The Neocortex's Gross Wiring Diagram—So if you seed a region with particular feedforward and feedback information streams, then it will find patterns and build up the corresponding concept space.

• ...and therefore, that there are lots of different neocortical modules exactly to the extent that there are lots of different neocortical regions, with strict borders between them and few axons crossing those borders. I'm not sure the extent to which this is the case.

• Global Neuronal Workspace (GNW) [LW · GW] connecting diverse parts of the neocortex to each other and to long-term memory.

Also, I'm not convinced that "general working memory" is a thing. My understanding is that we can do things like remember a word by putting it on loop using speech motor control circuits. But I would expect that someone with a lesion to those speech motor control circuits would still be able to, say, hold an image in their head for a short period of time using visual cortex. I think recurrent networks are ubiquitous in the neocortex, and any of those networks can potentially hold an activation at least for a couple seconds, and a few of them for much longer. Or maybe when you're thinking about "general working memory", you're really thinking of the GNW [LW · GW]?

Replies from: dmitrii-zelenskii
comment by Дмитрий Зеленский (dmitrii-zelenskii) · 2020-03-23T14:07:59.474Z · LW(p) · GW(p)

1. "My understanding is that we can do things like remember a word by putting it on loop using speech motor control circuits" - this is called the phonological loop in psycholinguistics (psychology) and is NOT THE SAME as working memory - in fact, tests for working memory usually include reading something aloud precisely to occupy those circuits and prevent the test subject from taking advantage of their phonological loop. What I mean by working memory is the number of things one can hold in mind simultaneously, as captured by the "7±2" work and Daneman's tests - whatever the explanation is.

2. Fodorian modules are, by definition, barely compatible with CCA. And the Zeitgeist of theoretical linguistics leads me to think that when you use an RNN to explain something you're cheating your way to performance instead of explaining what goes on (i.e., to think that the brain ISN'T an RNN or a combination thereof - at least not in an obvious sense). Thus we don't quite share neurological assumptions - though bridging to a common point may well be possible.

Replies from: steve2152
comment by Steven Byrnes (steve2152) · 2020-03-23T14:31:33.654Z · LW(p) · GW(p)

this is called phonological loop

Thank you for clearing up my confusion! :-)

when you use RNN to explain something...

To be clear, I am using the term "recurrent" as a kinda generic term meaning "having a connectivity graph in which there are cycles". That's what I think is ubiquitous in the neocortex. I absolutely do not think that "the kind of RNN that ML practitioners frequently use today" is similar to how the neocortex works. Indeed, I think very few ML practitioners are using algorithms that are fundamentally similar to brain algorithms. (I think Dileep George is one of the exceptions.)

Fodorian modules are, by definition, barely compatible with CCA

...unless the Fodorian modules are each running the same algorithm on different input data, right?

Replies from: dmitrii-zelenskii
comment by Дмитрий Зеленский (dmitrii-zelenskii) · 2020-03-24T00:11:28.185Z · LW(p) · GW(p)

Oh, then sorry about the RNN attack ;)

Well, no. In particular, if you feed the same sound input to the linguistic module (PF) and to the module of (say, initially visual) perception, the very intuition behind Fodorian modules is that they will *not* do the same - PF will try to find linguistic expressions similar to the input, whereas the perception module will try to, well, tell where the sound comes from, how loud it is, and things like that.

comment by G Gordon Worley III (gworley) · 2019-10-02T19:17:20.460Z · LW(p) · GW(p)

I found this post really interesting. My main interest has been in understanding human values, and I've been excited by predictive coding because it possibly offers a way to ground values (good being derived from error minimization, bad from error maximization). The CCA theory could help explain why it seems so much of the brain is "doing the same kind of thing" that could result in predictive coding being a useful model even if it turns out the brain doesn't literally have neurons wired up as control systems that minimize sensory prediction error.

Replies from: abe-dillon
comment by Abe Dillon (abe-dillon) · 2020-04-12T01:21:11.709Z · LW(p) · GW(p)

Hey, G Gordon Worley III!

I just finished reading this post because Steve2152 was one of the two people (you being the other) to comment on my (accidentally published) post on formalizing and justifying the concept of emotions.

It's interesting to hear that you're looking for a foundational grounding of human values because I'm planning a post on that subject as well. I think you're close with the concept of error minimization. My theory reaches back to the origins of life and what sets living systems apart from non-living systems. Living systems are locally anti-entropic which means: 1) According to the second law of thermodynamics, a living system can never be a truly closed system. 2) Life is characterized by a medium that can gather information such as genetic material.

The second law of thermodynamics means that all things decay, so it's not enough to simply gather information, the system must also preserve the information it gathers. This creates an interesting dynamic because gathering information inherently means encountering entropy (the unknown) which is inherently dangerous (what does this red button do?). It's somewhat at odds with the goal of preserving information. You can even see this fundamental dichotomy manifest in the collective intelligence of the human race playing tug-of-war between conservatism (which is fundamentally about stability and preservation of norms) and liberalism (which is fundamentally about seeking progress or new ways to better society).

Another interesting consequence of the 'telos' of life being to gather and preserve information is: it inherently provides a means of assigning value to information. That is: information is more valuable the more it pertains to the goal of gathering and preserving information. If an asteroid were about to hit earth and you were chosen to live on a space colony until Earth's atmosphere allowed humans to return and start society anew, you would probably favor taking a 16 GB thumb drive with the entire English Wikipedia article text over a server rack holding several petabytes of high-definition recordings of all the reality television ever filmed, because the latter won't be super helpful toward the goal of preserving knowledge *relevant* to mankind's survival.

The theory also opens interesting discussions like: if all living things have a common goal, why do things like parasites, conflict, and war exist? Also, how has evolution led to a set of instincts that imperfectly approximate this goal? How do we implement this goal in an intelligent system? How do we guarantee such an implementation will not result in conflict? Etc.

Anyway, I hope you'll read it when I publish it and let me know what you think!

Replies from: gworley
comment by G Gordon Worley III (gworley) · 2020-04-12T19:32:31.648Z · LW(p) · GW(p)

Looking forward to reading it. In the meantime, if you didn't stumble on them already, you might enjoy these posts I wrote as I think they point to some similar things:

comment by Charlie Steiner · 2019-10-05T07:24:14.566Z · LW(p) · GW(p)

This is a really cool post, thanks!

comment by hold_my_fish · 2021-03-21T04:53:42.269Z · LW(p) · GW(p)

I find myself returning to this because the idea of a "common cortical algorithm" is intriguing.

It seems to me that if there is a "common cortical algorithm" then there is also a "common cortical problem" that it solves. I suspect it would be useful to understand what this problem is.

(As an example of why isolating the algorithm and problem could be quite different, consider linear programming. To solve a linear programming problem, you can choose a simplex algorithm or an interior-point method, and these are fundamentally different approaches that are both viable. It's also quite a bit easier to state linear programming as a problem than it is to describe either solution approach.)
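For concreteness, the LP analogy can be run in a few lines — the same easy-to-state problem handed to two fundamentally different algorithm families. This assumes SciPy's `linprog`, whose `highs-ds` and `highs-ipm` methods select a dual-simplex and an interior-point solver respectively:

```python
# Same linear program, two different solution approaches. Requires scipy.
from scipy.optimize import linprog

# minimize  -x - 2y
# subject to x + y <= 4,  x <= 2,  x >= 0, y >= 0   (optimum at x=0, y=4)
c = [-1, -2]
A_ub = [[1, 1], [1, 0]]
b_ub = [4, 2]

simplex = linprog(c, A_ub=A_ub, b_ub=b_ub, method="highs-ds")    # dual simplex
interior = linprog(c, A_ub=A_ub, b_ub=b_ub, method="highs-ipm")  # interior point

print(simplex.fun, interior.fun)  # both -8.0: one problem, two algorithms
```

Stating the problem took three lists; either solver's internals would take pages — which is the asymmetry the comment is pointing at.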

Do you have a view on the most plausible candidates for a "common cortical problem" (CCP)? The tricky aspects that come to mind: not being too narrow (i.e. the CCP should include (almost) everything the CCA can do), not being too broad (i.e. the CCA should be able to solve (almost) every instance of the CCP), and not being too vague (ideally precise enough that you could actually make a benchmark test suite to evaluate proposed solutions).

Replies from: steve2152
comment by Steven Byrnes (steve2152) · 2021-03-23T13:40:28.809Z · LW(p) · GW(p)

Thanks!

The way I'm currently thinking about it is: In everywhere but the frontal lobe, the task is something like

Find a generative model that accurately predicts Input Signal X based on Contextual Information Y.

But it's different X & Y for different parts of the cortex, and there can even be cascades where one region needs to predict the residual prediction error from another region (ref). And there's also a top-down attention mechanism such that not all prediction errors are equally bad.

The frontal lobe is a bit different in that it's choosing what action to take or what thought to think (at least in part). That's not purely a prediction task, because it has more than one right answer. I mean, you can predict that you'll go left, then go left, and that's a correct prediction. Or you can predict that you'll go right, then go right, and that's a correct prediction too! So it's not just predictions; we need reinforcement learning / rewards too. In those cases, the task is "Find a generative model that is making correct predictions AND leading to high rewards," presumably. But I don't think that's really something that the neocortex is doing, per se. I think it's the basal ganglia (BG), which sends outputs to the frontal lobe. I think the BG looks at what the neocortex is doing, calculates a value function (using TD learning, and storing its information in the striatum), and then (loosely speaking) the BG reaches up into the neocortex and fiddles with it, trying to suppress the patterns of activity that it thinks would lead to lower rewards and trying to amplify the patterns of activity that it thinks would lead to higher rewards.
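The TD-learning step attributed here to the basal ganglia can be sketched in a few lines. The environment below (a four-state chain with a reward at the end) is my own illustrative setup, not anything from the post:

```python
# TD(0) value learning on a tiny deterministic chain: states 0..3,
# reward 1.0 on reaching state 3. V plays the role of the value function
# ("stored in the striatum" in the comment's account). Illustrative only.
n_states = 4
gamma = 0.9   # discount factor
alpha = 0.1   # learning rate
V = [0.0] * n_states

for _ in range(2000):          # many episodes of walking down the chain
    s = 0
    while s < n_states - 1:
        s_next = s + 1
        r = 1.0 if s_next == n_states - 1 else 0.0
        # TD(0) update: nudge V(s) toward r + gamma * V(s')
        V[s] += alpha * (r + gamma * V[s_next] - V[s])
        s = s_next

print([round(v, 2) for v in V])  # → [0.81, 0.9, 1.0, 0.0]
```

The learned values are exactly the discounted distance to reward — the signal a BG-like system could then use to amplify or suppress candidate activity patterns.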

See my Predictive coding = RL + SL + Bayes + MPC [LW · GW] for my old first-cut attempt to think through this stuff. Meanwhile I've been reading all about the striatum and RL stuff, more posts forthcoming I hope.

Happy for any thoughts on that. :-)

Replies from: hold_my_fish
comment by hold_my_fish · 2021-03-26T09:51:44.815Z · LW(p) · GW(p)

A few points where clarification would help, if you don't mind (feel free to skip some):

• What are the capabilities of the "generative model"? In general, the term seems to be used in various ways. e.g.
• Sampling from the learned distribution (analogous to GPT-3 at temp=1)
• Evaluating the probability of a given point
• Producing the predicted most likely point (analogous to GPT-3 at temp=0)
• Is what we're predicting the input at the next time step? (Sometimes "predict" can be used to mean filling in missing information, but that doesn't seem to make sense in this context.) Also, I'm not sure what I mean by "time step" here.
• The "input signal" here is coming from whatever is wired into the cortex, right? Does it work to think of this as a vector in ℝⁿ?
• Is the contextual information just whatever is the current input, plus whatever signals are still bouncing around?
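The three senses of "generative model" in the first bullet can be made concrete on the smallest possible model, a categorical distribution over tokens (everything below is my own toy illustration):

```python
# Three ways to query one "generative model" (here just a learned
# categorical distribution): sample, evaluate a probability, take the mode.
import random

counts = {"red": 6, "green": 3, "blue": 1}   # toy "training data"
total = sum(counts.values())
probs = {k: v / total for k, v in counts.items()}

# 1. Sampling from the learned distribution (analogous to GPT-3 at temp=1):
random.seed(0)
sample = random.choices(list(probs), weights=list(probs.values()))[0]

# 2. Evaluating the probability of a given point:
p_blue = probs["blue"]   # 0.1

# 3. Producing the single most likely point (analogous to GPT-3 at temp=0):
mode = max(probs, key=probs.get)   # "red"

print(sample, p_blue, mode)
```

One model, three query modes — which is part of why "generative model" gets used in several ways.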

Also, the capability described may be a bit too broad, since there are some predictions that the cortex seems to be bad at. Consider predicting the sum of two 8-digit integers. Digital computers compute that easily, so it's fundamentally an easy problem, but for humans to do it requires effort. Yet for some other predictions, the cortex easily outperforms today's digital computers. What characterizes the prediction problems that the cortex does well?

Replies from: steve2152
comment by Steven Byrnes (steve2152) · 2021-03-26T15:43:07.275Z · LW(p) · GW(p)

Think of a generative model as something like "This thing I'm looking at is a red bouncy ball". Just looking at it you can guess pretty well how much it would weigh if you lifted it, how it would feel if you rubbed it, how it would smell if you smelled it, and how it would bounce if you threw it. Lots of ways to query these models! Powerful stuff!

some predictions that the cortex seems to be bad at

If a model is trained to minimize a loss function L, that doesn't mean that, after training, it winds up with a very low value of L in every possible case. Right? I'm confused about why you're confused. :-P

comment by VermillionStuka · 2020-02-25T11:57:41.543Z · LW(p) · GW(p)

Thank you very much for this, I had heard of CCA theory but didn't know enough to evaluate it myself. I think this opens new possible paths to AGI I had not thoroughly considered before.

Replies from: steve2152
comment by Steven Byrnes (steve2152) · 2020-02-25T14:22:39.842Z · LW(p) · GW(p)

Thanks for saying that! :)

comment by countedblessings · 2019-10-02T19:15:32.339Z · LW(p) · GW(p)

Any thoughts on the relevance of category theory as the same kind of "universal modeling system" for mathematics as the brain might well be for real life?

Replies from: steve2152
comment by Steven Byrnes (steve2152) · 2019-10-06T15:38:07.775Z · LW(p) · GW(p)