Is RL involved in sensory processing?

post by Steven Byrnes (steve2152) · 2021-03-18T13:57:28.888Z · LW · GW · 21 comments

Contents

  (UPDATE 1½ YEARS LATER: I’ve learned a lot since writing this post, and you shouldn’t assume that I still endorse everything herein.)
  “A unified selection signal for attention and reward in primary visual cortex”, Stănişor et al. 2013.
  “A Cholinergic Mechanism for Reward Timing within Primary Visual Cortex”, Chubykin et al. 2013.
  “Central Cholinergic Neurons Are Rapidly Recruited by Reinforcement Feedback”, Hangya et al. 2015.
  “Perceptual learning rules based on reinforcers and attention”, Roelfsema et al. 2010.
21 comments

(UPDATE 1½ YEARS LATER: I’ve learned a lot since writing this post, and you shouldn’t assume that I still endorse everything herein.)

I’m biased: I have a strong prior belief that reinforcement learning should not be involved in sensory processing in the brain. (Update: I meant "directly involved", see comment [LW(p) · GW(p)].) (Update 2: I should have said "low-level sensory processing", see discussion of inferotemporal cortex here [LW · GW].) The reason is simple: avoiding wishful thinking.

If there’s a positive reward for looking at beautiful ocean scenes, for example, then the RL solution would converge towards parsing whatever you look at as a beautiful ocean scene, whether it actually is or not!
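To make the worry concrete, here is a minimal toy sketch, not a model of any real brain circuit. It assumes a one-parameter-pair "percept" (`w`, `b`), two hypothetical scene types, and a reward that is computed from the percept itself rather than from the world. Every name in it is made up for illustration. The reward-trained percept drifts towards reporting "ocean" for everything, while the prediction-trained percept learns to tell the two scenes apart.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Toy world: one feature x, +1 for an ocean scene and -1 for anything else.
# y is the true label (1 = ocean). All names here are illustrative.
SCENES = [(+1.0, 1.0), (-1.0, 0.0)]

def train_percept(rule, epochs=2000, lr=0.5):
    """Train p(percept = ocean | x) = sigmoid(w*x + b) under one learning rule."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, y in SCENES:
            p = sigmoid(w * x + b)
            if rule == "reward":
                # Expected REINFORCE step when reward = 1 iff the percept
                # itself says "ocean": E[r * dlogpi/dz] = p * (1 - p).
                step = p * (1.0 - p)
            else:
                # Predictive (supervised) step: error against the true label.
                step = y - p
            w += lr * step * x
            b += lr * step
    p_ocean = sigmoid(w + b)       # percept for a real ocean scene
    p_not_ocean = sigmoid(-w + b)  # percept for a non-ocean scene
    return p_ocean, p_not_ocean

reward_trained = train_percept("reward")
prediction_trained = train_percept("predictive")
print("reward-trained:    ", reward_trained)
print("prediction-trained:", prediction_trained)
```

In this sketch the reward rule only ever pushes the bias `b` upward, so the non-ocean scene ends up being parsed as ocean too. The predictive rule has no such drift, because its error signal is anchored to what was actually there.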

Predictive learning (a.k.a. self-supervised learning), by contrast, seems perfect for the job of training sensory-processing systems. Well, really weighted predictive learning, where some prediction errors are treated as worse than others, and where top-down attention is providing the weights. (Or something. Not sure about the details here.) Anyway, predictive learning does not have a “wishful thinking” problem; plus there’s a nice information-rich supervisory signal that provides full error vectors (which are used for error-driven learning) which are easier to learn from than 1D rewards. (Imagine a high school teacher grading essays. If they say “That essay was bad; next time try using shorter paragraphs”, that’s like an error gradient signal, telling the student how to improve. If they just say “That essay was bad”, that’s like a reward signal, and now making progress is much harder!!) So that’s been my belief: I say reward learning has no place in the sensory-processing parts of the brain.
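The essay-grading analogy can be sketched in code. This is a toy comparison under assumptions I am making up for illustration: a 50-dimensional target, a delta-rule learner that sees the full error vector, and a simple hill-climbing learner that only sees a scalar reward (negative squared error). Hill climbing is just one of many ways to learn from a scalar, chosen here because it is short.

```python
import random

def squared_error(w, target):
    return sum((wi - ti) ** 2 for wi, ti in zip(w, target))

def learn_from_error_vector(target, steps, lr=0.1):
    """Learner is told, per dimension, how far off it was and in which direction."""
    w = [0.0] * len(target)
    for _ in range(steps):
        error = [ti - wi for wi, ti in zip(w, target)]  # full error vector
        w = [wi + lr * ei for wi, ei in zip(w, error)]
    return squared_error(w, target)

def learn_from_scalar_reward(target, steps, rng, sigma=0.1):
    """Learner only hears one number per attempt: was that better or worse?"""
    w = [0.0] * len(target)
    reward = -squared_error(w, target)
    for _ in range(steps):
        trial = [wi + rng.gauss(0.0, sigma) for wi in w]  # random guess at a change
        trial_reward = -squared_error(trial, target)
        if trial_reward > reward:  # keep the change only if the reward improved
            w, reward = trial, trial_reward
    return squared_error(w, target)

rng = random.Random(0)
target = [rng.uniform(-1.0, 1.0) for _ in range(50)]
err_vector = learn_from_error_vector(target, steps=200)
err_scalar = learn_from_scalar_reward(target, steps=200, rng=rng)
print("final error with full error vector:", err_vector)
print("final error with scalar reward:    ", err_scalar)
```

With the same number of feedback events, the error-vector learner ends up far closer to the target, because each piece of feedback tells it which way to move in every dimension at once.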

So later on, I was excited to learn that the basal ganglia (which plays a central role in reinforcement learning) sends signals into the frontal lobe of the brain—the home of plans and motor outputs—but not to the other lobes, which are more involved in sensory processing. (Update: There's at least one exception, namely inferotemporal cortex; I guess the division between RL / not-RL does not quite line up perfectly with the division between frontal lobe / other lobes.) (Update 2: I have a lot more to say about the specific case of inferotemporal cortex here [LW · GW].) Anyway, that seemed to confirm my mental picture.

Then I was reading Marblestone Wayne Kording 2016 (let’s call it MWK), and was gobsmacked—yes, gobsmacked!—when I came across this little passage:

Reward-driven signaling is not restricted to the striatum [part of the basal ganglia], and is present even in primary visual cortex (Chubykin et al., 2013; Stănişor et al., 2013). Remarkably, the reward signaling in primary visual cortex is mediated in part by glial cells (Takata et al., 2011), rather than neurons, and involves the neurotransmitter acetylcholine (Chubykin et al., 2013; Hangya et al., 2015). On the other hand, some studies have suggested that visual cortex learns the basics of invariant object recognition in the absence of reward (Li and Dicarlo, 2012), perhaps using reinforcement only for more refined perceptual learning (Roelfsema et al., 2010).

But beyond these well-known global reward signals, we argue that the basic mechanisms of reinforcement learning may be widely re-purposed to train local networks using a variety of internally generated error signals. These internally generated signals may allow a learning system to go beyond what can be learned via standard unsupervised methods, effectively guiding or steering the system to learn specific features or computations (Ullman et al., 2012).

Could it be?? Am I just horribly confused about everything?

So I downloaded and skimmed MWK’s key citations here. To make a long story short, I’m not convinced. Here are my quick impressions of the relevant papers ...

“A unified selection signal for attention and reward in primary visual cortex”, Stănişor et al. 2013.

Start with common sense. Let’s say you just love finding pennies on the sidewalk. It’s just your favorite thing. You dream about it at night. Now, if you see a penny on the sidewalk, you’re bound to immediately notice it and pay continuing attention to it. That’s obvious, right? The general pattern is: Attention is often directed towards things that are rewarding, and the amount of reward is likely to bear at least some relation to how much attention you pay. Moreover, the strength and direction of attention influences the activity of neurons all over the cortex.

Now, the authors did an experiment on macaques, where a dot appeared and they had to look at it, and the color of the dot impacted how much reward they got when they did so. I guess the idea was that they were controlling for attention because the macaques were paying attention regardless of the dot colors—how else would they saccade to it? I don’t really buy that. I think that attention has a degree—it’s not just on or off. Let’s say I love finding pennies on the sidewalk, and I like finding nickels on the sidewalk, but my heart’s not in it. When I see either a nickel or a penny, I’ll look at it. But I’ll look a lot more intently at the penny than the nickel! For example, maybe I was singing a song in my head as I walked. If I see the nickel, maybe I’ll look at the nickel but keep singing the song. I notice the nickel, but I still have some attention left over! If I see the penny, I’ll be so excited that everything else in my head stops in its tracks, and 100% of my attention is focused on that penny.

From that perspective, everything in the paper seems to actually support my belief that reward-based learning plays no role whatsoever in the visual cortex. Reward affects the frontal lobe, and then depending on the reward, the frontal lobe directs more or less attention towards the visual cortex. That would nicely explain their finding that: “The neuronal latency of this reward value effect in V1 was similar to the latency of attentional influences. Moreover, V1 neurons with a strong value effect also exhibited a strong attention effect, which implies that relative value and top–down attention engage overlapping, if not identical, neuronal selection mechanisms.”

“A Cholinergic Mechanism for Reward Timing within Primary Visual Cortex”, Chubykin et al. 2013.

The authors took rats and coaxed them to learn that after they did a certain thing (lick a thing a certain number of times), a rewarding thing would happen to them (they’re thirsty and they get a drop of water). Right when they expected the reward, there were telltale signs of activity in their primary visual cortex (V1).

I figure, as above, that this talk about “rewards” is a red herring—the important thing is that the reward expectation coincides with some shift of the rat’s attention, which has ripple effects all over the cortex, and thus all parts of the cortex learn to expect those ripple effects.

Then the experimenters changed the number of licks required for the reward. The telltale signs of activity, predictably, shifted to the new reward time ... but not if the experimenters infused the rat brain (or more specifically the part of the visual cortex where they were recording) with a toxin that prevented acetylcholine effects. How do I account for that? Easy: acetylcholine = learning rate—see separate post [LW · GW]. No acetylcholine, no learning. The visual cortex is still a learning algorithm, even if it’s not learning from rewards, but rather learning to expect a certain attentional shift within the brain.
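Here is a minimal sketch of that reading, assuming a plain delta rule and a hypothetical `ach` factor that multiplies the learning rate. The function name, parameters, and numbers are all illustrative and are not taken from the paper. The point is only that an error-driven timing predictor with its learning rate set to zero reproduces the "no shift under cholinergic block" result without any reward maximization.

```python
def learn_event_time(observed_times, base_lr=0.2, ach=1.0, start=0.0):
    """Delta-rule estimate of when an expected event occurs.

    `ach` scales the learning rate; ach=0.0 stands in for blocking
    acetylcholine (an assumption of this sketch, not a measured quantity).
    """
    estimate = start
    for t in observed_times:
        prediction_error = t - estimate
        estimate += base_lr * ach * prediction_error
    return estimate

old_time, new_time = 1.0, 2.0

# Phase 1: learn the original timing.
learned = learn_event_time([old_time] * 50)

# Phase 2: the timing changes. Compare intact vs. blocked acetylcholine.
shifted = learn_event_time([new_time] * 50, ach=1.0, start=learned)
blocked = learn_event_time([new_time] * 50, ach=0.0, start=learned)

print("after phase 1:", learned)
print("phase 2, intact ACh: ", shifted)
print("phase 2, blocked ACh:", blocked)
```

The update is driven by a prediction error about timing, so nothing in it requires the learned signal to be "about" reward.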

“Central Cholinergic Neurons Are Rapidly Recruited by Reinforcement Feedback”, Hangya et al. 2015.

The authors offer loads of evidence that reward and punishment cause acetylcholine to appear. But I don’t think they claim (or offer evidence) that the acetylcholine is communicating a reward. Indeed, if I’m reading it right, the acetylcholine spikes after both reward and punishment, whereas a reward signal like phasic dopamine needs to swing both positive and negative.

The evidence all seems to be consistent with my belief (see separate post [LW · GW]) that acetylcholine controls learning rate, and that it’s biologically advantageous for the brain to use a higher learning rate when important things like reward and punishment are happening (and when you’re aroused, when a particular part of the brain is in the spotlight of top-down attention, etc.).

“Perceptual learning rules based on reinforcers and attention”, Roelfsema et al. 2010.

If I’m reading it right (a big “if”!), everything in this paper is consistent with what I wrote above and in the other post [LW · GW]. Acetylcholine determines learning rate, and you learn better by having the capability to set different learning rates at different times and in different parts of the brain, and the presence of rewards and punishments is one signal that maybe the learning rate should be unusually high right now.

~~

Well, those are the main citations. This is a very quick-and-dirty analysis, but I’m sufficiently satisfied that I’m going to stick with my priors here: reward-based learning is not involved in sensory processing in the brain.

(Thanks Adam Marblestone for comments on a draft.)

21 comments


comment by Kaj_Sotala · 2021-03-18T15:16:57.068Z · LW(p) · GW(p)

I’m biased: I have a strong prior belief that reinforcement learning should not be involved in sensory processing in the brain. The reason is simple: avoiding wishful thinking.

If there’s a positive reward for looking at beautiful ocean scenes, for example, then the RL solution would converge towards parsing whatever you look at as a beautiful ocean scene, whether it actually is or not!

That seems like a strong argument for why RL should not be the only mechanism for sensory processing, but a weaker one for why it shouldn't be involved at all?

Without looking at the papers you cited, one reason why one might expect RL to be involved in sensory processing would be the connection between perception and skill learning. Some of the literature on expertise that I've seen suggests that the development of a skill involves literally learning to see differently, in which case reinforcement learning should be associated with sensory processes, shaping one's perception as one practices a skill so that one starts to see things as an expert would.

E.g. Gary Klein:

In this project, we studied the way nurses could tell when a very premature infant was developing a life-threatening infection. Beth Crandall, one of my coworkers, had gotten funding from the National Institutes of Health to study decision making and expertise in nurses. She arranged to work with the nurses in the neonatal intensive care unit (NICU) of a large hospital. These nurses cared for newly born infants who were premature or otherwise at risk. 

Beth found that one of the difficult decisions the nurses had to make was to judge when a baby was developing a septic condition, in other words, an infection. These infants weighed only a few pounds; some of them, the microbabies, less than two pounds. When babies this small develop an infection, it can spread through their entire body and kill them before the antibiotics can stop it. Noticing the sepsis as quickly as possible is vital.

Somehow the nurses in the NICU could do this. They could look at a baby, even a microbaby, and tell the physician when it was time to start the antibiotic (Crandall and Getchell-Reiter 1993). Sometimes the hospital would do tests, and they would come back negative. Nevertheless, the baby went on antibiotics, and usually the next day the test would come back positive.

This is the type of skilled decision making that interests us the most. Beth began by asking the nurses how they were able to make these judgments. "It's intuition," she was told, or else "through experience." And that was that. The nurses had nothing more to say about it. They looked. They knew. End of story. That was even more interesting: expertise that the person clearly has but cannot describe. Beth geared up the methods we had used with the firefighters. Instead of asking the nurses general questions, such as, "How do you make this judgment?" she probed them on difficult cases where they had to use the judgment skills. She interviewed nurses one at a time and asked each to relate a specific case where she had noticed an infant developing sepsis. The nurses could recall incidents, and in each case they could remember the details of what had caught their attention. The cues varied from one case to the next, and each nurse had experienced a limited number of incidents. Beth compiled a master list of sepsis cues and patterns of cues in infants and validated it with specialists in neonatology.

Some of the cues were the same as those in the medical literature, but almost half were new, and some cues were the opposite of sepsis cues in adults. For instance, adults with infections tend to become more irritable. Premature babies, however, become less irritable. If a microbaby cried every time it was lifted up to be weighed and then one day it did not cry, that would be a danger signal to the experienced nurse. Moreover, the nurses were not relying on any single cue. They often reacted to a pattern of cues, each one subtle, that together signaled an infant in distress. [...]

In her project with the neonatal intensive care unit, Beth Crandall had asked the experienced nurses how they detected the early signs of sepsis. They told her it was experience and intuition. They did not know what they knew, because what they knew was perceptual: how to see. The only way Beth was going to find out anything useful was to have the nurses tell their stories of specific instances, each tied to a different set of perceptual cues. At the end of the interviews, Beth could draw all the stories together and compile a master list of cues to sepsis.

and David Chapman:

Because most of life is routine, and most objects and situations are mostly familiar, and because we’ve practiced our visual skills from early childhood, they suffice for most tasks, and go unnoticed. Needing to learn new visual skills is unusual for adults.

Some video games are an outstanding exception. Video games are designed to make learning new skills fun, and many games teach you to see in new ways. When you enter a new segment of the game, everything is happening much too fast; you have no idea where to look or what it means. Enemies come out of nowhere and kill you before you even see what they are. With practice, you learn to see things you couldn’t before, because you didn’t know how.

Image courtesy Evelyn Chai

You are sneaking along a gloomy passageway in the necromancer’s tower. Suddenly you die. WTF just happened??

You reload the game from the last save point.

You are sneaking along that passageway and out of the corner of your eye you see something violent happen on the left side of the screen and then you die. You reload.

You are sneaking along, looking left, and a golem leaps out of the archway on your left side and kills you.

This time, you’re watching the archway cautiously, and when the golem leaps out you hit it with a lightning bolt. A moment later, something happens on the right and you die.

Next try, you zap the golem with a lightning bolt, you flick your eyes right, and as the tomb over there opens, you manage to get one of the zombies with a fireball. But another one kills you. You noticed that the headless zombie hesitated for a moment before attacking.

You zap the golem and incinerate the one-arm zombie with a fireball while the headless one gropes around. You cartwheel to dodge its attack, and finish it off with a mid-air flying dagger thrust. Awesomeness! Unfortunately the floorboards you land on are rotten and you fall through to your death.

… An hour later, you stroll through the tower, knocking off monsters and skipping traps without really thinking about it, because you have learned to perceive the meanings of routine necromantic phenomena. Archways harbor golems, zombies without heads can’t see, rotten floorboards are a bit darker than solid ones. Now you know what to look out for, and where to look to see it.

You whack the necromancer, collect the Arcane Eggbeater Of Destiny, and return with it to the College of Wizards to get your next homework assignment. [...]

What we’ve learned from visual psychology suggests that seeing involves learned, task-specific skills. It is contextual and purposive, which makes it a good fit for everyday, reasonable, routine activity. (And not such a great fit for objective rationality.)

Visual activity is not separate from the rest of what we’re doing. The phrase “hand-eye coordination” points at this. In video games, your visual actions that tell your lower-level visual processing systems what to do are just as much part of the skill of fighting a group of monsters as swinging your sword is. Shifts of visual attention, for instance, are seamlessly integrated with the rest of the killing dance. As a more mundane example, if you are looking for scissors, you’ll move your head as well as your eyes to check around the desk, shove clutter out of the way to see behind or beneath it, and eventually get up and go open a drawer and peer inside. Visual activity and bodily motions are entangled. [...]

Much of what you see, you see as something. You don’t see a textured black region of the visual image, you see a loudspeaker. Or a raven. It’s already a loudspeaker or a raven when you first experience it. Bottom-up vision has done that work for you.

What you see something as depends on your knowledge, context, and purposes. If you are familiar with moussaka and you see it on a plate in a restaurant, you’ll probably see it as moussaka. If you aren’t, you’ll probably see it as a mushy casserole. You can’t see it as moussaka, because that’s not part of your ontology. If you see moussaka on a city sidewalk, you might just see it as disgusting, potentially pathogenic slop that you want to avoid. What you see a clump of atoms as depends on what you are looking out for, and why. Although bottom-up processes can do much of the work for you, especially in the case of rigid manufactured objects like loudspeakers with standard shapes and colors, your top-down direction often also plays a critical role.

and Josh Waitzkin:

Let’s say that I spend fifteen years studying chess. [...] We will start with day one. The first thing I have to do is to internalize how the pieces move. I have to learn their values. I have to learn how to coordinate them with one another. Early on, these steps might seem complex. There is the pawn, the knight, the bishop, the rook, the queen, and the king. Each piece is unique, with its own strengths and weaknesses. Each time I look at a chess piece I have to remember what it is and how it moves. Then I look at the next piece and try to remember how that one moves. There are initially thirty-two pieces on a chessboard. To make a responsible chess decision, I have to look at all those pieces and check for captures, quick attacks, and other obvious possibilities. By the time I get to the third piece, I’m already a bit overwhelmed. By the tenth piece I have a headache, have already forgotten what I discovered about the first nine pieces, and my opponent is bored. At this point I will probably just make a move and blunder. 

So let’s say that now, instead of launching from the standard starting position, we begin on an empty board with just a king and a pawn against a king. These are relatively simple pieces. I learn how they both move, and then I play around with them for a while until I feel comfortable. Then, over time, I learn about bishops in isolation, then knights, rooks, and queens. Soon enough, the movements and values of the chess pieces are natural to me. I don’t have to think about them consciously, but see their potential simultaneously with the figurine itself. Chess pieces stop being hunks of wood or plastic, and begin to take on an energetic dimension. Where the piece currently sits on a chessboard pales in comparison to the countless vectors of potential flying off in the mind. I see how each piece affects those around it. Because the basic movements are natural to me, I can take in more information and have a broader perspective of the board. Now when I look at a chess position, I can see all the pieces at once. The network is coming together. 

Next I have to learn the principles of coordinating the pieces. I learn how to place my arsenal most efficiently on the chessboard and I learn to read the road signs that determine how to maximize a given soldier’s effectiveness in a particular setting. These road signs are principles. Just as I initially had to think about each chess piece individually, now I have to plod through the principles in my brain to figure out which apply to the current position and how. Over time, that process becomes increasingly natural to me, until I eventually see the pieces and the appropriate principles in a blink. While an intermediate player will learn how a bishop’s strength in the middlegame depends on the central pawn structure, a slightly more advanced player will just flash his or her mind across the board and take in the bishop and the critical structural components. The structure and the bishop are one. Neither has any intrinsic value outside of its relation to the other, and they are chunked together in the mind. 

This new integration of knowledge has a peculiar effect, because I begin to realize that the initial maxims of piece value are far from ironclad. The pieces gradually lose absolute identity. I learn that rooks and bishops work more efficiently together than rooks and knights, but queens and knights tend to have an edge over queens and bishops. Each piece’s power is purely relational, depending upon such variables as pawn structure and surrounding forces. So now when you look at a knight, you see its potential in the context of the bishop a few squares away. Over time each chess principle loses rigidity, and you get better and better at reading the subtle signs of qualitative relativity. Soon enough, learning becomes unlearning. The stronger chess player is often the one who is less attached to a dogmatic interpretation of the principles. This leads to a whole new layer of principles—those that consist of the exceptions to the initial principles. Of course the next step is for those counterintuitive signs to become internalized just as the initial movements of the pieces were. The network of my chess knowledge now involves principles, patterns, and chunks of information, accessed through a whole new set of navigational principles, patterns, and chunks of information, which are soon followed by another set of principles and chunks designed to assist in the interpretation of the last. Learning chess at this level becomes sitting with paradox, being at peace with and navigating the tension of competing truths, letting go of any notion of solidity. [...]

Most people would be surprised to discover that if you compare the thought process of a Grandmaster to that of an expert (a much weaker, but quite competent chess player), you will often find that the Grandmaster consciously looks at less, not more. That said, the chunks of information that have been put together in his mind allow him to see much more with much less conscious thought. So he is looking at very little and seeing quite a lot. This is the critical idea.

Replies from: steve2152
comment by Steven Byrnes (steve2152) · 2021-03-18T15:40:37.603Z · LW(p) · GW(p)

Thanks! My current model is that

  1. The frontal lobe does involve RL, and this is used to think high-value thoughts and take high-value actions;
  2. One reason that thoughts / actions can be high-value is by acquiring valuable information, and one way they can do this is by directing saccades and attention towards parts of the visual field (or other sensory input) where the valuable information is;
  3. That corresponding sensory input processing area is still doing predictive learning, but it uses a higher learning rate [LW · GW] when it is the focus of top-down attention, and therefore tends to develop a rich pattern-recognizing vocabulary that is lopsidedly tailored towards recognizing the types of patterns that carry valuable information from the perspective of the RL-based frontal lobe.

So RL is involved, just a step removed. (Maybe my post title was bad. :-P ) Do you think that's an adequate involvement of RL to explain those and other examples?

Replies from: Kaj_Sotala
comment by Kaj_Sotala · 2021-03-18T17:42:16.225Z · LW(p) · GW(p)

Maybe. :) I don't have much of a position on "which part of the brain is the sensory processing-related reinforcement learning implemented in", just on the original claim of "we shouldn't expect to find RL involved in sensory processing".

Replies from: steve2152
comment by Steven Byrnes (steve2152) · 2021-03-18T17:49:11.008Z · LW(p) · GW(p)

That's fair; the first sentence now has a caveat to that effect. Thanks again!

comment by jacob_cannell · 2022-10-29T16:54:14.134Z · LW(p) · GW(p)

If there’s a positive reward for looking at beautiful ocean scenes, for example, then the RL solution would converge towards parsing whatever you look at as a beautiful ocean scene, whether it actually is or not!

That sounds like a broken implementation of RL.

AlexNet did well on ImageNet purely by training with supervised learning on a large dataset - the update signals on the output propagated all the way back down to the first visual layer (the V1 equivalent), resulting in the typical Gabor-like features (which you also tend to get with unsupervised learning).

Deepmind's atari agent took that same pattern (and code) and applied it to atari, simply replacing the supervised signal with a score-reward derived signal, and it worked just as well.

The relevant reward for 'looking at beautiful ocean scenes' would be an intrinsic curiosity type reward which estimates bayesian value of information. It is zero for predictable/boring information, also zero for completely unpredictable noise, and maximal for information which balances learning tractability with complexity relevant to the network's current knowledge.
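One hedged way to see how such a signal can vanish at both extremes is Bayesian surprise, i.e. the KL divergence from prior to posterior after an observation. This is an assumption of the sketch: it is a stand-in for "value of information", not the specific quantity the comment has in mind. The hypothesis grid `thetas` and the three priors are made up for illustration, and the "noise" case assumes the observer already knows the source is a fair coin.

```python
import math

def bayesian_surprise(prior, thetas, observation):
    """KL(posterior || prior) after one binary observation.

    thetas: candidate values of P(observation == 1).
    prior:  probability assigned to each candidate.
    """
    likelihood = [t if observation == 1 else 1.0 - t for t in thetas]
    evidence = sum(p * l for p, l in zip(prior, likelihood))
    posterior = [p * l / evidence for p, l in zip(prior, likelihood)]
    # Terms with zero posterior mass contribute nothing to the KL divergence.
    return sum(q * math.log(q / p) for q, p in zip(posterior, prior) if q > 0.0)

thetas = [0.0, 0.25, 0.5, 0.75, 1.0]

known_constant = [0.0, 0.0, 0.0, 0.0, 1.0]  # "this always shows a 1": boring
known_noise = [0.0, 0.0, 1.0, 0.0, 0.0]     # "this is a fair coin": pure noise
uncertain = [0.2, 0.2, 0.2, 0.2, 0.2]       # still learnable

surprise_constant = bayesian_surprise(known_constant, thetas, observation=1)
surprise_noise = bayesian_surprise(known_noise, thetas, observation=1)
surprise_learnable = bayesian_surprise(uncertain, thetas, observation=1)
print(surprise_constant, surprise_noise, surprise_learnable)
```

Both the fully predictable stream and the already-understood noise source leave the beliefs unchanged, so the surprise is zero. Only the stream the observer is still uncertain about yields a positive signal.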

Given that background it makes sense that IT (as near the top of the vision heterarchy) is heavily involved in estimating visual bayesian surprise (propagating forward value of information for saccade targets and higher level planning), and possibly also adapts learning (neuromodulation) to focus capacity more on learning high value information (backpropagating utility).

Replies from: steve2152
comment by Steven Byrnes (steve2152) · 2022-10-30T00:27:57.630Z · LW(p) · GW(p)

Thanks! First and foremost, I wrote this post “a long time ago” (in terms of the state of my knowledge). It's not great. I would say things differently now.

That sounds like a broken implementation of RL.

Yes. My impression is that lots of neuroscience papers suggest theoretical interpretations of what such-and-such thing in the brain is doing, and these interpretations don’t make any big-picture sense, i.e., if those interpretations were true, then the brain would have a very broken implementation of RL.  :)

Deepmind's atari agent took that same pattern (and code) and applied it to atari, simply replacing the supervised signal with a score-reward derived signal, and it worked just as well.

The wishful-thinking failure mode that I was referring to is when you have model-based RL where the reward function depends in part on the outputs of the world-model, and the world-model is also being updated to maximize the reward function.

The DeepMind Atari thing you mention does not have this type of problem, because the “score-reward derived signal” is an exogenous ground truth, coming directly from the Atari emulator.

Human brains do potentially have that problem though: I claim that the “reward function” that humans feel each second is not an exogenous ground-truth, but instead significantly depends on the activity in the cortex. For example, it’s “rewarding” to open your report card and find that all the grades are A+, but the trained model in your cortex is critical for understanding what the report card says and means. Your brainstem superior colliculus can’t read English!

This isn’t an unsolvable problem, but it does mean that the implementation of RL will be broken if those very same parts of the cortex are also being updated with a learning rule designed to maximize rewards, right?

Having read your whole comment, I’m not sure whether you agree or disagree with the school of thought (endorsed by me, Yann LeCun, Jeff Hawkins, Blake Richards, Randall O'Reilly, etc.) that low-level visual cortex is being updated on predictive loss (a.k.a. self-supervised learning), not RL. Do you?

(That school of thought is not counter to RL in general. Instead I would say that the brain is implementing “a model-based RL agent”, and the “model” part of that overall architecture is updated on predictive loss.)

The relevant reward for 'looking at beautiful ocean scenes' would be an intrinsic curiosity type reward which estimates bayesian value of information.

I guess you disagree with evolutionary aesthetics? For example, that’s the idea that rabbits find the view from sheltered vegetated areas to be very pleasant, and beavers find the view around tree-lined rivers and ponds to be very pleasant, and humans find mountain views and water views and whatnot to be very pleasant, and nocturnal animals like rats find dark scenes to be very pleasant to look at, and so on. I’m strongly inclined to believe that evolutionary aesthetics is true, at least the habitat-selection part and maybe for mate-selection and other things too, and I think that the corresponding contribution to reward is calculated by the brainstem (superior colliculus) in a way that’s specified in detail by the genome and does not depend on within-lifetime learning.

(I’m not denying that it’s also possible to acquire view-preferences through within-lifetime learning driven by curiosity or various other things.)

(Yes that means that I was wrong to use “beautiful ocean scenes” as the example in this post. Oops. The report card example above would have been better.)

Replies from: jacob_cannell
comment by jacob_cannell · 2022-10-30T02:23:41.940Z · LW(p) · GW(p)

This isn’t an unsolvable problem, but it does mean that the implementation of RL will be broken if those very same parts of the cortex are also being updated with a learning rule designed to maximize rewards, right?

I used the DL examples to illustrate that you can - in theory and practice - actually train systems fully end to end simply by backpropping updates through the outputs (although that generally isn't all that efficient).

Making the utility/reward function fully endogenous, such that its inputs are all just internal signals from other brain regions, doesn't change anything at all; it still works. If anything it works better, as you can get much denser reward signals, assuming the internal rewards are reasonable approximations of the ideal utility driven updates.

For example, it’s “rewarding” to open your report card and find that all the grades are A+, but the trained model in your cortex is critical for understanding what the report card says and means.

That still doesn't seem to be a foundational problem - again consider the end-to-end RL agent. It receives a reward for getting an A+, and it backprops a resulting update down into the cortex which improves its ability to understand what the report card says and means in order to predict and get the resulting contingent reward. It might help to replace 'reward' with bayesian update. Rewards are simply a means to estimate the ideal bayesian update.

There is a general functional equivalence among bayesian updates: as long as they point in generally the right direction, they help - and it doesn't really matter where they came from. Evidence updates - properly computed - can only help.

There are many learning related mechanisms: local learning mechanisms like STDP and the global modulatory signals like dopamine, serotonin, acetylcholine, etc - which are all facets of a general learning mechanism which estimates the ideal bayesian updates. The ideal perfect update would be that derived from knowing the true sequence of action outputs which maximize future inclusive fitness, which is obviously unavailable, but the brain has very clever mechanisms for approximating that ideal - one of which is something like "modeling and predicting local inputs is generally a good idea".

low-level visual cortex is being updated on predictive loss (a.k.a. self-supervised learning), not RL.

It was obvious even before DL that you could learn low-level visual features through local predictive loss (eg sparse coding, and various neuroscience). AlexNet/ResNets/etc were surprising because they got better results simply by full end to end training. That bothered me slightly, but my reasonable explanation was that the UL models of the time were simply too small. If you have a small model and a huge pile of data then capacity is scarce and end-to-end supervised training on a narrow specific task allows you to specifically focus that precious capacity on the small subset of features relevant for the narrow specific task. But as you increase task generality and you increase capacity this swings to favor UL - and now indeed as predicted the large foundation models are trained with UL/SSL.

If you divide the brain in half based on closer-to-inputs vs closer-to-outputs, the input side benefits most directly from UL, and the output side benefits more directly from RL. So it's not entirely surprising that dopamine ('reward' prediction error) is used mostly on the output side, but serotonin (uncertainty/learning-rate) and acetylcholine (attention, 'temperature') are used on both. But I don't see much evidence that there are different learning algorithms, in the sense that SGD and GA are different learning algorithms. The brain clearly implements something similar to SGD but probably better than the current Adam/backprop - some efficient approx bayes method.

The difference between model-free and model-based is simply that model-based arch makes some additional assumptions about how the architecture works internally. The brain is somewhat beyond that dichotomy.

I guess you disagree with evolutionary aesthetics? For example, that’s the idea that rabbits find the view from sheltered vegetated areas to be very pleasant, and beavers find the view around tree-lined rivers and ponds to be very pleasant, and humans find mountain views and water views and whatnot to be very pleasant, and nocturnal animals like rats find dark scenes to be very pleasant to look at, and so on.

It's rather obviously bunk. Firstly it descends from the ev psych mindset that evolutionary learning is the single hammer to explain everything, the same beliefs [LW(p) · GW(p)] that gave us evolved modularity.

The prime lesson from recent neuroscience contra ev psych is that the brain can be understood from the consequences of simple universal intra-lifetime optimization processes and objectives rather than ev psych style inter-lifetime optimization. There's a pretty convincing stack of evidence that we have an intrinsic motivation/reward signal that estimates bayesian value of information, and this fully explains our tastes for music, art, pretty graphics, etc. Music in particular is a curiosity/info-value superstimulus.

From the wikipedia link:

The East African savanna is the ancestral environment in which much of human evolution is argued to have taken place. There is also a preference for landscapes with water, with both open and wooded areas, with trees with branches at a suitable height for climbing and taking foods, with features encouraging exploration such as a path or river curving out of view, with seen or implied game animals, and with some clouds

Lol really? That's all encoded in the genome? Where are the genetic mutants who prefer dry landscapes, or upside down landscapes, or only water landscapes? Why is it that the popular landscapes today on art websites (now generated by AI of course) are mostly sci-fi fantasy and have little relationship to the savanna crap described in this article?

Replies from: steve2152
comment by Steven Byrnes (steve2152) · 2022-10-30T03:55:42.145Z · LW(p) · GW(p)

I used the DL examples to illustrate that you can - in theory and practice - actually train systems fully end to end simply by backpropping updates through the outputs (although that generally isn't all that efficient)

The rules are different in episodic RL versus online / within-lifetime RL, right? If the Atari agent has latent activations that correspond to delusional wishful-thinking beliefs about its current situation, the Atari agent is going to fall in the snake pit and die…

  • …And in an episodic context, that’s awesome! We want the delusional agent to fall in a snake pit and die! It will re-spawn with model weights that correspond to better epistemics.
  • …But in an online context, that’s terrible! When the animal dies, it’s game over from the perspective of within-lifetime learning.

If you’ve seen end-to-end RL, in a non-episodic, online-learning context, with random initialization of weights at birth, in a partially unknown / random environment that can sometimes kill you, and which doesn’t offer a dense reliable reward signal for free, then I would be interested to know how that works.

Where are the genetic mutants who prefer dry landscapes, or upside down landscapes, or only water landscapes?

I don’t think it’s unusual for two people to look at a view and one person says it’s lovely and the other person says it isn’t. Some people think dogs are cute, some people don’t. Some people go nuts without a window view, others don’t care. Etc. I don’t think anyone has been doing systematic measurements to suss out population-level variation, and it would be tricky anyway because life experience also matters and is not randomly distributed.

So for all I know, there are genetic mutants who prefer seeing a brick wall out their window over seeing the ocean out their window. They’re probably living a normal life. They’d have unusually little problem finding a reasonably-priced apartment that they like in a big city. :-P Maybe their friends make fun of them. But nobody is writing scientific articles about them.

Why is it that the popular landscapes today on art websites (now generated by AI of course) are mostly sci-fi fantasy and have little relationship to the savanna crap described in this article?

Probably a similar reason as I prefer ice cream to mammoth meat, right? The superior colliculus is not a high-quality image processing system, like the visual cortex is. If I understand the literature correctly, people tried to figure out how the innate human face detector in the superior colliculus worked, and it seemed to be triggering on “three dark-ish blobs in a roughly triangle pattern”. By the same token, habitat aesthetics is presumably calculating some relatively simple heuristic properties of the field-of-view, not calculating “whether it’s a savannah”. In the much bigger universe of all possible images, there are probably lots of things that do as well or better than the savannah on those same rough heuristics.

Also, we can like or dislike the looks of things for other reasons unrelated to innate aesthetic sense, e.g. maybe the aesthetics of this painting are subconsciously reminding me of some TV show I really liked in my teens, or whatever.

So that’s two reasons that I don’t particularly expect people to buy paintings of savannahs. :)

The prime lesson from recent neuroscience contra ev psych is that the brain can be understood from the consequences of simple universal intra-lifetime optimization processes and objectives rather than ev psych style inter-lifetime optimization.

I don’t know why you think the objective is or has to be simple. Any time an animal is behaving in a way that brings no immediate obvious-to-it benefits, especially early in life and/or in solitary animals, that behavior must be either directly innate or built into the objective, right? IIUC, birds build nests without ever seeing one, therefore nest-building must be somehow in the objective. IIUC, lots of solitary prey animals can flee the smell of their predators without ever having a near-miss encounter with that same type of predator, therefore fleeing-that-particular-smell must be somehow in the objective. Animals eat the right kinds of food, move in the right kinds of ways, live in the right kinds of places, freeze and flee and fight when appropriate, etc., without trial-and-error learning, I think, and therefore all those things are somehow in the objective.

I think fully describing the objective of a human would probably take like thousands of lines of pseudocode. So it’s not “complicated” in the sense that a 100-trillion-parameter-trained-model is “complicated”, but still, there’s lots of stuff going on there.

AFAICT, what I’m saying is a very good fit to the brain. I’m an advocate of cortical uniformity as much as anyone, but the hypothalamus and brainstem are totally different from the cortex in that sense. They consist of hundreds of little bits and bobs doing quite different and specific things. (See my recent post about the hypothalamus [LW · GW].) I claim that almost all the complexity in the hypothalamus and brainstem is feeding into the objective function, directly or indirectly.

Back to aesthetics-of-habitat. If an animal is in the wrong microenvironment—e.g. if an animal with camouflage is not standing in front of the correct background—it can easily be fatal, including without any warning. So there’s a strong evolutionary pressure here. Calculating some heuristics on the field-of-view and feeding that into the objective seems like it would mitigate that problem, in a way that seems biologically plausible and indeed biologically straightforward to me.

Specifically, the superior colliculus is in the brainstem, it seems to have the right design to do that kind of thing, and it projects to the VTA/SNc (among other places) which sends out the loss function / rewards.

So I wind up feeling very strongly that aesthetics-of-habitat is a real thing.

Replies from: jacob_cannell
comment by jacob_cannell · 2022-10-30T06:06:01.309Z · LW(p) · GW(p)

The rules are different in episodic RL versus online / within-lifetime RL, right?

I may not understand what you mean by 'episodic' - because usually that seems to imply the setup where you have a single reward at the end of an episode, but that isn't how most game RL works, as the rewards come anytime you get points, which depends on the game.

If the Atari agent has latent activations that correspond to delusional wishful-thinking beliefs about its current situation, the Atari agent is going to fall in the snake pit and die…

But yes I agree that the Atari agent in particular is technically doing both inter and intra lifetime learning with SGD, unlike humans/animals. Extrinsic reward RL is inefficient for several reasons - the reward sparsity issue, but also just low-D reward signal, and the feedback indirection problem you mentioned earlier.

I think fully describing the objective of a human would probably take like thousands of lines of pseudocode

That sounds reasonable but is extremely low complexity compared to evolved modularity.

Back to aesthetics-of-habitat. If an animal is in the wrong microenvironment—e.g. if an animal with camouflage is not standing in front of the correct background—it can easily be fatal, including without any warning.

Sure, but that clearly can't be an explanation for general human information value (aesthetics), which should also explain:

  • the pleasure of music
  • the pleasure of art, etc etc
  • the pleasure of beautiful scenery etc etc
  • exactly which types of audio stimuli infants will attend to
  • why we want to know the answer to trivia questions that we almost know more than those completely unfamiliar or completely familiar
  • why we'll risk electric shocks for magic tricks and trivial knowledge

Sensory information has intrinsic value in terms of what we can learn from it - the intrinsic bayesian value of information. Curiosity and aesthetics are both manifestations of this same unifying principle which explains all the various manifestations listed above and more. There is extensive DL literature on using info-value intrinsic motivation: we know it works and exactly how and why. There is also now growing neurosci lit establishing that the brain uses variations of the same general bayesian-optimal mechanism to estimate value of information, this seems to involve perhaps anterior cingulate cortex, anterior insula, and striatal reward circuits[1], and the standard dopaminergic pathways more generally[2] [3].

So there really is little remaining room/need for a specific 'landscape aesthetic', unless it's just the region specific version of the same theme.

Animals (and humans to some degree) do have known innate triggered terrain aversions (fear of heights, or fear of open spaces, etc) and thus terrain affections are possible, but I'm not aware of specific evidence for them in humans.


  1. The psychology and neuroscience of curiosity ↩︎

  2. Systems neuroscience of curiosity ↩︎

  3. Shared striatal activity in decisions to satisfy curiosity and hunger at the risk of electric shocks ↩︎

Replies from: steve2152
comment by Steven Byrnes (steve2152) · 2022-10-30T19:46:44.593Z · LW(p) · GW(p)

That sounds reasonable but is extremely low complexity compared to evolved modularity.

Yes! You don’t have to convince me to be opposed to evolved modularity in the cortex! (i.e. the school of thought that the genome specifies the logic behind intuitive botany, and separately the logic behind intuitive physics, and separately the logic behind grammar, etc., as advocated e.g. by Steven Pinker & Gary Marcus). I myself argue against that school of thought all the time!!

that clearly can't be an explanation for general human information value (aesthetics)

It doesn’t explain everything. But it does explain some important things.

For example, most people prefer to see a river view out their windows than seeing a brick wall 5 meters away out their windows. Is this a value-of-information thing? No. If both scenes have been the same for years (e.g. no animals or boats on the river), neither view carries any useful information, but people will still tend to prefer the river view. If the brick wall has more information, maybe because there are bugs crawling on it, or a window into your neighbor’s apartment with the blinds always down (so they can’t see you but you can see silhouettes etc.), people still tend to prefer the river view!

(I myself don’t mind the brick-wall-view, but I still slightly prefer the river view and am clearly in the minority on this anyway—just look at apartment rental prices etc.)

I’m not sure how you would explain that fact. In my opinion, it’s a direct consequence of superior colliculus heuristic calculations that are ultimately derived from evolutionary pressure to be in the right micro-habitat.

I think the superior colliculus calculates various heuristic / statistical properties of the visual inputs, including ones related to micro-habitat selection, probably ones related to mate selection, almost definitely heuristics for recognizing snakes and spiders (I think by the way they move [slither / scuttle]), almost definitely heuristics for whether you’re looking at a human face (and perhaps eye contact in particular), and maybe some other things I’m not thinking of.

I think these various heuristics are constantly outputting scores on a scale of 0 to 100%, and those scores serve as inputs to a reward function.

But that same reward function also includes lots of other things!! For example, being in pain is bad. If every time I look at orange things I get electrocuted, throughout my whole life, I’m going to wind up with an “aesthetic” distaste for the color orange. But the superior colliculus heuristics have nothing to do with that preference! Instead of superior colliculus → brainstem → negative valence, the main pathway would instead probably be visual cortex → amygdala → brainstem → negative valence, IMO.

Some kind of “curiosity drive” is part of the reward function too, I firmly believe.
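To make that picture concrete, here is a minimal sketch of a reward function that takes innate heuristic scores as just some of its inputs. Everything in it is an assumption for illustration: the function names, the particular set of heuristics, and especially the weights are made up, not measured or taken from any literature.

```python
def clamp01(x: float) -> float:
    """Clamp a score to the 0-to-1 range (i.e. 0 to 100%)."""
    return max(0.0, min(1.0, x))


def reward(habitat: float, face: float, snake_or_spider: float,
           pain: float, curiosity: float) -> float:
    """Toy scalar reward built from several independent contributions.

    All inputs are scores in [0, 1]. The weights are arbitrary
    placeholders chosen only to give each term a plausible sign.
    """
    innate_visual = (
        0.5 * clamp01(habitat)            # micro-habitat heuristic
        + 0.3 * clamp01(face)             # face-like pattern heuristic
        - 1.0 * clamp01(snake_or_spider)  # slither / scuttle heuristic
    )
    # Terms that have nothing to do with the visual heuristics.
    other_terms = -2.0 * clamp01(pain) + 0.5 * clamp01(curiosity)
    return innate_visual + other_terms
```

The point of the sketch is only structural: a good habitat score raises reward, but pain or curiosity can move the total independently of anything the visual heuristics say.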

exactly which types of audio stimuli infants will attend to

Funny that you brought that up. I was going to bring it up! I think there’s a lot of information in bird songs, and there’s a lot of information in human speech sounds, and I think it’s not inherently obvious in advance which of those is going to be more useful to a human child, and yet I’d bet anything that human infants are more attentive to human speech sounds than bird songs (given equal exposure), and that the brainstem auditory processing (inferior colliculus) is responsible for that difference. (I haven’t specifically done a deep-dive lit review. If it turns out to be crux-y, maybe I will.)

I think I’m generally much more skeptical than you that an infant animal can figure out value of information sufficiently quickly and reliably for this to be a complete explanation of what they attend to. From our worldly adult perspective, it’s obvious that human speech is carrying much more useful information for a human infant than bird songs are, and that for a baby bird it’s the other way around. But before the baby human/bird has made headway in learning / parsing those auditory patterns, or understanding the world more generally, I don’t think it yet has any way to figure out which of those is better to invest in.

By the same token, I think I’m much more skeptical than you that an infant animal can figure out what is “empowering” and what isn’t, in a way that completely explains all the complex species-specific infant behavior. From our perspective, it’s obvious that a camouflaged animal is “empowered” by standing in front of the appropriate background when there’s a predator nearby, because it’s less likely to get eaten, and getting eaten is very disempowering. But how is the animal supposed to figure that out, except through probably-fatal trial-and-error?

Animals (and humans to some degree) do have known innate triggered terrain aversions (fear of heights, or fear of open spaces, etc) and thus terrain affections are possible, but I'm not aware of specific evidence for them in humans.

So you think that fear-of-heights is innate in non-human animals, but not innate in humans? That strikes me as weird. Looking over a precipice invokes a weirdly specific kind of tingling (in my experience) that doesn’t happen in other equally-frightening contexts. Fear of heights is widespread in the human population way out of proportion to its danger, in comparison with other things like fear-of-knives and fear-of-driving. Fear-of-heights seems obviously evolutionarily useful for humans. If it could evolve in some animals, why not humans?

Hmm, I feel like we’re at least partly having a stupid argument where you’re trying to convince me that the reward function is less than 100% superior colliculus hardcoded heuristics, and meanwhile I’m trying to convince you that the reward function is more than 0% superior colliculus hardcoded heuristics. Maybe we can just agree that it’s more than 0% and less than 100%! :)

general human information value (aesthetics)

I think it’s important that we reserve the word “aesthetics” for something like “finding some things more pleasant to look at than other things”. Then you can propose a theory that a complete explanation for aesthetics is information value. And we can discuss that theory.

Such a discussion is hard if we define “aesthetics” to be information value by definition.

Replies from: jacob_cannell
comment by jacob_cannell · 2022-10-30T21:39:20.498Z · LW(p) · GW(p)

I think it’s important that we reserve the word “aesthetics” for something like “finding some things more pleasant to look at than other things”.

Sure - let's call it information preferences.

Let's start - as we always should - from core first principles: what is the optimal information preference - ie that which would maximize total future discounted inclusive fitness?

Some information has obvious instrumental planning value - like the natural language instructions for how to start a fire for example. Most information has less obvious immediate instrumental value compared to that extreme, but all information has some potential baseline future utility in how it improves the predictive capacity of the world model.

As you discussed earlier the brain has a 'world model' loosely consisting of much of sensory cortex, and there is overwhelming evidence that said world model components are trained via UL/SSL style prediction. There's also strong foundational reasons why this is in fact bayes optimal (solomonoff induction, bayesian inference, etc) - so it's hardly surprising we see that in the brain.

But that isn't nearly enough.

The brain's model is trained only from a highly localized egocentric observation stream, and the brain thus must actively decide its own training dataset. Most information is near useless; the high utility information is extremely sparse and rare. So it's crucially important to estimate the value of information - the core of which is compression progress or improvement in the world model's predictive capacity. It's also not all that difficult to estimate - you can estimate it from learning updates (when using optimal variance adjusted learning rates). A large update then indicates compression progress (because there was a prediction error followed by an update to reduce prediction error, and the system computing these updates is smart enough to compute the minimal correct sized updates), and is always highly specific to the knowledge already encoded in the model.
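A toy sketch of "estimate info value from the learning update", assuming a one-parameter predictor and squared error as the loss. The function name and the fixed `learning_rate` argument are hypothetical; a real system would set the rate from uncertainty rather than pass it in, and this toy cannot tell learnable structure from unlearnable noise.

```python
def compression_progress(weight: float, observation: float,
                         learning_rate: float) -> tuple[float, float]:
    """Update a one-parameter predictor and report how much it improved.

    Returns (new_weight, progress), where progress is the drop in
    squared prediction error on this observation caused by the update.
    """
    error_before = (observation - weight) ** 2
    # Move the prediction part of the way toward the observation.
    new_weight = weight + learning_rate * (observation - weight)
    error_after = (observation - new_weight) ** 2
    return new_weight, error_before - error_after
```

An already-predicted observation yields zero progress, while a surprising one that the model can move toward yields a large value, which is the sense in which a large update signals high information value relative to what the model already encodes.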

This is also 'curiosity' but it is far more general and powerful than the typical meaning of that term, and it is fully general enough and powerful enough to explain most all of our information preferences - I've already given many examples.

For example, most people prefer to see a river view out their windows than seeing a brick wall 5 meters away out their windows. Is this a value-of-information thing?

Yes, the view of the river is vastly higher value-of-information - it's constantly changing with time of day lighting, weather, etc. However it's also an empowerment thing, as the river view suggests you can actually go out and explore the landscape. So it's confusing multiple intrinsic motivation signals.

A better comparison is a simple static painting of a river landscape vs static painting of a brick wall. The river landscape view still has higher info value - it has higher total compressible entropy for typical humans who have experience with vaguely similar visual patterns.

The brick wall with insects could be higher info value - and indeed some children find a wall with moving insects highly fascinating (as I did) - it's called an ant farm and it was definitely more interesting than landscape paintings (or most any paintings really).

I think I’m generally much more skeptical than you that an infant animal can figure out value of information sufficiently quickly and reliably for this to be a complete explanation of what they attend to.

They can and I already linked some articles demonstrating exactly this and some of the theory of how it works - although obviously it's not the only factor for attention, but it is primary and by far the single most important factor.

Optimal bayesian learning rates are simple consequences of uncertainty/variance - namely the tradeoff between uncertainty/precision in the current network weights vs that of some new update. Variance-adjusted learning rates are also critical for modern DL methods, which mostly estimate them using slow rolling gradient statistics (ie Adam etc). That isn't as plausible for the brain, but the brain does have a system which seems to encode various forms of global uncertainty/variance across different timescales (factored into meta-level predictable uncertainty and unpredicted uncertainty - which is exactly what you want as you need to ignore predicted uncertainty aka unlearnable noise) which seems to be then distributed by the serotonergic raphe nuclei globally to essentially all of the learning regions and is a core learning rate type input.
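The variance tradeoff can be illustrated with the textbook Gaussian case, where the posterior mean is the prior mean nudged toward the observation by a rate equal to prior variance over total variance. This is a sketch of that standard conjugate update only, not a claim about how the brain or Adam computes it; the function and argument names are made up for the example.

```python
def gaussian_update(mean: float, var: float,
                    obs: float, obs_var: float) -> tuple[float, float, float]:
    """One Bayesian update of a Gaussian belief from a noisy observation.

    Returns (new_mean, new_var, learning_rate). The learning rate is
    high when the current belief is uncertain relative to the data,
    and low when the belief is already precise.
    """
    learning_rate = var / (var + obs_var)
    new_mean = mean + learning_rate * (obs - mean)
    new_var = (1.0 - learning_rate) * var
    return new_mean, new_var, learning_rate
```

Applying it repeatedly shrinks `new_var`, so the effective learning rate falls as evidence accumulates, without any hand-tuned schedule.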

The local hedonic reward value of information signals are then - I'm guessing - not too difficult to compute from local downstream serotonergic statistics. These somehow ultimately funnel into the dopaminergic reward centers (which are more typically known for handling reward prediction errors but really it's more like prediction errors in general). And it has to be localized because each brain module is the ideal place to compute value-of-information relative to its current knowledge. This explains why serotonin is indirectly rewarding (and serotonin analogs such as LSD/psilocybin generally increase both plasticity causing 'neural drift' and eureka hedonic info-value reward), and much else.

From our worldly adult perspective, it’s obvious that human speech is carrying much more useful information for a human infant than bird songs are, and that for a baby bird it’s the other way around.

Well once again the compression progress or bayesian surprise value of information fully explains exactly why human babies prefer human language and bird babies prefer bird songs - the value of information depends on predictability which depends on the current world model knowledge - it needs to balance novelty with familiarity in the right way.

Adult mono-linguistic humans don't find other languages super interesting, because the relevant brain regions have transitioned to a low variance/uncertainty and thus low learning rate state, which scales down the value-of-info reward proportionally. It's all relative to what you currently know which depends on age and total experience history.

A true feral child wouldn't find human language all that more interesting than bird song - but probably still somewhat more interesting because human language just has more compressible complexity than bird song - aliens would find it more interesting.

Hmm, I feel like we’re at least partly having a stupid argument where you’re trying to convince me that the reward function is less than 100% superior colliculus hardcoded heuristics,

In terms of range of phenomena explained or bits of entropy it's pretty obviously close to 99% bayesian value of information, and pretty much needs to be. The superior colliculus's hardcoded heuristics simply don't have the capacity to encode the intrinsic value of language, math, science fiction stories, trippy fractal videos, etc, etc. It's essentially the same argument for universal learning from scratch - and indeed it's just a higher order effect or consequence of learning from scratch.

For the remaining 1% you still need some low complexity priors that steer or bias the value-of-information heuristic. For example the complexity of food taste, and our preference for just the right amount of novelty, is mostly explained by the same standard info-value, but it clearly also has a low complexity prior bias for +sweetness/carbs, +fats, +sodium, -bitter, etc. (Combined with some amount of additional supervised learning feedback from digestion to learn what's toxic or not)

But as far as the reward centers are concerned, there is no difference between food taste and other forms of information tastes - it's all just info taste (with weak low complexity priors/bias) all the way down.

Replies from: steve2152
comment by Steven Byrnes (steve2152) · 2022-10-31T14:56:20.997Z · LW(p) · GW(p)

Thanks for all that!

Hmm, I think maybe we’re talking past each other a bunch here.

Suppose that a young camouflaged prey animal “wants” to stand in front of the appropriate background (that matches its camouflage), even without trial-and-error experience of getting attacked or not attacked by predators in front of various different backgrounds, and suppose that the proximal cause of this behavior is that the brainstem superior colliculus is calculating heuristics on visual input and then contributing to the overall reward function.

(I’m still not sure how much you’re inclined to believe that this is a thing that happens in real animals, but let’s assume it for the sake of argument. You did say that “terrain affections are possible”, at least.)

I feel like you would describe this fact as a victory for empowerment theory, because getting eaten by a predator is disempowering. Therefore we can describe this superior colliculus heuristic calculation thing as “part of an approximation of empowerment”.

Whereas I would describe that same fact as a failure of empowerment theory, because the thing I care about (for various reasons) is what the brain is actually doing, not why it evolved that way.

Do you agree so far?

If so, you can ask, why is that the thing I care about? I.e., if the heuristic evolved to advance empowerment, why don’t I think that matters?

  • First, if we’re trying to help humans, and the human brainstem heuristics are coming apart from empowerment (as all approximations do, cf. Goodhart’s law), I think we should pay much more attention to the heuristics than the empowerment. For example, I happen to care about animal welfare, and that caring tendency presumably evolved because it was “empowering” on net in the societies of my ancestors. But maybe in the modern world, and especially if I had an omnipotent AGI assistant, maybe I would be increasing my own personal empowerment if I didn’t care about animal welfare. But I wouldn’t want that. And that’s OK. We shouldn’t be using a normative theory where I’m somehow wrong / confused to choose to care about animal welfare at the expense of my own empowerment.
  • Second, we’re going to make an AGI, and presumably in that AGI we (like evolution) will use heuristic approximations of empowerment instead of empowerment itself. And those heuristics will come apart from empowerment. And then the AGI will actually follow the heuristics, not empowerment. At some point, it will be too late to turn off the AGI, and the heuristics will be in charge of the future of terrestrial life. So we should really care directly about what behaviors the heuristics are incentivizing, and we should only care indirectly about what behaviors empowerment would be incentivizing.

A better comparison is a simple static painting of a river landscape vs static painting of a brick wall. 

Sure. But I claim that lots of people like paintings of river landscapes, even when the painting has been hanging for months or even years and thus offers zero novelty. When I moved from one house to another, my wife hung up many of the same paintings that were on prominent display in the old house, despite us having a pile of paintings that we’ve never hung up at all and thus which would have had dramatically more information value.

I claim that this fact is a victory for superior colliculus micro-habitat-related heuristics.

…But I still think that the strongest case is the theoretical one:

  • By default, being in the wrong micro-habitat gives a negative reward which can be both very sparse and often irreversibly fatal (e.g. a higher chance of getting eaten by a predator, starving to death, freezing to death, burning to death, drowning, getting stuck in the mud, etc. etc., depending on the species)
  • Therefore, it’s very difficult to learn which micro-habitat to occupy purely by trial-and-error without the help of any micro-habitat-specific reward-shaping.
  • Such reward-shaping is straightforward to implement by doing heuristic calculations on sensory inputs.
  • Animal brains (specifically brainstem & hypothalamus in the case of mammals) seem to be perfectly set up with the corresponding machinery to do this—visual heuristics within the superior colliculus, auditory heuristics within the inferior colliculus, taste heuristics within the medulla, smell heuristics within the hypothalamus, etc.

So even without any behavioral evidence whatsoever, I would still be shocked if this wasn’t a thing in every animal from flies to humans.

Replies from: jacob_cannell
comment by jacob_cannell · 2022-10-31T17:55:52.813Z · LW(p) · GW(p)

I feel like you would describe this fact as a victory for empowerment theory, because getting eaten by a predator is disempowering.

I feel like empowerment is mostly a digression here. There is the behavioral empowerment hypothesis which just says something like "in the absence of specific goals, animals act as if they are pursuing empowerment", which I think is largely true - if we use a broad-empowerment definition like optionality. But there is a difference between "acting as if seeking empowerment", and specifically seeking empowerment due to empowerment reward as computed by some more direct approximation.

So I agree with you that assuming the superior colliculus is directly computing an innate terrain camo affection, that is at best just "acting as if" behavioral empowerment, and doesn't really explain much more than "acting as if maximizing genetic fitness".

In this thread I have mostly been discussing value-of-information, which seems to be one of the primary reward signals for the human brain. Assuming that theory - the serotonin based info-value indirect reward pathway stuff - is all true, then I think we both agree that modeling that directly is naturally important for accurately modelling human values. Info value is a sort of empowerment related signal, but it's not optionality empowerment. I do think it's likely the brain also has estimated optionality reward somewhere, but I haven't researched that as much and not sure where it is. But I think we also would agree that if there is direct optionality reward then modelling that is also important for modelling human values.

There is another use of empowerment which is as a general bound on an unknown utility function. If you break utility down into short and long term components, the long term component converges to empowerment (for many/most agentic utility functions). That really is independent of how the brain may or may not be using empowerment signals. The hope is AGI optimizing for our long term empowerment would seek to acquire power and then hand it over to us, which is pretty much what we want. The main risk is that the short term component matters, but for that we can use some learned approximation of human values. That would eventually diverge in the long term, but that's ok because the long term component converges to empowerment. Another issue is identifying the agents/agency to empower, which - as you point out - needs to consider altruism. Empowerment of a single human isn't even the correct bound if we define the agent as our physical brain, as due to altruism our utility function is diffusely wider than a pure selfish rational assumption. The easiest way to handle that is by empowering humanity or agency more broadly. The harder more correct way is using some more complex multi-agent theory of mind, so that empowering a single human is empowering all the simulacra sub-agents they care about (of which the self is just one for an altruist).

But I claim that lots of people like paintings of river landscapes, even when the painting has been hanging for months or even years and thus offers zero novelty. When I moved from one house to another, my wife hung up many of the same paintings that were on prominent display in the old house,

Paintings have high novelty only on the first few viewings; after that, any novelty comes only from forgetting - but they are still more interesting than a blank wall. Regardless, much of the reason people hang art is for the benefit of others.

I claim that this fact is a victory for superior colliculus micro-habitat-related heuristics.

More people like abstract art, surreal art, portraits, etc - landscapes are but a small fraction. So is that defeat for superior colliculus micro-habitat-related heuristics?

My quick read of the human/primate superior colliculus indicates it mostly computes subconscious saccade/gaze targets, mostly from V1 feedback but also from multi-modal signals from numerous regions, and is then inhibited/overridden when the cortex directs conscious eye movements from frontal eye fields.

Replies from: steve2152
comment by Steven Byrnes (steve2152) · 2022-11-01T19:41:05.657Z · LW(p) · GW(p)

Thanks for all that—very helpful! I’ll just respond to the part at the end.

My quick read of the human/primate superior colliculus indicates it mostly computes subconscious saccade/gaze targets, mostly from V1 feedback but also from multi-modal signals from numerous regions, and is then inhibited/overridden when the cortex directs conscious eye movements from frontal eye fields.

I’m with you except for the “mostly”. Saccades (and “orienting reactions” more generally) are one of the things that the superior colliculus does, and it happens to be a well-studied one for various reasons (namely, it’s technically easy and medically relevant, I think). But I also think that the superior colliculus does other things too, and the literature on those things is underdeveloped (to put it politely). So randomly-selected SC papers “mostly” talk about saccades but we shouldn’t conclude that SC itself “mostly” exists for the purpose of saccades.

For example, I think there’s abundant evidence that pretty much any animal with eyes has a set of a number (dozens?) of innate reactions to certain types of visual stimuli, and I think there’s strong reason to locate the visual-detection part of those reactions in the superior colliculus (a.k.a. optic tectum in non-mammals):

  • Detecting “an unexpected visual stimulus that’s worthy of an orienting reaction” is a special case of this category, and SC definitely does that.
  • There’s pretty good literature on face-detection in infant humans, and the author I like (Mark Johnson) suggests that it involves a low-resolution innate face detector (basically three dark blobs forming an inverted triangle) in the superior colliculus.
  • My impression is that Mark Johnson was inspired by the filial-imprinting-in-chicks literature, where it’s known that the optic tectum is detecting salient visual things that might be worthy of imprinting on. (I haven’t really dived into this yet, could be wrong.)
  • This paper argues that the mouse SC can detect expanding dark blobs in the upper field of view—which serve as a heuristic approximation that maybe there’s an incoming bird-of-prey—and trigger a scamper-away brainstem reaction. Actually this paper is even better on that topic.
  • I think there’s good behavioral evidence (see e.g. refs here) that monkeys are scared[1] of slithering snakes for direct innate reasons, independent of any within-lifetime evidence that slithering snakes are unusually important. If so, we need to explain that, presumably via some innate heuristics somewhere in the brain for what a slithering snake looks like. Which part of the brain? As usual, papers on this topic are a mess, but all the evidence I’ve seen is consistent with SC, and no other possibility makes any sense to me.
  • (More generally, the process-of-elimination argument carries a lot of weight for me. The idea that visual cortex could detect snakes / birds / whatever, all on its own, without any other source of ground truth, would fly in the face of everything else I think I know about the cortex [LW · GW]. And if it’s not the cortex, the SC is the only other possibility.)
  • This paper has some issues but it does present I think decent evidence that the structure and activity of SC is compatible with the types of calculations-on-visual-input that I’m talking about.

More people like abstract art, surreal art, portraits, etc - landscapes are but a small fraction. So is that defeat for superior colliculus micro-habitat-related heuristics?

Again, eating ice cream is not actually health-promoting. But it effectively triggers various taste-related heuristics that originally evolved to make us have health-promoting eating behavior.

By the same token, houses full of “pretty” paintings are not in fact better places for hunter-gatherers to live. But they (probably) do a slightly better job of triggering various vision-related heuristics that originally evolved to reward-shape early humans to find good and safe places to hang out in the African Savannah.

This can be true even if the “pretty” paintings are abstract and have no superficial resemblance to the African savannah. Ice cream likewise has very little superficial resemblance to the food that our African ancestors ate. And “Three dark blobs forming an inverted triangle” has very little superficial resemblance to a human face, yet that’s supposedly what the human SC’s innate face detector is actually looking for.

I think that there are many contributions to the decision of what to look at (including info-value) and what paintings to hang in your house (including impressing your friends). I only claim that SC habitat-derived heuristics are ONE of the contributions.

Likewise, I wasn’t sure before, but my current impression is that you do NOT claim that info-value is a grand unified theory that completely 100% explains what we are looking at at any given moment.

If so, it might be difficult to make progress by chatting about what paintings people hang or whatever—neither of us has a theory that makes sharp predictions about everything. ¯\_(ツ)_/¯

  1. ^

    Technically, it seems more like monkeys are innately “aroused” (in the psych jargon sense, not the sexual sense) by the sight of slithering snakes, but this arousal has a strong tendency to transmute into fear during within-lifetime learning. See here.

Replies from: jacob_cannell
comment by jacob_cannell · 2022-11-01T21:14:20.978Z · LW(p) · GW(p)

The evidence seems pretty clear that the SC controls unconscious saccades/gazes. Given that background it makes perfect sense the SC is also a good location for simple crude innate detectors which bias saccades towards important targets: especially for infants, because human infants are barely functionally conscious at birth and so in the beginning the SC may have complete control. But gradually the higher 'conscious loops' involving BG, PFC and various other modules begin to take more control through the FEF (although not always of course).

That all seems compatible with your evidence - I also remember reading that there are central pattern generators which actually start training the visual cortex in the womb on simple face like patterns. But I believe in a general rule of three for the brain: whenever you find evidence that the brain is doing something two different ways that both seem functionally correct, it's probably using both of those methods and some third one you haven't thought of yet. And SC having innate circuits to bias saccades seems likely.

But that same evidence doesn't show the SC is much involved in higher level conscious decisions about whether to stare at a painting or listen to a song for a few minutes vs eating ice cream. That is all reward-shaped high level planning involving the various known dopaminergic decision pathways combined with serotonergic (and other) pathways that feed into general info-value.

Likewise, I wasn’t sure before, but my current impression is that you do NOT claim that info-value is a grand unified theory that completely explains what we are looking at at any given moment.

I do claim it is the likely grand unified theory that most completely explains conscious decisions about info consumption choices in adults - and the evidence from which I sampled earlier is fairly extensive IMHO; whereas innate SC circuits explain infant gazes (in humans, SC probably has a larger role in older smaller brained vertebrates). Moreover, if we generalize from SC to other similar subcortical structures, I do agree those together mostly control the infant for the first few years - as the higher level loops which conscious thinking depend on all require significant training.

Also - as I mentioned earlier I agree that fear of heights is probably innate, simple food taste biases are innate, obviously sexual attraction has some innate bootstrapping, etc so I'm open to the idea there is some landscape biasing in theory, but clearly the SC is unlikely to be involved in food taste shaping, and I don't think you have shown much convincing evidence it is involved in visual taste shaping. But clearly there must be some innate visual shaping for at least sexual attraction - so evidence that SC drives that would also be good evidence it drives some landscape biasing, for example. But it seems like those reward shapers would need to be looking primarily at higher visual cortex rather than V1. So evidence that the SC's inputs shift from V1 in infants up to higher visual cortex in adulthood would also be convincing, as that seems somewhat necessary for it to be involved in reward prediction of higher level learned visual patterns.

I'm also curious about this part:

And if it’s not the cortex, the SC is the only other possibility.

And generally if we replace the reward shaping/bias of "savanna landscape" with "sexually attractive humanoid" then I'm more on board with the concept of something that is highly likely an innate circuit somewhere. (I don't even buy the evolutionary argument for a savanna landscape bias - humans spread out to many ecological niches including coastal zones which are nothing like the savanna.)

Replies from: steve2152
comment by Steven Byrnes (steve2152) · 2022-11-02T18:20:27.253Z · LW(p) · GW(p)

Here are some of my complaints about info-value as a grand unified theory by itself (i.e., in the absence of innate biases towards certain types of information over other types):

  • There are endless fractal-like depths of complexity in rocks, and there are endless fractal-like depths of complexity in ants, and there are endless fractal-like depths of complexity in birdsong, and there are endless fractal-like depths of complexity in the shape of trees, etc. So “follow the gradient where you’re learning new things” by itself seems wildly under-constrained. You cite video-game ML papers, but in this respect, video-games (especially 1980s video-games) are not analogous to the real world. You can easily saturate the novelty in Pac-Man, and then the only way to get more novelty is to “progress” in the game in roughly the way that the game-designers intended. Pac-Man does not have 10,000 elaborate branching fascinating “side-quests” that don’t advance your score, right? If it did, I claim those ML papers would not have worked. But the real world does have “side quests” like that. You can get a lifetime supply of novelty by just closing your eyes and thinking about patterns in prime numbers, etc. Yet children reliably learn relevant things like language and culture much more than irrelevant things like higher-order patterns in pebble shape and coloration. Therefore, I am skeptical of any curiosity / novelty drive completely divorced from “hardcoded” drives that induce disproportionate curiosity / interest in certain specific types of things (e.g. human speech sounds) over other things (e.g. the detailed coloration of pebbles).
    • (I think these hardcoded drives, like all hardcoded drives, are based on relatively simple heuristics, as opposed to being exquisitely aimed at specific complex concepts like “hunting”. I think “some simple auditory calculation that disproportionately triggers on human speech sounds” is a very plausible example.)
    • If your response is “There is an objective content-neutral metric in which human speech sounds are more interesting than the detailed coloration of pebbles”, then I’m skeptical, especially if the metric looks like a kind of “greedy algorithm” that does not rely on the benefit of hindsight. In other words, once you’ve invested years into learning to decode human speech sounds, then it’s clear that they are surprisingly information-rich. But before making that investment, I think that human speech sounds wouldn’t stand out compared to the coloration of pebbles or the shapes of trees or the behavior of ants or whatever. Or at least they wouldn’t stand out so much that it would explain human children’s attention to them.
  • We need to explain the fact that (I claim) different people seem very interested in different things, and these interests are heritable, e.g. interest-in-people versus interest-in-machines. Hardcoded drives in “what is interesting” would explain that, and I’m not sure what else would.
  • This is unlikely to convince you, but there is a thing called “specific language impairment” where (according to my understanding) certain otherwise-intelligent kids are unusually inattentive to language, and wind up learning language much slower than their peers (although they often catch up eventually). I’m familiar with this because I think one of my kids has a mild case. If he’s playing, and someone talks to him, he rarely orients to it, just as I rarely orient to bird sounds if I’m in the middle of an activity. Speech just doesn’t draw his attention much! And both his tendency to converse and ability to articulate clearly are way below age level. I claim that’s not a coincidence—learning follows attention. Anyway, a nice theory of this centers around an innate human-speech-sound detector being less active than usual. (Conversely, long story [LW · GW], but I think one aspect of autism is kinda the opposite of that—overwhelming hypersensitivity to certain stimuli often including speech sounds and eye contact, which then leads to avoidance behavior.)
  • There’s some evidence that a certain gene variant (ASPM) helps people learn tonal languages like Chinese, and the obvious-to-me mechanism is tweaking the innate human-speech-sound heuristics. That’s probably unlikely to convince you, because there are other possible mechanisms too, and ASPM is expressed all over the brain and you can’t ethically do experiments to figure out what part of the brain is mediating this.

whenever you find evidence that the brain is doing something two different ways that both seem functionally correct, it's probably using both of those methods and some third one you haven't thought of yet

I strongly disagree with the idea that SC and cortex are doing similar things. See discussion here [LW · GW]. I think the cortex + striatum is fundamentally incapable of having an innate snake detector, because the cortex + striatum is fundamentally implementing a learning algorithm. Given a ground-truth loss function for the presence / absence of snakes, the cortex + striatum can do an excellent job learning to detect snakes in particular. But without such a loss function, they can’t. (Well, they can detect snakes without a special loss function, but only as “just another learned latent variable”. This latent variable couldn’t get tied to any special innate reaction, in the absence of trial-and-error experience.)

Anyway, I claim that SC is playing the role of implementing the snake-heuristic calculations that underlie that loss function. (Among other things.)

But that same evidence doesn't show the SC is much involved in higher level conscious decisions about whether to stare at a painting or listen to a song for a few minutes vs eating ice cream. That is all reward-shaped high level planning involving the various known dopaminergic decision pathways combined with serotonergic (and other) pathways that feed into general info-value.

SC projects to VTA/SNc, which is related to whether we find things positive/negative valence, pleasant/unpleasant etc. It’s not the only contribution, but I claim it’s one contribution.

clearly the SC is unlikely to be involved in food taste shaping, and I don't think you have shown much convincing evidence it is involved in visual taste shaping.

I think the relevant unit is “brainstem and hypothalamus”, of which the SC is one part, the part that seems like it has the right inputs and multi-layer architecture to do things like calculate heuristics on the visual FOV. Food taste shaping is a different part of the brainstem, namely the gustatory nucleus of the medulla.

But it seems like those reward shapers would need to be looking primarily at higher visual cortex rather than V1.

I’m surprised that you wrote this. I thought you were on board with the idea that we should think of visual cortex as loosely (or even tightly) analogous to deep learning? Let’s train a 12-layer randomly-initialized ConvNet, and look at the vector of activations from layer 10, and decide on that basis whether you’re looking at a person, in the absence of any ground truth. It’s impossible, right? The ConvNet was randomly initialized, you can’t get any object-level information from the fact that neuron X in layer 10 has positive or negative activation, because it’s not a priori determined what role neuron X is going to wind up playing in the trained model.
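A minimal sketch of the "neuron X has no a priori role" point, under loud assumptions: it uses a toy fully connected layer instead of a ConvNet, looks only at initialization (no training), and every name in it (`random_layer`, `same_index_agreement`, the layer sizes) is hypothetical. It compares unit i in one randomly initialized layer against unit i in another on identical inputs; if index carried meaning across initializations, agreement would sit well above chance.

```python
import random

def random_layer(seed, n_in=16, n_hidden=32):
    """One randomly initialized hidden layer: n_hidden weight vectors of length n_in."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(n_in)] for _ in range(n_hidden)]

def is_active(weights, x):
    """A ReLU unit counts as 'active' when its pre-activation is positive."""
    return sum(w * xi for w, xi in zip(weights, x)) > 0.0

def same_index_agreement(seed_a, seed_b, n_inputs=400, n_in=16, n_hidden=32):
    """Fraction of (input, unit index) pairs where unit i is active in both
    layers or inactive in both, when both layers see identical inputs."""
    layer_a = random_layer(seed_a, n_in, n_hidden)
    layer_b = random_layer(seed_b, n_in, n_hidden)
    rng = random.Random(12345)  # fixed stream of toy inputs
    agree = 0
    for _ in range(n_inputs):
        x = [rng.gauss(0.0, 1.0) for _ in range(n_in)]
        for w_a, w_b in zip(layer_a, layer_b):
            agree += is_active(w_a, x) == is_active(w_b, x)
    return agree / (n_inputs * n_hidden)

if __name__ == "__main__":
    print("same initialization:     ", same_index_agreement(0, 0))
    print("different initialization:", same_index_agreement(0, 1))
```

With the same seed the two layers are identical, so agreement is exactly 1.0; with different seeds it should hover near chance, which is the sense in which a hardwired readout of "unit X" cannot know what that unit will mean.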

We need ground truth somehow, and my claim is that SC provides it. So my mainline expectation is that SC gets visual information in a way that bypasses the cortex altogether. This is at least partly true (retina→LGN→SC pathway). SC does get inputs from visual cortex, as it turns out, which had me confused for a while but I’m OK with it now. That’s a long story, but I still think the cortical input is unrelated to how SC detects human faces and snakes and whatnot.

Replies from: jacob_cannell
comment by jacob_cannell · 2022-11-02T23:21:30.996Z · LW(p) · GW(p)

There are endless fractal-like depths of complexity in rocks, and there are endless fractal-like depths of complexity in ants, and there are endless fractal-like depths of complexity in birdsong, and there are endless fractal-like depths of complexity in the shape of trees, etc. So “follow the gradient where you’re learning new things” by itself seems wildly under-constrained.

There are not 'endless fractal-like depths of complexity' in the retinal images of rocks or ants or trees, which is what is actually relevant here. For any model, a flat uniform color wall has near zero compressible complexity, as does a picture of noise (max entropy but it's not learnable). Real world images have learnable complexity which crucially varies based on both the image and the model's current knowledge. But it's never "endless": generally it's going to be on the order of, or less than, the image entropy the cortex gets from the retina, which is comparable to compression with modern codecs.
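A rough illustration of the compression framing, with a general-purpose compressor standing in (as an assumption) for "the model": `coded_size` and the three toy 64x64 "images" are hypothetical, and zlib is far cruder than anything cortex-like. The optional `prior` argument uses zlib's preset-dictionary feature to mimic content the model already knows.

```python
import random
import zlib

def coded_size(data: bytes, prior: bytes = b"") -> int:
    """Compressed size in bytes; `prior` is content the coder already 'knows'."""
    comp = zlib.compressobj(zdict=prior) if prior else zlib.compressobj()
    return len(comp.compress(data) + comp.flush())

SIDE = 64
rng = random.Random(0)

# Flat wall: one gray level everywhere.
flat = bytes([128]) * (SIDE * SIDE)
# Noise: maximum entropy, nothing to learn.
noise = bytes(rng.randrange(256) for _ in range(SIDE * SIDE))
other_noise = bytes(rng.randrange(256) for _ in range(SIDE * SIDE))
# Structured toy image: 8x8 blocks of constant gray level.
blocks = bytes(((x // 8) * 16 + (y // 8) * 3) % 256
               for y in range(SIDE) for x in range(SIDE))

if __name__ == "__main__":
    for name, img in [("flat", flat), ("blocks", blocks), ("noise", noise)]:
        print(f"{name:7s} raw={len(img)} coded={coded_size(img)}")
    print("blocks, already seen:     ", coded_size(blocks, prior=blocks))
    print("noise, other noise seen:  ", coded_size(noise, prior=other_noise))
```

The flat image codes to almost nothing and the noise barely shrinks, so neither leaves much to learn; the structured image sits in between, and its cost drops once the coder has "seen" it, while prior exposure to different noise does not help with new noise.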

You cite video-game ML papers,

Actually in this thread I cited neurosci papers: first that curiosity/info-value is a reward processed like hunger[1], and a review article from 2020[2] which is an update from a similar 2015 paper[3].

So “follow the gradient where you’re learning new things” by itself seems wildly under-constrained.

Sure - but curiosity/info-gain obviously isn't all of reward, so the various other components can also steer behavior paths towards fitness relevant directions, which can then indirectly bias the trajectory of the curiosity-driven learning as it's always relevant to the models' current knowledge and thus the experience trajectory.

Therefore, I am skeptical of any curiosity / novelty drive completely divorced from “hardcoded” drives that induce disproportionate curiosity / interest in certain specific types of things (e.g. human speech sounds) over other things (e.g. the detailed coloration of pebbles).

Recall that the infant is mostly driven by subcortical structures and innate patterns for the early years, and during all this time it is absolutely bombarded with human speech as the primary consistent complex audio signal. There may be some attentional bias towards human speech, but it may not be necessary, as there aren't many other audio streams that could come close to competing. Birdsong is both less interesting and far less pervasive/frequent for most children's audio experience. Also 'visually interesting pebbles' don't seem that different than other early children's toys: it seems children would find them interesting (although their shape is typically boring).

I strongly disagree with the idea that SC and cortex are doing similar things.

I didn't say they did - I said I'm aware of two proposals for an innate learning bias for faces: CPGs pretraining the viscortex in the womb, and innate attentional bias circuits in the SC. These are effectively doing a similar thing.

We need ground truth somehow, and my claim is that SC provides it. So my mainline expectation is that SC gets visual information in a way that bypasses the cortex altogether.

For attentional bias/shaping the SC likely can only support very simple pattern biases close to a linear readout. So a simple bias to attend to faces seems possible, but I was actually talking about sexual attraction when I said:

But clearly there must be some innate visual shaping for at least sexual attraction - so evidence that SC drives that would also be good evidence it drives some landscape biasing, for example. But it seems like those reward shapers would need to be looking primarily at higher visual cortex rather than V1.

For sexual attraction the patterns are just too complex, so they are represented in IT or similar higher visual cortex. Any innate circuit that references human body shape images and computes their sexual attraction must get that input from higher viscortex - which then leads to the whole symbol grounding problem - as you point out, and I naturally agree.

But regardless of the specific solution to the symbol grounding problem, a consequence of that solution is that the putative brain region computing attraction value of images of humanoid shapes would need to compute that primarily from higher viscortex/IT input.

I think your model may be something like the SC computes an attentional bias which encodes all the innate sexiness geometry and thus guides us to spend more time saccading at sexy images rather than others, and possibly also outputs some reward info for this.

But that could not work as stated, simply because the sexiness concept is very complex and requires a deepnet to compute (symmetry, fat content, various feature ratios, etc).

Also this must be true, because otherwise we wouldn't see the failures of sexual imprinting in birds that we do in fact observe.

So how can the genome best specify a complex concept innately using the least number of bits? Just indexing neurons in the learned cortex directly would be bit-minimal, but as you point out that isn't robust.

However the topographic organization of cortex can help, as it naturally clusters neurons semantically.

Another way to 'locate' specific learned neurons more robustly is through proxy matching, where you have dumb simple humanoid shape detectors and symmetry detectors etc encoding a simple sexiness visual concept - which could potentially be in the SC. But then during some critical window the firing patterns of those proxy circuits are used to locate the matching visual concept in visual cortex and connect to that. In other words, you can use the simple innate proxy circuit to indirectly locate a cluster of neurons in cortex, simply based on firing pattern correlation.

This allows the genome to link to a high-complexity concept by specifying a low complexity proxy match for that concept in its earlier low complexity larval stage.
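A minimal sketch of locating learned units by firing-pattern correlation, as described above. Everything here is a toy assumption: binary firing traces, a proxy that is wrong 20% of the time, a handful of "concept" units at indices unknown in advance, and hypothetical helper names (`pearson`, `noisy_copy`, `locate_concept_units`).

```python
import math
import random

def pearson(a, b):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

def noisy_copy(signal, flip_prob, rng):
    """Copy a 0/1 signal, flipping each entry with probability flip_prob."""
    return [s ^ 1 if rng.random() < flip_prob else s for s in signal]

def locate_concept_units(proxy, units, k):
    """Indices of the k units whose firing correlates best with the proxy."""
    scores = [(pearson(proxy, u), i) for i, u in enumerate(units)]
    return sorted(i for _, i in sorted(scores, reverse=True)[:k])

rng = random.Random(0)
T, N_UNITS, K = 2000, 50, 3
target = [rng.randrange(2) for _ in range(T)]    # is the concept present?
proxy = noisy_copy(target, 0.20, rng)            # crude innate detector
concept_units = sorted(rng.sample(range(N_UNITS), K))  # arbitrary indices
units = [noisy_copy(target, 0.05, rng) if i in concept_units
         else [rng.randrange(2) for _ in range(T)]
         for i in range(N_UNITS)]

located = locate_concept_units(proxy, units, K)

if __name__ == "__main__":
    print("true concept units:", concept_units)
    print("located via proxy: ", located)
```

The point of the toy is that the "genome" side only needs the proxy and a correlation rule; it never needs to know which indices the learned concept ends up occupying.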

Proxy matching implies that after critical period training whatever neurons represent innate sexiness must then shift to get their input from higher viscortex: IT rather than just V1, and certainly not LGN.

Another related possibility is that the SC is just used to create the initial bootstrapping signal, and then some other brain region actually establishes the connection to innate downstream dependencies of sexiness and learned sexiness - so separating out the sexiness proxy from the used sexiness concept.

Anyway my point was more that innate sexual attraction must be encoded somewhere, and any evidence that the SC is crucially involved with that is evidence it is crucially involved with other innate visual bias/shaping.


  1. Shared striatal activity in decisions to satisfy curiosity and hunger at the risk of electric shocks ↩︎

  2. Systems neuroscience of curiosity ↩︎

  3. The psychology and neuroscience of curiosity ↩︎

Replies from: steve2152
comment by Steven Byrnes (steve2152) · 2022-11-03T18:13:07.502Z · LW(p) · GW(p)

Thanks again for taking the time to chat, I am finding this super helpful in understanding where you’re coming from.

Your description of proxy matching is a close match to what I’m thinking. (Sorry if I’ve been describing it poorly!)

I think I got confused because I’m mapping it to neuroanatomy differently than you. I think SC is the “proxy” part, but the “matching” part is somewhere else, not SC. For example, it might look something like this:

  1. SC calculates a proxy, based on LGN→SC inputs. The output of this calculation is a signal, which I’ll call “Fits proxy?”
  2. The “Fits proxy?” signal then gets sent (indirectly) to a certain part of the amygdala, where it’s used as a “ground truth” for supervised learning.
  3. This part of the amygdala builds a trained model. The input to the trained model is (mostly-high-level) visual information, especially from IT. The output of the trained model is some signal, which I’ll call “Fits model?”
  4. The SC’s “Fits proxy?” signal and the amygdala’s “Fits model?” signal both go down to the hypothalamus and brainstem, possibly hitting the very same neurons that trigger specific innate reactions.
  5. Optionally, the trained model in the amygdala could stop updating itself after a critical period early in life.
  6. Also optionally, as the animal gets older, the “Fits proxy?” signal could have less and less influence on those innate reaction neurons in the hypothalamus & brainstem, while the “Fits model?” signal would have more and more influence.

(This is one example; there are a bunch of other variations on this theme, including ones where you replace “part of the amygdala” with other parts of the forebrain like nucleus accumbens shell or lateral septum, and also where the proxy is coming from other places besides SC.)

(This is a generalization of “calculating correlations”. If the amygdala trained model is only one “layer” in the deep learning sense, then it would be just calculating linear correlations between IT signals and the proxy, I think. My best guess is that the amygdala is learning a two-layer feedforward model (more or less), so a bit more complicated than linear correlations, although low confidence on that.)
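The six steps above can be sketched as a toy supervised-learning loop. This is an illustration under stated assumptions, not a model of real neuroanatomy: the "Fits proxy?" signal is simply the true label flipped 20% of the time, the "IT" features are eight noisy numbers of which three carry signal, the trained model is a single logistic unit (the one-layer case), and all names (`make_example`, `OneLayerModel`, `innate_drive`) are hypothetical.

```python
import math
import random

def sigmoid(z):
    z = max(-30.0, min(30.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))

def make_example(rng, n_features=8, n_informative=3):
    """Return (truly present?, high-level features, noisy 'Fits proxy?' signal)."""
    present = rng.random() < 0.5
    sign = 1.0 if present else -1.0
    features = [(sign if i < n_informative else 0.0) + rng.gauss(0.0, 0.5)
                for i in range(n_features)]
    fits_proxy = present if rng.random() > 0.2 else not present  # step 1
    return present, features, fits_proxy

class OneLayerModel:
    """Single logistic unit trained with the proxy as its target (steps 2-3)."""
    def __init__(self, n_features=8, lr=0.05):
        self.w = [0.0] * n_features
        self.b = 0.0
        self.lr = lr
        self.frozen = False  # set True after the critical period (step 5)

    def fits_model(self, features):
        return sigmoid(sum(w * x for w, x in zip(self.w, features)) + self.b)

    def update(self, features, fits_proxy):
        if self.frozen:
            return
        err = float(fits_proxy) - self.fits_model(features)
        self.w = [w + self.lr * err * x for w, x in zip(self.w, features)]
        self.b += self.lr * err

def innate_drive(fits_proxy, fits_model, model_weight):
    """Blend of both signals onto the same reaction (steps 4 and 6);
    model_weight would rise from 0 toward 1 as the animal ages."""
    return (1.0 - model_weight) * float(fits_proxy) + model_weight * fits_model

rng = random.Random(0)
model = OneLayerModel()
for _ in range(3000):                 # critical period
    _, feats, fp = make_example(rng)
    model.update(feats, fp)
model.frozen = True

test_set = [make_example(rng) for _ in range(1000)]
proxy_acc = sum(fp == present for present, _, fp in test_set) / len(test_set)
model_acc = sum((model.fits_model(f) > 0.5) == present
                for present, f, _ in test_set) / len(test_set)

if __name__ == "__main__":
    print("proxy accuracy:", proxy_acc)
    print("model accuracy:", model_acc)
```

In this toy setup the trained model ends up tracking the true variable better than the proxy that supervised it, because the proxy's errors are unsystematic and average out. A proxy with systematic errors would not be cleaned up this way.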

Again, since the trained model is in the amygdala, not SC, there’s no need to “shift” the SC’s inputs to IT. That’s why I was confused by what you wrote.  :)

Replies from: jacob_cannell
comment by jacob_cannell · 2022-11-03T23:52:14.941Z · LW(p) · GW(p)

Hey thanks for explaining this - makes sense to me and I think we are mostly in agreement. Using the proxy signal as a supervised learning target to recognize the learned target pattern in IT is a straightforward way to implement the matching, but probably not quite complete in practice. I suspect you also need to combine that with some strong priors to correctly carve out the target concept.

Consider the equivalent example of trying to train a highly accurate cat image detector given a dataset containing, say, 20% cats, combined with a crappy low-complexity proxy cat detector to provide the labels. Can you really bootstrap better discriminative models in that way with non-trivial proxy label noise? I suspect that the key to making this work is using the powerful generative model of the cortex as a regularizer, so you train it to recognize images that the proxy detector labels as cats and that are also close to the generative model's data manifold. If you then reoptimize (in evolutionary time) the proxy detector to leverage that, I think it makes the problem much more tractable. The generative model allows you to make the learned model far more selective around the actual data manifold to increase robustness. In very simple, vague terms, the model would then be learning the combination of high proxy probability and low distance to the data manifold of examples from the critical training set.

Later, if you then test OoD on vague non-cats (dogs, stuffed animals) that were not encountered in training and that would confuse the simple proxy, the learned model can reject those, even though it never saw them during critical training, simply because they are far from the generative manifold and the learned model is 'shrunk' to fit that manifold.
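A very rough way to see the "high proxy probability combined with low distance to the data manifold" idea in code is sketched below. The assumptions are heavy: images are stand-in 2D points, the cortex's generative model is replaced by plain distance to stored training examples (a k-nearest-neighbour rule), and the cluster centers, the noise level `flip_prob`, and the thresholds `k` and `max_dist` are all made up for illustration.

```python
import math
import random

random.seed(1)

def sample(center, spread=0.6):
    """Draw one toy 2D 'image embedding' around a cluster center."""
    return (random.gauss(center[0], spread), random.gauss(center[1], spread))

def crude_proxy(x, flip_prob=0.2):
    """Low-complexity detector: looks at one feature only, with label noise."""
    fires = x[0] > 0.0
    if random.random() < flip_prob:
        fires = not fires
    return fires

# Critical-period training set: about 20% cats, the rest ordinary non-cats.
CAT, OTHER = (2.0, 2.0), (-2.0, -2.0)
train = [sample(CAT) if random.random() < 0.2 else sample(OTHER) for _ in range(500)]
labels = [crude_proxy(x) for x in train]

def learned_detector(x, k=15, max_dist=1.5):
    """Fire only if x is near the training data AND its neighbours are mostly proxy-positive."""
    nearest = sorted((math.dist(x, t), lab) for t, lab in zip(train, labels))[:k]
    mean_dist = sum(d for d, _ in nearest) / k        # stand-in for distance to the manifold
    proxy_rate = sum(lab for _, lab in nearest) / k   # local estimate of proxy probability
    return mean_dist < max_dist and proxy_rate > 0.5

cats = [sample(CAT) for _ in range(200)]
others = [sample(OTHER) for _ in range(200)]
dogs = [sample((3.0, -4.0)) for _ in range(200)]  # OoD: never seen in training

cat_hit_rate = sum(learned_detector(x) for x in cats) / len(cats)
other_false_alarms = sum(learned_detector(x) for x in others) / len(others)
dog_false_alarms_proxy = sum(crude_proxy(x, flip_prob=0.0) for x in dogs) / len(dogs)
dog_false_alarms_learned = sum(learned_detector(x) for x in dogs) / len(dogs)
print(cat_hit_rate, other_false_alarms, dog_false_alarms_proxy, dog_false_alarms_learned)
```

The "dogs" sit on the proxy's positive side, so the crude proxy fires on them, but they are far from every training example, so the distance gate rejects them. A real generative model would presumably define "far from the manifold" in a much richer way than Euclidean distance to stored points.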

I do agree the amygdala does seem like a good fit for the location of the learned symbol circuit, although at that point it raises the question of why not also just have the proxy in the amygdala? If the amygdala has the required inputs from LGN and/or V1 it would be my guess that it could also just colocate the innate proxy circuit. (I haven't looked in the lit to see if those connections exist)

Also, step 6 seems required for the system to work as well in adulthood as it typically does, and yet also to explain the out-of-distribution failures for imprinting etc. (Once the IT representation is learned you want to use that exclusively, as it should be strictly superior to the proxy circuit. This seems a little weird at first, but the)

The hope is that this same mechanism, which seems well suited for handling imprinting, also works for grounding sexual attraction (as an elaboration of imprinting) and then more complex concepts like representations of others' emotions from facial expression, vocal tone, etc. proxies, and then combining that with empathic simulation to ground a model of others' values/utility for social game theory, altruism, etc.

Replies from: steve2152
comment by Steven Byrnes (steve2152) · 2022-11-07T18:53:33.824Z · LW(p) · GW(p)

The hope is that this same mechanism, which seems well suited for handling imprinting, also works for grounding sexual attraction (as an elaboration of imprinting) and then more complex concepts like representations of others' emotions from facial expression, vocal tone, etc. proxies, and then combining that with empathic simulation to ground a model of others' values/utility for social game theory, altruism, etc.

Yes, that is my hope too! And the main thing I’m working on most days is trying to flesh out the details.

I do agree the amygdala does seem like a good fit for the location of the learned symbol circuit, although at that point it raises the question of why not also just have the proxy in the amygdala? If the amygdala has the required inputs from LGN and/or V1 it would be my guess that it could also just colocate the innate proxy circuit. (I haven't looked in the lit to see if those connections exist)

For example, I claim that all the vision-related inputs to the amygdala have at some point passed through at least one locally-random filter stage (cf. “pattern separation” in neuro literature [LW · GW] or “compressed sensing” in DSP literature). That’s perfectly fine if the amygdala is just going to use those inputs as feedstock for an SL model. An SL model doesn’t need to know a priori which input neuron represents which object-level pattern, because it’s going to learn the connections, so if there’s some randomness involved, it’s fine. But the randomness would be a very big problem if the amygdala needed to use those input signals to calculate a ground-truth proxy.
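Here is a toy illustration of that asymmetry, with loud caveats: the "locally-random filter" is modeled as a fixed random linear mix that differs per individual, the "hardwired" readout is a genome-style rule that assumes channel 0 still carries the raw pattern, and the learner is a plain delta-rule (LMS) linear readout. The sizes `N_RAW` and `N_MIXED`, the learning rate, and the function name `run_individual` are all hypothetical.

```python
import random

N_RAW, N_MIXED = 4, 12  # hypothetical sizes: raw inputs and randomly mixed channels

def run_individual(seed, n_train=1500, n_test=300, lr=0.001):
    rng = random.Random(seed)
    # Locally-random filter: a fixed random linear mix, different in each individual.
    mix = [[rng.gauss(0.0, 1.0) for _ in range(N_RAW)] for _ in range(N_MIXED)]

    def stimulus():
        raw = [rng.gauss(0.0, 1.0) for _ in range(N_RAW)]
        mixed = [sum(m * r for m, r in zip(row, raw)) for row in mix]
        return raw[0] > 0.0, mixed  # the pattern of interest lives in raw[0]

    # Supervised learner: delta rule on the mixed channels, targets +1 / -1.
    w = [0.0] * N_MIXED
    for _ in range(n_train):
        label, mixed = stimulus()
        target = 1.0 if label else -1.0
        error = target - sum(wi * xi for wi, xi in zip(w, mixed))
        w = [wi + lr * error * xi for wi, xi in zip(w, mixed)]

    hardwired_hits = learned_hits = 0
    for _ in range(n_test):
        label, mixed = stimulus()
        # Hardwired readout: zero adjustable parameters, fixed to channel 0.
        hardwired_hits += (mixed[0] > 0.0) == label
        learned_hits += (sum(wi * xi for wi, xi in zip(w, mixed)) > 0.0) == label
    return hardwired_hits / n_test, learned_hits / n_test

results = [run_individual(seed) for seed in range(20)]
hardwired_acc = sum(h for h, _ in results) / len(results)
learned_acc = sum(l for _, l in results) / len(results)
print(f"hardwired: {hardwired_acc:.2f}, learned: {learned_acc:.2f}")
```

Averaged over individuals, the fixed readout has no reason to do better than chance, because the random mix scrambles which channel carries what. The learned readout does not care, since it discovers the relevant combination of channels from the labels.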

As another example, a ground-truth proxy requires zero adjustable parameters (because how would you adjust them?), whereas a learning algorithm does well with as many adjustable parameters as possible, more or less.

So I see these as very different algorithmic tasks—so different that I would expect them to wind up in different parts of the brain, just on general principles.

The amygdala is a hodgepodge grouping of nuclei, some of which are “really” (embryologically & evolutionarily) part of the cortex, and the rest of which are “really” part of the striatum (ref). So if we’re going to say that the cortex and striatum are dedicated to running within-lifetime learning algorithms (which I do say [LW · GW]), then we should expect the amygdala to be in that same category too.

By contrast, SC is in the brainstem, and if you go far enough back, SC is supposedly a cousin of the part of the pre-vertebrate (e.g. amphioxus) nervous system that implements a simple “escape circuit” by triggering swimming when it detects a shadow—in other words, a part of the brain that triggers an innate reaction based on a “hardcoded” type of pattern in visual input. So it would make sense to say that the SC is still more-or-less doing those same types of calculations.

comment by Adam Shai (adam-shai) · 2022-10-29T17:27:45.707Z · LW(p) · GW(p)

Another paper you might be interested in that shows reinforcement learning effects even after training has reached asymptotic performance in a perceptual task:  https://elifesciences.org/articles/49834