Emotional valence vs RL reward: a video game analogy
post by Steven Byrnes (steve2152) · 2020-09-03T15:28:08.013Z · LW · GW · 6 comments
(Update Sept 2021: I no longer believe that we make decisions that maximize the expected sum of future rewards—see discussion of TD learning here [LW · GW]. It's more like "maximizing expected reward next step" (but "next step" can entail making a long-term plan). So this post isn't quite right in some of its specifics. That said, I don't think it's wildly wrong and I think it would just need a couple tweaks. Anyway, it's just a brainstorming post, don't take it too literally.)
I recently read a book about emotions and neuroscience (brief review here [LW · GW]) that talked about "valence and arousal" as two key ingredients of our interoception. Of these, arousal seems pretty comprehensible—the brain senses the body's cortisol level, heart rate, etc. But the valence of an emotion—what is that? What does it correspond to in the brain and body? My brief literature search didn't turn up anything that made sense to me, but after thinking about it a bit, here is what I came up with (with the usual caveat that it may be wrong or obvious). But first,
Definition of "the valence of an emotional state" (at least as I'm using the term)
Here's how I want to define the valence of an emotional state:
- When I'm proud, that's a nice feeling, I like having that feeling, and I want that feeling to continue. That's positive valence.
- When I have a feeling of guilt and dread, that's a bad feeling, I don't like having that feeling, and I want that feeling to end as soon as possible. That's negative valence.
There's a chance that I'm misusing the term; the psychological literature itself seems all over the place. For example, some people say anger is negative valence, but when I feel righteous anger, I like having that feeling, and I want that feeling to continue. (I don't want to want that feeling to continue, but I do want that feeling to continue!) So by my definition, righteous anger is positive valence!
There are some seemingly-paradoxical aspects of how valence does or doesn't drive behavior:
- Sometimes I have an urge to snack, or to procrastinate, but doing so doesn't make me happy or put me in a positive-valence state; it makes my mood worse, and I know it's going to make my mood worse, but I do it anyway.
- Conversely, sometimes it occurs to me that I should go meditate, and I know it will make me happy, but I feel an irresistible urge not to, and I don't.
- ...and yet these are exceptions. I do usually tend to take actions that lead to more positive-valence states and fewer negative-valence states. For example, I personally go way out of my way to try to avoid future feelings of guilt.
(See also: Scott Alexander on Wanting vs Liking vs Approving [LW · GW])
How is emotional valence implemented computationally? A video game analogy
Here's a simple picture I kinda like, based on an analogy to action-type video games. (Ha, I knew it, playing all those video games in middle school wasn't a waste of time after all!)
In many video games you control a character with a "health" level. It starts at 100 (or whatever), and if it ever gets to 0, you die. There are two ways to gain or lose health:
- Event-based health changes: When you get hit by an enemy, you lose health points. When you fall from a great distance and hit the ground, you lose health points. When you pick up a health kit, you gain health points. Etc.
- State-based health changes: In certain situations, you lose or gain a certain number of health points every second.
- For example, maybe you can walk across lava, but if you don't have the appropriate protective gear, you keep losing health points at a fixed rate, for as long as you're in the lava. So you run across the lava as fast as you can, and with luck, you can make it to the other side before you die.
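Here's a minimal sketch of those two mechanisms in code (the class, the state names, and all the numbers are made up for illustration, not taken from any particular game):

```python
# Toy model of the two kinds of health change described above.
# All names and numbers are invented for illustration.

class Player:
    def __init__(self):
        self.health = 100  # starts at 100; game over at 0

    # Event-based: a one-off change when something happens
    def on_event(self, delta):
        self.health += delta   # e.g. -20 for an enemy hit, +25 for a health kit

    # State-based: a repeated change for every second spent in a state
    def tick(self, state):
        change_per_second = {"lava": -10, "healing_spring": +5}
        self.health += change_per_second.get(state, 0)

player = Player()
player.on_event(-20)          # hit by an enemy
for _ in range(3):            # three seconds standing in lava
    player.tick("lava")
player.on_event(+25)          # pick up a health kit
print(player.health)          # 100 - 20 - 30 + 25 = 75
```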
In the brain, I've come around to the reinforcement-learning-type view that the neocortex tries to maximize a "reward" signal (among other things) [LW · GW]. So in the above, replace "gain health points" with "get positive reward" and "lose health points" with "get negative reward"; then the "state-based" situation corresponds to my current working theory of what valence is. Pretty simple, right?
To be explicit:
- If negative reward keeps flowing as long as you're in a state, you perceive that state as having negative valence. As a reward-maximizing system, you will feel an urge to get out of that state as quickly as possible, you will describe the state as aversive, and you will try to avoid it in the future (other things equal).
- If positive reward keeps flowing as long as you're in a state, you perceive that state as having positive valence. As a reward-maximizing system, you will feel an urge to stay in that state as long as possible, you will describe the state as attractive, and you will try to get back into that state in the future (other things equal).
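Restating the analogy as code: a sketch of valence as a per-timestep reward that keeps flowing for as long as you're in a state. The state names and reward values below are invented for illustration; this is not meant as an actual model of the brain's reward function.

```python
# Sketch: valence as a per-timestep reward that keeps flowing while you
# remain in a state. State names and reward values are invented.

VALENCE_REWARD_PER_STEP = {
    "pride":           +1.0,   # positive valence: reward flows while it lasts
    "guilt_and_dread": -1.0,   # negative valence: punishment flows while it lasts
    "neutral":          0.0,
}

def episode_reward(states_over_time):
    """Sum the per-step reward over a sequence of emotional states."""
    return sum(VALENCE_REWARD_PER_STEP[s] for s in states_over_time)

# Staying in a negative-valence state is costly in proportion to how long you
# stay, so a reward-maximizing system wants to exit it quickly...
print(episode_reward(["guilt_and_dread"] * 10))   # -10.0
# ...and wants to prolong a positive-valence state.
print(episode_reward(["pride"] * 10))             # +10.0
```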
Worked examples
- Other things equal, I seek out positive-valence emotional states and try to avoid negative-valence emotional states. Easy: I choose actions based in part [LW · GW] on the predicted future rewards, and the reward associated with the valence of my emotional state is one contributor to that total reward.
- I have an urge to have a snack, even though I know eating it will make me unhappy: I predict that as I eat the snack, I'll get a bunch of positive reward right while I eat it. I also predict that the negative-valence feeling after the snack will dole out slightly negative rewards for a while afterwards. So I feel a bit torn. But if the positive reward is sufficiently large and soon, and the negative-valence feeling afterwards is sufficiently mild and short-lived, then this is appealing on net, so I eat the snack. In the video game analogy, it's a bit like jumping down onto a platform with a giant restorative health kit ... but then you need to run through lava for a while to get back to where you were. Well, if the health gain from the health kit is large enough to outweigh the health loss from needing to run through lava afterwards, then OK, maybe that's worth doing.
- I have an urge to NOT meditate, even though I know meditating will make me happy: Just the opposite. Starting meditating involves stopping other things I'm doing, or turning down the opportunity to do other more-immediately-appealing things, and that gives me a bunch of negative reward all at once. That outweighs the steady drip of positive reward that I get from time spent being happy, in my brain's unconscious calculation.
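To make that unconscious calculation concrete, here's the same trade-off with made-up numbers. None of these values mean anything in themselves; they're just chosen so that the snack comes out net-positive and starting to meditate comes out net-negative, matching the two examples above.

```python
# Toy version of the two worked examples. All numbers are invented.

def total_reward(immediate, per_step_valence, steps):
    """One-off reward now, plus a valence-driven reward drip for `steps` steps."""
    return immediate + per_step_valence * steps

# Snack: big immediate reward, then a mild negative-valence drip afterwards.
snack = total_reward(immediate=+10, per_step_valence=-0.5, steps=10)     # +5.0

# Meditation: an immediate cost (giving up more-appealing things right now),
# then a steady positive-valence drip that isn't enough to outweigh it here.
meditate = total_reward(immediate=-10, per_step_valence=+0.5, steps=10)  # -5.0

print(snack, meditate)
# With these numbers the snack looks net-positive and meditating looks
# net-negative, matching the urges described above -- even though the snack
# worsens mood afterwards and meditating would improve it.
```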
Comments
comment by Gordon Seidoh Worley (gworley) · 2020-09-05T22:27:30.167Z · LW(p) · GW(p)
There's this from QRI that I think also points to a similar interpretation of valence and arousal as the one you use here.
Reply by Steven Byrnes (steve2152) · 2020-09-11T22:45:50.219Z · LW(p) · GW(p)
Thanks but I don't see the connection between what I wrote and what they wrote ...
Update: Maybe you meant that they understand the term "valence" in the same way I do. That seems plausible. Their explanation of valence is wildly different than mine, but we are both talking about the same thing, I think.
comment by Kaj_Sotala · 2020-09-04T14:19:14.594Z · LW(p) · GW(p)
> What does it correspond to in the brain
lukeprog talks a bit about this in "The Neuroscience of Pleasure [LW · GW]".
comment by noggin-scratcher · 2020-09-03T20:32:45.000Z · LW(p) · GW(p)
> Well, if the health gain from the health kit is large enough to outweigh the health loss from needing to run through lava afterwards, then OK, maybe that's worth doing.
Also even if it's not actually enough, and you're going to come out at a small loss overall, sometimes by the magic of time discounting it still feels like a net positive to your present self - because the cost is further away into the future than the gain.
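To illustrate with made-up numbers: under exponential discounting, an immediate gain followed by a slightly larger delayed loss can still look net-positive at decision time. The discount factor, delay, and reward values below are all arbitrary.

```python
# Toy illustration of the comment above: a trade that's a net loss in raw
# reward can still look positive under time discounting. Numbers are made up.

gamma = 0.9          # per-step discount factor
gain_now = +10       # health kit / snack, received immediately
loss_later = -12     # lava / bad mood, paid 5 steps later

undiscounted = gain_now + loss_later                 # -2: a net loss
discounted   = gain_now + (gamma ** 5) * loss_later  # 10 - 12*0.59 ≈ +2.9

print(undiscounted, round(discounted, 2))
```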
Reply by Steven Byrnes (steve2152) · 2020-09-03T23:35:24.326Z · LW(p) · GW(p)
Yes and this is one of many ways that humans don't maximize the time-integral of reward. Sorry for an imperfect analogy. :)
Reply by noggin-scratcher · 2020-09-05T10:22:23.447Z · LW(p) · GW(p)
I thought it was still quite an apt analogy, because we do essentially the same time-biased thing in all kinds of other contexts.