[Intuitive self-models] 2. Conscious Awareness

post by Steven Byrnes (steve2152) · 2024-09-25T13:29:02.820Z · LW · GW · 12 comments

Contents

  2.1 Post summary / Table of contents
  2.2 The “awareness” concept
    2.2.1 The cortex has a finite computational capacity that gets deployed serially
    2.2.2 Predictive learning represents that algorithmic property via a kind of abstract container called “awareness”
    2.2.3 “S(apple)”, defined as the self-reflective thought “apple being in awareness”, is different from the object-level thought “apple”
  2.3 Awareness over time: The “Stream of Consciousness”
  2.4 Relation between “awareness” and memory
    2.4.1 Intuitive model of memory as a storage archive
    2.4.2 Intuitive connection between memory and awareness
  2.5 The valence of S(X) thoughts
    2.5.1 Positive-valence S(X) models often go with “what my best self would do” (other things equal)
    2.5.2 Positive-valence S(X) models also tend to go with X’s that are object-level motivating (other things equal)
  2.6 S(A) as “the intention to immediately do action A”, and the rapid sequence [S(A) ; A] as the signature of a deliberate action
    2.6.1 Clarification: Two ways to “think about an action”
    2.6.2 For any action A where S(A) has positive valence, there’s often a two-step temporal sequence: [S(A) ; A actually happens]
    2.6.3 This two-step sequence corresponds to “deliberate” / “intentional” actions (as opposed to “spontaneously blurting something out”, “acting on instinct”, etc.)
    2.6.4 The common temporal sequence above—i.e. [S(A) with positive valence ; A actually happens]—is itself incorporated into the intuitive self-model. Call it D(A) for “Deciding to do action A”
    2.6.5 An application: “Illusions of free will”
  2.7 Conclusion

2.1 Post summary / Table of contents

This is the second of a series of eight blog posts [? · GW], which I’ll be serializing over the next month or two. (Or email or DM [? · GW] me if you want to read the whole thing right now.)

The previous post [LW · GW] laid some groundwork for talking about intuitive self-models. Now we’re jumping right into the deep end: the intuitive concept of “conscious awareness” (or “awareness” for short). Some argue (§1.6.2 [LW · GW]) that if we can fully understand why we have an “awareness” concept, then we will thereby understand phenomenal consciousness itself! Alas, “phenomenal consciousness itself” is outside the scope of this series—again see §1.6.2 [LW · GW]. Regardless, the “awareness” concept is centrally important to how we conceptualize our own mental worlds, and well worth understanding for its own sake.

In one sense, “awareness” is nothing special: it’s an intuitive concept, built like any other intuitive concept. I can think the thought “a squirrel is in my conscious awareness”, just as I can think the thought “a squirrel is in my glove compartment”.

But in a different sense, “awareness” feels a bit enigmatic. The “glove compartment” concept is a veridical model (§1.3.2 [LW · GW]) of a tangible thing in a car. Whereas the “awareness” concept is a veridical model of … what exactly, if anything?

I have an answer! The short version of my hypothesis is: The brain algorithm involves the cortex, which has a limited computational capacity that gets deployed serially—you can’t both read health insurance documentation and ponder the meaning of life at the very same moment.[1] When this aspect of the brain algorithm is itself incorporated into a generative model via predictive (a.k.a. self-supervised) learning, it winds up represented as an “awareness” concept, which functions as a kind of abstract container that can hold any other mental concept(s) in it.
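As a cartoonishly simple sketch of that hypothesis (my own toy illustration in Python—every name in it is made up for this post, and the real cortex is of course nothing this simple), here's the intended relationship between the machinery and the model of the machinery:

```python
# The "machinery": a processor with finite computational capacity,
# deployed serially--it runs one thought at a time.
class Cortex:
    def __init__(self):
        self.current_thought = None

    def think(self, thought):
        # A new thought displaces the old one; there's no second slot.
        self.current_thought = thought

# The "model of the machinery": predictive learning summarizes the above
# as an abstract container--"awareness"--holding whatever is running now.
def awareness(cortex):
    return set() if cortex.current_thought is None else {cortex.current_thought}

c = Cortex()
c.think("health insurance documentation")
c.think("the meaning of life")  # the first thought is no longer in awareness
assert awareness(c) == {"the meaning of life"}
```

The point of the caricature is just the shape of the correspondence: the “container” in the model veridically tracks which generative model the serial processor is currently running.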

That’s still not the whole story of intentions and decisions—it’s missing the critical ingredient of an intuitive agent that actively causes the decisions. That turns out to be a whole giant can of worms, which we’ll tackle in Post 3.

Prior work: From my perspective, my main hypothesis (§2.2) should be “obvious” if you’re familiar with “Global Workspace Theory” [LW · GW][2] and/or “Attention Schema Theory”—and indeed I found Michael Graziano’s Rethinking Consciousness (2019) to be extremely helpful for clarifying my thinking.[3] Graziano & I have some differences though.[4] Also, §2.3 partly follows chapter 5 of Daniel Dennett’s Consciousness Explained (1991). Once we get into §2.5–§2.6 and the whole rest of the series, I mostly felt like I was figuring things out from scratch—but please let me know if you’ve seen relevant prior literature!

2.2 The “awareness” concept

2.2.1 The cortex has a finite computational capacity that gets deployed serially

What do I mean by that heading? Here are a few different ways to put it:

- At any given moment, the cortex is running (at most) one “train of thought”, while other possible thoughts lie dormant.
- You can switch between thoughts rapidly—within a fraction of a second—but you can’t run two unrelated trains of thought at literally the same moment.

2.2.2 Predictive learning represents that algorithmic property via a kind of abstract container called “awareness”

See text. Diagram on the left is copied from here.

If I say, “This apple is on my mind”, that’s a self-reflective thought. It involves a concept I’m calling “awareness”, and also the concept of “this apple”, and those two concepts are connected by a kind of container-containee relationship.

And I claim that this thought is modeling a situation where the cortex[2] is, at some particular moment, using its finite computational capacity to process my intuitive model of this apple.

More generally: for any concept X, the self-reflective thought “X is in awareness” is modeling the situation where the cortex is, at some particular moment, deploying its finite, serial computational capacity on the intuitive model of X.

So there’s a map-territory correspondence: awareness is a (somewhat) veridical (§1.3.2 [LW · GW]) model of this particular aspect of the brain algorithm.

2.2.3 “S(apple)”, defined as the self-reflective thought “apple being in awareness”, is different from the object-level thought “apple”

Illustration of how “apple” and “S(apple)” are two different thoughts. The two purple arrows indicate map–territory correspondences (§1.3.2 [LW · GW]) between (b) and (a), and between (c) and (b). To be clear, the “territory” for (c) is really “(b) being active in the cortex”, not (b) per se.

Astute readers might be wondering: if the “awareness” concept can itself be part of an intuitive model active in the cortex, then wouldn’t the thought “the apple is in awareness right now” be self-contradictory?

After all, the thing you’re thinking “right now” would be “the apple is in awareness right now”, rather than just “the apple” itself, right?

Yes! In order to think the former thought, you would have to stop thinking of just “the apple” itself, and flip to a different thought, where there’s a frame (in the sense of “frame semantics” in linguistics or “frame languages” in GOFAI) involving the “awareness” concept and the “apple” concept, interconnected by a container-containee relationship.

For various purposes later on, it will be nice to have a shorthand. So S(apple) (read: apple in a self-reflective frame) will denote the apple-is-in-awareness thought. It's “self-reflective” in the sense that it involves “awareness”, which is part of the intuitive self-model.

2.3 Awareness over time: The “Stream of Consciousness”

[Optional bonus section! You can skip §2.3, and still be able to follow the rest of the series.]

Here’s an aspect of the intuitive “awareness” concept that does not veridically correspond to the algorithmic phenomenon that it’s modeling. Daniel Dennett makes a big deal out of this topic in Consciousness Explained (1991), because it was important for his thesis to find aspects of our intuitive “awareness” concept that are not veridical, and this one seems reasonably clear-cut.

As background, there are various situations where, for events that unfold over the course of some fraction of a second, later sensory inputs are taken into account in how you remember experiencing earlier sensory inputs. Dennett uses an obscure psychology result called the “color phi phenomenon” as his main case study, but the phenomenon is quite common, so I’ll use a more everyday example: hearing someone talk.

I’ll start from the computational picture. As discussed in §1.2 [LW · GW], your cortex is (either literally or effectively) searching through its space of generative models for one that matches input data and other constraints, via probabilistic inference. Some generative models, like a model that predicts the sound of a word, are extended in time, and therefore the associated probabilistic inference has to be extended in time as well.

So suppose somebody says the word “smile” to me over the course of 0.4 seconds. The actual moment-by-moment activation of my cortex algorithm might look like: at t=0, the sound “smi-” has arrived, and several candidate word-models are weakly and tentatively active; by t=0.2, the field has narrowed; and only around t=0.4, once the whole sound has arrived, is the “smile” model strongly and unambiguously active.

…But interestingly, if you then immediately ask me what I was experiencing just now, I won’t describe it as above. Instead I’ll say that I was hearing “sm-” at t=0 and “-mi” at t=0.2 and “-ile” at t=0.4. In other words, I’ll recall it in terms of the time-course of the generative model that ultimately turned out to be the best explanation.

So with that as background, here’s how someone might intuitively describe their awareness over time:

Statement: When I’m watching and paying attention to something, I’m constantly aware of it as it happens, moment-by-moment. I might not always remember things perfectly, but there’s a fact of the matter of what I was actually experiencing at any given time.

Intuitive model underlying that statement: Within our intuitive models, there’s an “awareness” concept / frame as above, and at any given moment it has some content in it, related to the current sensory input, memories, thoughts, or whatever else we’re paying attention to. The flow of this content through time constitutes a kind of movie, which we might call the “stream of consciousness”. The things that “I” have “experienced” are exactly the things that were frames of that movie. The movie unfolds through time, although it’s possible that I’ll misremember some aspect of it after the fact.

What’s really happening, and is the model veridical in this respect? In the above example of hearing the word “smile”, there was no moment at which the beginning part of the word, on its own, was the content of the winning generative model. When “smi-” was entering our brain, the “smile” generative model was not yet strongly activated—that happened slightly later. But it doesn’t seem to be that way subjectively—we remember hearing the whole word as the beginning, middle, and end of “smile”. So,

Question: was the beginning part of hearing the word “smile” actually “experienced”?

Answer: That question is incoherent, because this is an area where the intuitive model above is not veridical.

Specifically, in the brain algorithm in question, there are two history streams we can talk about:

- The history of the inference process itself—which generative models were actually active in the cortex, moment by moment, as probabilistic inference unfolded.
- The internal timeline of the winning generative model—the beginning, middle, and end of “smile”, as represented within the model that ultimately won the inference.

The “stream-of-consciousness” intuitive model smushes these together into the same thing—just one history stream, labeled “what I was experiencing at that moment”.

That smushing-together is an excellent approximation on a multi-second timescale, but inaccurate if you zoom into what’s happening at sub-second timescales.

So a question like “what was I really experiencing at t=0.1 seconds” doesn’t seem answerable—it’s a question about the “map” (intuitive model) that doesn’t correspond to any well-defined question about the “territory” (the algorithms that the intuitive model was designed to model). Or equivalently, it corresponds equally well to two different questions about the territory, with two different answers, and there’s just no fact of the matter about which is the real answer.

Anyway, the intuitive model, with just one history stream instead of two, is much simpler, while still being perfectly adequate to play the role that it plays in generating predictions (see §1.4 [LW · GW]). So it’s no surprise that this is the generative model built by the predictive learning algorithm. Indeed, the fact that this aspect of the model is not perfectly veridical is something that basically never comes up in normal life.
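The two-streams distinction can be made concrete with a toy sketch (my own illustration—the candidate words, the timings, and the uniform “posterior” are all made up, and real cortical inference is far more complicated):

```python
# Candidate "generative models" are just words; evidence arrives
# incrementally, as successive fragments of the sound.
candidates = ["smile", "smite", "smith"]

def posterior(heard_so_far):
    """Uniform posterior over the candidates consistent with the input so far."""
    consistent = [w for w in candidates if w.startswith(heard_so_far)]
    return {w: 1 / len(consistent) for w in consistent}

chunks = ["smi", "smil", "smile"]  # input at t=0, t=0.2, t=0.4 seconds

# History stream #1: what the inference process was actually doing at each
# moment. At t=0, several models are tied--"smile" is not yet the winner.
stream_1 = [posterior(c) for c in chunks]

# History stream #2: the winning model's own internal timeline, available
# only after inference settles: "smile" unfolding from beginning to end.
winner = max(stream_1[-1], key=stream_1[-1].get)
stream_2 = [(t, winner) for t in (0.0, 0.2, 0.4)]
```

Asking “what was I really experiencing at t=0?” amounts to asking which of `stream_1[0]` and `stream_2[0]` is the “real” answer—and nothing in the territory privileges one over the other.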

2.4 Relation between “awareness” and memory

2.4.1 Intuitive model of memory as a storage archive

Statement: “I remember going to Chicago”

Intuitive model: Long-term memory in general, and autobiographical long-term memory in particular, is some kind of storage archive. Things can get pulled from that archive into the “awareness” abstract container. And there are memories of myself-in-Chicago stored in that archive, which can be retrieved deliberately or by random association.

What’s really happening? There’s some brain system (mainly the hippocampus, I think) that stores episodic memories. The memories can get triggered by pattern-matching (a.k.a. “autoassociative memory”), and then the memory and its various associations can activate all around the cortex.

Is the model veridical? Yeah, pretty much. As above, it’s not a veridical model of your brain as a hunk of meat in 3D space, but it is a reasonably veridical model of an aspect of the algorithm that your brain is running.
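The “triggered by pattern-matching” mechanism has a classic toy formalization: a Hopfield-style autoassociative network, where a partial cue settles into the nearest stored pattern. Here's a minimal sketch (my own illustration; the hippocampus is enormously more complicated than this):

```python
import numpy as np

rng = np.random.default_rng(0)
memories = rng.choice([-1, 1], size=(3, 64))  # three stored "episodes"

# Hebbian storage: the weight matrix accumulates outer products of patterns.
W = sum(np.outer(m, m) for m in memories).astype(float)
np.fill_diagonal(W, 0)

def recall(cue, steps=10):
    """Settle from a partial/noisy cue toward a stored pattern."""
    x = cue.astype(float).copy()
    for _ in range(steps):
        x = np.sign(W @ x)
        x[x == 0] = 1.0
    return x

# Corrupt half of memory #0, then let pattern-completion restore it.
cue = memories[0].copy()
cue[:32] = rng.choice([-1, 1], size=32)
restored = recall(cue)
overlap = (restored == memories[0]).mean()  # close to 1.0: memory retrieved
```

The design point being illustrated: retrieval is content-addressable—a fragment of the memory is itself the query, rather than an address in a lookup table.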

2.4.2 Intuitive connection between memory and awareness

Statement: “An intimate part of my awareness is its tie to long-term memory. If you show me a video of me going scuba diving this morning, and I absolutely have no memory whatsoever of it, and you can prove that the video is real, well I mean, I don't know what to say, I must have been unconscious or something!”[5]

Intuitive model: Whatever happens in “awareness” also gets automatically cataloged in the memory storage archive—at least the important stuff, and at least temporarily. And that’s all that’s in the memory storage archive. The memory storage archive just is a (very lossy) history of what’s been in awareness. This connection is deeply integrated into the intuitive model, such that imagining something in memory that was never in awareness, or conversely imagining that there was recently something very exciting and unusual in awareness but that it’s absent from memory, seems like a contradiction, demanding of some exotic explanation like “I wasn’t really conscious”.

Is the model veridical in this respect? Yup, I think this aspect of the intuitive model is veridically capturing the relation between cortex and episodic memory storage within the (normally-functioning) brain algorithm.

2.5 The valence of S(X) thoughts

We have lots of self-reflective thoughts—i.e., thoughts that involve components of the intuitive self-model—such as S(Christmas presents) = the self-reflective idea that Christmas presents are on my mind (see §2.2.3 above). And those thoughts have valence, just like any other thought. Let’s explore that idea and its consequences.

(Warning: I’m using the term “valence” in a specific and idiosyncratic way—see my Valence series [LW · GW].)

The starting question is: What controls the valence of an S(X) model?

Well, it’s the same as anything else—see How does valence get set and adjusted? [LW · GW]. One thing that can happen is that S(X) might directly trigger an innate drive, which injects positive or negative valence as a kind of ground truth. Another thing that can happen is: S(X) might have a strong association with / implication of some other thought / concept C. In that case, we’ll often think of S(X), then think of C, back and forth in rapid succession. And then by TD learning [LW · GW], some of the valence of C will splash onto S(X) (and vice-versa).

That latter dynamic—valence flowing through salient associations—turns out to have some important implications as I’ll discuss next (and more on that in Post 8).
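Here's a minimal sketch of that TD-style “splashing” (my own toy model—the learning rate, initial valences, and number of alternations are arbitrary):

```python
alpha = 0.3  # learning rate (arbitrary)
valence = {"S(X)": 0.0, "C": 0.8}  # C already has positive valence

def td_update(prev, nxt):
    # Nudge the predecessor's valence toward the successor's valence.
    valence[prev] += alpha * (valence[nxt] - valence[prev])

# S(X) and C keep bringing each other to mind in rapid succession...
for _ in range(10):
    td_update("S(X)", "C")
    td_update("C", "S(X)")

# ...and their valences converge toward each other: some of C's positive
# valence has "splashed" onto S(X), and vice-versa.
```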

2.5.1 Positive-valence S(X) models often go with “what my best self would do” (other things equal)

Notice how S(⋯) thoughts are “self-reflective”, in the sense that they involve me and my mind, and not just things in the outside world. This is important because it leads to S(⋯) having strong salient associations with other thoughts C that are also self-reflective. After all, if a self-reflective thought is in your head right now, then it’s much likelier for other self-reflective thoughts to pop into your head immediately afterwards.

As a consequence, here are two common examples of factors that influence the valence of S(X):

- Social image: S(X) tends to bring to mind thoughts of what other people would think of me, if they knew that X was on my mind—and the valence of those thoughts splashes onto S(X).
- Life narrative: S(X) tends to bring to mind thoughts of how X fits into the narrative and plans of my life—and again, the valence of those thoughts splashes onto S(X).

Contrast either of those with a non-self-reflective (i.e., object-level) thought related to doing my homework, e.g. “What’s the square root of 121 again?”. If I’m thinking about that, then the questions of what other people think about me, and how my life plans are going, are less salient.

There’s a pattern here, which is that self-reflective thoughts are more likely to be positive-valence (motivating) when they involve something that we’re proud of, that we like to remember, that we’d like other people to see, etc.

But that’s not the only factor. The object-level is relevant too:

2.5.2 Positive-valence S(X) models also tend to go with X’s that are object-level motivating (other things equal)

For example, if I’m tired, then I want to go to sleep. Maybe going to sleep right now wouldn’t help my social image, and maybe it’s not appealing in the context of the narrative of my life. More generally, maybe I don’t think that “my best self” would be sleeping now, instead of working more. But nevertheless, the self-reflective thought “I’m gonna go to sleep now” will be highly motivating to me, because of its obvious association with / implication of sleep itself.

Maybe I’ll even say “Screw being ‘my best self’, I’m tired, I’m going to sleep”.

What’s going on? It’s the same dynamic as above, but this time the salient association of S(X) is X itself. When I think the self-reflective thought S(go to sleep) ≈ “I’m thinking about going to sleep”, some of the things that it tends to bring to mind are object-level thoughts about going to sleep, e.g. the expectation of feeling the soft pillow on my head. Those thoughts are motivating, since I’m tired. And then by TD learning [LW · GW], S(X) winds up with positive valence too.

(Conversely, just as the valence of X splashes onto S(X), by the same logic, the valence of S(X) splashes onto X. More on that below.)

2.6 S(A) as “the intention to immediately do action A”, and the rapid sequence [S(A) ; A] as the signature of a deliberate action

2.6.1 Clarification: Two ways to “think about an action”

I’ll be arguing shortly that, for a voluntary [LW · GW] action A, S(A) is the “intention” to immediately do A. You might find this confusing: “Can’t I think self-reflectively about an action, without intending to do that action??” Yes, but … allow me to clarify.

Put aside self-reflective thoughts for a moment; let’s just start at the object level. If “the idea of standing up is on my mind” at some moment, that might mean either or both of two rather different things:

- The idea of standing up, as a thing that could happen—the same kind of thought I could have about somebody else standing up.
- The action program for standing up—the motor program that, when it activates, actually makes me stand up.

The punchline: When I say “an action A” in this series, it always refers to the second bullet, not the first—an action program, not merely an action idea.

So far that’s all in the object-level domain. But there’s an analogous distinction in the self-reflective domain: “S(stand up)” is ambiguous as written. It could be the thought: “standing up (as a thing that could happen) is the occupant of conscious awareness”—i.e., a veridical model of the first bullet point situation above. Or it could be the thought “standing up (the action program itself) is the occupant of conscious awareness”—i.e., a veridical model of the second bullet point situation above.

And just as above, when I say S(A), I’ll always be talking about the latter, not the former; it’s the latter that (I’ll argue) corresponds to an “intention”.

That said, those two aspects of standing up are obviously strongly associated with each other. They can activate simultaneously. And even if they don’t, each tends to bring the other to mind, such that the valence of one influences the valence of the other.

With that aside, let’s get into the substance of this section!

2.6.2 For any action A where S(A) has positive valence, there’s often a two-step temporal sequence: [S(A) ; A actually happens]

In this section I’ll give a kind of first-principles derivation of something that we should expect to happen in brain algorithms, based on the discussion thus far. Then afterwards, I’ll argue that this phenomenon corresponds to our everyday notion of intentions and actions. Here goes:

Put these together:

- As discussed in §2.5.2, the valence of A splashes onto S(A), and conversely the valence of S(A) splashes onto A, through their salient mutual association.
- A thought that is active with positive valence tends to stay active and carry through (see the Valence series).
- And per §2.6.1, A here is an action program—so when it activates and sticks around with positive valence, the action actually executes.

So we conclude that there ought to be a frequent pattern: first the self-reflective thought S(A) is active with positive valence, and then, a fraction of a second later, A itself actually happens.

2.6.3 This two-step sequence corresponds to “deliberate” / “intentional” actions (as opposed to “spontaneously blurting something out”, “acting on instinct”, etc.)

Here are a few reasons that you might believe me on this:

Evidence from introspection: I’m suggesting that (step 1) you think of yourself sending a command to wiggle your fingers, and you find that thought to be motivating (positive valence), and then (step 2) a fraction of a second later, the command is sent and your fingers are actually wiggling. To me, that feels like a pretty good fit to “intentionally” / “deliberately” doing something. Whereas “acting on impulses, instincts, reflexes, etc.” seems to be missing the self-reflective step 1 part.

Evidence from the report of an insight meditator: For what it’s worth, meditation guru Daniel Ingram writes here: “In Mind and Body, the earliest insight stage, those who know what to look for and how to leverage this way of perceiving reality will take the opportunity to notice the intention to breathe that precedes the breath, the intention to move the foot that precedes the foot moving, the intention to think a thought that precedes the thinking of the thought, and even the intention to move attention that precedes attention moving.” I claim that’s a good match to what I wrote—S(A) would be the “intention” to do action A.

Evidence from the systematic differences between deliberate actions and spontaneous actions: Consider spontaneous actions like “blurting out”, also called instinctive, reflexive, unthinking, reactive, spontaneous, etc. According to my story, a key difference between these spontaneous actions and deliberate actions is that the valence of S(A) is necessarily positive in deliberate actions, but need not be positive in spontaneous actions. And in §2.5 above, I said that the valence of S(A) is influenced by the valence of A, but the valence of S(A) is also influenced by “what my best self would do”—S(A) tends to be more positive for actions A that would positively impact my social image, fit well into the narrative of my life, and so on. And correspondingly, those are exactly the kinds of actions that are more likely to be “deliberate” than “spontaneous”. Good fit!

2.6.4 The common temporal sequence above—i.e. [S(A) with positive valence ; A actually happens]—is itself incorporated into the intuitive self-model. Call it D(A) for “Deciding to do action A”

The whole point of these intuitive generative models is to observe things that often happen, and then expect them to keep happening in the future. So if the [S(A) with positive valence ; A actually happens] pattern happens regularly, of course the brain will incorporate that as an intuitive concept in its generative models. I’ll call it D(A).

2.6.5 An application: “Illusions of free will”

The stereotypical deliberate-action scenario above is:

- Step 1: The self-reflective thought S(A) is active, with positive valence.
- Step 2: The action A actually happens.

Here’s a different scenario:

- Step 1’: Something happens in my mind that is not in fact S(A).
- Step 2’: The action A actually happens anyway.

Now suppose that Step 1’ was not in fact S(A), but that it could have been S(A)—in the specific sense that the hypothesis “what just happened was [S(A) ; A]” is a priori highly plausible and compatible with everything we know about ourselves and what’s happening.

In that case, we should expect the D(A) generative model to activate. Why? It’s just the cortex doing what it always does: using probabilistic inference to find the best generative model given the limited information available. It’s no different from what happens in visual perception: if I see my friend’s head coming up over the hill, I automatically intuitively interpret it as the head of my friend whose body I can’t see; I do not interpret it as my friend’s severed head. The latter would be a priori less plausible than the former.
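That inference step can be caricatured in a few lines (my own illustration; the priors and the two-hypothesis space are made up for the example):

```python
# Given only the observation "action A happened", which generative model
# best explains it?
prior = {"D(A)": 0.9, "spontaneous": 0.1}       # deliberate actions are typical here
likelihood = {"D(A)": 1.0, "spontaneous": 1.0}  # both predict "A happened"

unnorm = {h: prior[h] * likelihood[h] for h in prior}
z = sum(unnorm.values())
posterior = {h: p / z for h, p in unnorm.items()}

best = max(posterior, key=posterior.get)
# "D(A)" wins on prior plausibility alone, even though the S(A) step was
# never directly observed--just like the friend's head coming over the hill.
```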

Anyway, if D(A) activates despite a lack of actual S(A), that would be a (so-called) “illusion of free will”. Examples include the "choice blindness" experiment of Johansson et al. 2005, the "I Spy" and other experiments described in Wegner & Wheatley 1999, some instances of confabulation, and probably some types of “forcing” in stage magic. As another (possible) example, if I’m deeply in a flow state, writing code, and I take an action A = typing a word, then the self-reflective S(A) thought is almost certainly not active (that’s what “flow state” means, see Post 4), but if you ask me after the fact whether I had “decided” to execute action A, I think I would say “yes”.

2.7 Conclusion

I think this is a nice story for how the “conscious awareness” concept comes to exist in our mental worlds, how it relates to other intuitive notions like memory, stream-of-consciousness, intentions, and decisions, and how all these entities in the “map” (intuitive model) relate to corresponding entities in the “territory” (brain algorithms, as designed by the genome).

However, the above story of intentions and decisions is not yet complete! There’s an additional critical ingredient within our intuitive self-models. Not only are there intentions and decisions in our minds, but we also intuitively believe there to be a protagonist—an entity that actively intends our intentions, and decides our decisions, and wills our will! Following Dennett, I’ll call that concept “the homunculus”, and that will be the subject of the next post.

Thanks Thane Ruthenis, lsusr, Seth Herd, Linda Linsefors, and Justis Mills for critical comments on earlier drafts.

  1. ^

    It is, of course, possible to read health insurance documentation and ponder the meaning of life in rapid succession, separated by as little as a fraction of a second. Especially when “pondering the meaning of life” includes nihilism and existential despair! USA readers, you know what I’m talking about.

  2. ^

    It won’t come up again in this series, but I’ll note for completeness that “awareness” is related to the activation state of some parts of the cortex much more than other parts. For example, the primary visual cortex is not interconnected with other parts of the cortex or with long-term memory in the same direct way that many other cortical areas are; hence, you can say that we’re “not directly aware” of what happens in the primary visual cortex. In the lingo, people describe this fact by saying that there’s a “Global Neuronal Workspace” [LW · GW] consisting of many (most?) parts of the cortex, but that the primary visual cortex is not one of those parts.

  3. ^

    Relatedly, some bits of text in this post are copied from my earlier post Book Review: Rethinking Consciousness [LW · GW].

  4. ^

    From my perspective, Graziano’s main thesis and my §2.2 are pretty similar in the big picture. I think the biggest difference between his presentation and mine is that we stand at different places on the spectrum from “evolved modularity” to “universal learning machine” [LW · GW]. Graziano seems to be more towards the “evolved modularity” end, where he thinks that evolution specifically built “awareness” into the brain to serve as sensory feedback for attention actions, in analogy to how evolution specifically built the somatosensory cortex to serve as sensory feedback for motor actions. By contrast, my belief is much closer to the “universal learning machine” end, where “awareness” (like the rest of the intuitive self-model) comes out of a somewhat generic within-lifetime predictive learning algorithm, involving many of the same brain parts and processes that would create, store, and query an intuitive model of a carburetor.

    Again, that’s all my own understanding. Graziano has not read or endorsed anything in this post.

  5. ^

    I adapted that statement from something Jeff Hawkins said. But tragically, it’s not just a hypothetical: Clive Wearing developed total amnesia 40 years ago, and ever since then “he constantly believes that he has only recently awoken from a comatose state”.

Comments sorted by top scores.

comment by cubefox · 2024-09-26T12:58:16.048Z · LW(p) · GW(p)

I don't know whether this is relevant to you, but in "x is aware of y" ("y is in x's awareness"), y is considered an intensional term, while for "x physically contains y", y is considered extensional. (And x is extensional in both cases.)

"Extensional" means that co-referring terms can always be substituted for each other without affecting the truth value of the resulting proposition. For "intensional" terms this is not necessarily the case.

For example, "Steve is aware of Yvain" does not entail "Steve is aware of Scott", even if Scott = Yvain. Namely when Steve doesn't know that Scott = Yvain.

However, "The house contains Scott" ("Scott is in the house") implies "The house contains Yvain" because Yvain = Scott.

Most relations only involve extensional terms. Some examples of relations which involve intensional terms: is aware of, thinks, believes, wants, intends, loves, means.

Intensional (with an "s") terms are present mainly or only in relations which express "intentionality" (with a "t") in the technical philosophical sense: a mind (or mental state) representing, or being about, something. It's a central question in philosophy of mind how this can happen. Because ordinary physical objects don't seem to exhibit this property.

Though I'm not completely sure whether your theory has ambitions to solve this problem.

Replies from: steve2152
comment by Steven Byrnes (steve2152) · 2024-09-26T13:16:09.299Z · LW(p) · GW(p)

Thanks! I feel like that’s a very straightforward question in my framework. Recall this diagram from above:

[UPDATED TO ALSO COPY THE FOLLOWING SENTENCE FROM OP:  †To be clear, the “territory” for (c) is really “(b) being active in the cortex”, not (b) per se.]

Your “truth value” is what I call “what’s happening in the territory”. In the (b)-(a) map-territory correspondence, the “territory” is the real world of atoms, so two different concepts that point to the same possibility in the real world of atoms will have the same “truth value”. In the (c)-(b) map-territory correspondence, the “territory” is the cortex, or more specifically what concepts are active in the cortex, so different concepts are always different things in the territory.

Do you agree that that’s a satisfactory explanation in my framework of why “apple is in awareness” is intensional while “apple is in the cupboard” is extensional? Or am I missing something?

Replies from: cubefox
comment by cubefox · 2024-09-26T14:50:30.109Z · LW(p) · GW(p)

So here (c) is about / represents (b), which itself is about / represents (a). Both (b) and (c) are thoughts (the thought of an apple and the thought of the thought of an apple), so it is expected that they both can represent things. And (a) is a physical object, so it isn't surprising that (a) doesn't represent anything.

However, it is not clear how this difference in capacity for representation arises. More specifically, if we think of (c) not as a thought/concept, but as the cortex, which is a physical object, it is not clear how the cortex could represent / be about something, namely (b).

It is also not clear why thinking about X doesn't imply thinking about Y even in cases where X=Y, while X being on the cupboard implies Y being on the cupboard when X=Y.

Tangential considerations:

I notice that in (b)-(a), (a) is intensional, as expected, while in (c)-(b), (b) does seem to be extensional. Which is not expected, since (c) is a thought about (b).

For example, in the case of (b)-(a) we could have a thought about the apple on the cupboard, and a thought about the apple I bought yesterday, which would not be the same thought, even if both apples are the same object, since I may not know that the apple on the cupboard is the same as the apple I bought yesterday.

But when thinking about our own thoughts, no such failure of identification seems possible. We always seem to know whether two thoughts are the same or not. Apparently because we have direct "access" to them because they are "internal", while we don't have direct "access" to physical objects, or to other external objects, like the thoughts of other people. So extensionality fails for thoughts about external objects, but holds for thoughts about internal objects, like our own thoughts.

Replies from: steve2152
comment by Steven Byrnes (steve2152) · 2024-09-26T15:22:20.205Z · LW(p) · GW(p)

Thanks! The “territory” for (c) is not (b) per se but rather “(b) being active in the cortex”. (That’s the little dagger on the word “territory” below (b), I explained it in the OP but didn’t copy it into the comment above, sorry.)

So “thought of the thought of an apple” is not quite what (c) is. Something like “thought of the apple being on my mind” would be closer.

More specifically, if we think of (c) not as a thought/concept, but as the cortex, which is a physical object, it is not clear how the cortex could represent / be about something, namely (b).

I sorta feel like you’re making something simple sound complicated, or else I don’t understand your point. “If you think of a map of London as a map of London, then it represents London. If you think of a map of London as a piece of paper with ink on it, then does it still represent London?” Umm, I guess? I don’t know! What’s the point of that question? Isn’t it a silly kind of thing to be talking about? What’s at stake?

It is also not clear why thinking about X doesn't imply thinking about Y even in cases where X=Y, while X being on the cupboard implies Y being on the cupboard when X=Y.

Again, I feel like you’re making common sense sound esoteric (or else I’m missing your point). If I don’t know that Yvain is Scott, and if at time 1 I’m thinking about Yvain, and if at time 2 I’m thinking about Scott, then I’m doing two systematically different things at time 1 versus time 2, right?

But when thinking about our own thoughts, no such failure of identification seems possible.

In some contexts, two different things in the map wind up pointing to the same thing in the territory. In other cases, that doesn’t happen. For example, in the domain of “members of my family”, I’m confident that the different things on my map are also different in the territory. Whereas in the domain of anatomy, I’m not so confident—maybe I don’t realize that erythrocytes = red blood cells. Anyway, whether this is true or not in any particular domain doesn’t seem like a deep question to me—it just depends on the domain, and more specifically how easy it is for one thing in the territory to “appear” different (from my perspective) at different times, such that when I see it the second time, I draw it as a new dot on the map, instead of invoking the preexisting dot.

Replies from: cubefox
comment by cubefox · 2024-09-26T17:56:25.097Z · LW(p) · GW(p)

I sorta feel like you’re making something simple sound complicated, or else I don’t understand your point. “If you think of a map of London as a map of London, then it represents London. If you think of a map of London as a piece of paper with ink on it, then does it still represent London?” Umm, I guess? I don’t know! What’s the point of that question? Isn’t it a silly kind of thing to be talking about? What’s at stake?

Well, it seems that purely the map by itself (as a physical object only) doesn't represent London, because the same map-like object could have been created as an (extremely unlikely) accident. Just like a random splash of ink that happens to look like Jesus doesn't represent Jesus, or a random string generator creating the string "Eliezer Yudkowsky" doesn't refer to Eliezer Yudkowsky. What matters seems to be the intention (a mental object) behind the creation of an actual map of London: Someone intended it to represent London.

Or assume a local tries to explain to you where the next gas station is, gesticulates, and uses his right fist to represent the gas station and his left fist to represent the next intersection. The right fist representing the gas station is not a fact about the physical limb alone, but about the local's intention behind using it. (He can represent the gas station even if you misunderstand him, so only his state of mind seems to matter for representation.)

So it isn't clear how a physical object alone (like the cortex) can be about something. Because apparently maps or splashes or strings or fists don't represent anything by themselves. That is not to say that the cortex can't represent things, but rather that it isn't clear why it does, if it does.

Again, I feel like you’re making common sense sound esoteric (or else I’m missing your point). If I don’t know that Yvain is Scott, and if at time 1 I’m thinking about Yvain, and if at time 2 I’m thinking about Scott, then I’m doing two systematically different things at time 1 versus time 2, right?

Exactly. But it isn't clear why these thoughts are different. If your thinking about someone is a relation between yourself and someone else, then it isn't clear why you thinking about one person could ever be two different things.

(A similar problem arises when you think about something that might not exist, like God. Does this thought then express a relation between yourself and nothing? But thinking about nothing is clearly different from thinking about God. Besides, other non-existent objects, like the largest prime number, are clearly different from God.)

Maybe it is instead a relation between yourself and your concept of Yvain, and a relationship between yourself and your concept of Scott, which would be different relations, if the names express different concepts, in case you don't regard them as synonymous. But both concepts happen to refer to the same object. Then "refers to" (or "represents") would be a relation between a concept and an object. Then the question is again how reference/representation/aboutness/intentionality works, since ordinary physical objects don't seem to do it. What makes it the case that concept X represents, or doesn't represent, object Y?

But when thinking about our own thoughts, no such failure of identification seems possible.

In some contexts, two different things in the map wind up pointing to the same thing in the territory. In other cases, that doesn’t happen. For example, in the domain of “members of my family”, I’m confident that the different things on my map are also different in the territory.

If you believe x and y are members of your family, that doesn't imply you have a belief about whether x and y are identical or not. But if x and y are thoughts of yours (or other mental objects), you know whether they are the same or not. Example: you are confident that your brother is a member of your family, and that the person who ate your apple is a member of your family, but you are not confident about whether your brother is identical to the person who ate your apple.

It seems such examples can be constructed for any external objects, but not for internal ones, so the only "domain" where extensionality holds for intentionality/representation relations is arguably internal objects (our own mental states).

Replies from: steve2152
comment by Steven Byrnes (steve2152) · 2024-09-26T19:05:56.048Z · LW(p) · GW(p)

I feel like the difference here is that I’m trying to talk about algorithms (self-supervised learning, generative models, probabilistic inference), and you’re trying to talk about philosophy? (See §1.6.2 [LW · GW]). I think there are questions that seem important and tricky in your philosophy-speak, but seem weird or obvious or pointless in my algorithm-speak … Well anyway, here’s my perspective:

Let’s say:

  • There’s a real-world thing T (some machine made of atoms),
  • T is upstream of some set of sensory inputs S (light reflecting off the machine and hitting photoreceptors etc.)
  • There’s a predictive learning algorithm L tasked with predicting S,
  • This learning algorithm gradually builds a trained model (a.k.a. generative model space, a.k.a. intuitive model space) M.

In this case, it is often (though not always) the case that some part of M will have a straightforward structural resemblance to T. In §1.3.2 [LW · GW], I called that a “veridical correspondence”.
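(As a toy illustration of that claim, a minimal sketch with made-up numbers rather than anything from the post: a learner whose only job is to reduce prediction error on S can end up with a parameter that structurally matches T.)

```python
import random

random.seed(0)

# Territory T: a hidden "machine" whose one parameter we never observe directly.
W_TRUE = 2.5

def sensory_input(x):
    """S: noisy sensory data generated by T."""
    return W_TRUE * x + random.gauss(0, 0.1)

# Learning algorithm L: predictive learning by gradient descent on
# prediction error. Its trained model M is just the single number w.
w = 0.0
for _ in range(2000):
    x = random.uniform(-1, 1)
    s = sensory_input(x)
    w -= 0.1 * (w * x - s) * x  # gradient step on squared prediction error

# M ends up veridical: w is close to W_TRUE, not by coincidence but
# because matching the territory is a good way to reduce predictive loss.
print(abs(w - W_TRUE) < 0.2)
```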

If that happens, then we know why it happened; it happened because of the learning algorithm L! Obviously, right? Veridical map-territory correspondence is generally a very effective way to predict what’s going to happen, and thus predictive learning algorithms very often build trained models with veridical aspects. (I think the term “teleosemantics” [LW · GW] is relevant here? Not sure.)

By contrast, if some part of M has a straightforward structural resemblance to T, then the hypothesis that this happened by coincidence is astronomically unlikely, compared to the hypothesis that it happened because it’s a good way for L to reduce its loss function.

(Then you say: “Ah, but what if that astronomical coincidence comes to pass?” Well then I would say “Huh. Funny that”, and I would shrug and go on with my day. I never claimed to have an airtight philosophical theory of about-ness or representing-ness or whatever! It was you who brought it up!)

Other times, there isn’t a veridical correspondence! Instead, the predictive learning algorithm builds an M, no part of which has any straightforward structural resemblance to T. There are lots of reasons that could happen. I gave one or two examples of non-veridical things in this post, and much more coming up in Post 3.

But it isn't clear why these thoughts are different. If your thinking about someone is a relation between yourself and someone else, then it isn't clear why you thinking about one person could ever be two different things.

M is some data structure stored in the cortex. If I don’t know that Scott is Yvain, then Scott is one part of M, and Yvain is a different part of M. Two different sets of neurons in the cortex, or whatever. Right? I don’t think I’m saying anything deep here. :)

Replies from: cubefox
comment by cubefox · 2024-09-26T20:00:34.606Z · LW(p) · GW(p)

I'm not sure how much "structural resemblance" or "veridical correspondence" can account for representation/reference. Maybe our concept of a sock or an apple somehow (structurally) resembles a sock or an apple. But what if I'm thinking of the content of your suitcase, and I don't know whether it is a sock or an apple or something else? Surely the part of the model (my brain) which represents/refers to the content of your suitcase does not in any way (structurally or otherwise) resemble a sock, even if the content of your suitcase is indeed identical to a sock.

M is some data structure stored in the cortex. If I don’t know that Scott is Yvain, then Scott is one part of M, and Yvain is a different part of M. Two different sets of neurons in the cortex, or whatever. Right? I don’t think I’m saying anything deep here. :)

But Scott and Yvain are objects in the territory, not parts of a model, so the parts of the model which do represent Scott and Yvain require the existence of some sort of representation relation.

Replies from: steve2152
comment by Steven Byrnes (steve2152) · 2024-09-26T21:40:33.523Z · LW(p) · GW(p)

Maybe our concept of a sock or an apple somehow (structurally) resembles a sock or an apple.

I could start writing pairs of sentences like:

REAL WORLD: feet often have socks on them
MY INTUITIVE MODELS: “feet” “often” “have” “socks” “on” “them”

REAL WORLD: socks are usually stretchy
MY INTUITIVE MODELS: “socks” “are” “usually” “stretchy”

(… 7000 more things like that …)

If you take all those things, AND the information that all these things wound up in my intuitive models via the process of my brain doing predictive learning from observations of real-world socks over the course of my life, AND the information that my intuitive models of socks tend to activate when I’m looking at actual real-world socks, and to contribute to me successfully predicting what I see … and you mix all that together … then I think we wind up in a place where saying “my intuitive model of socks has by-and-large pretty good veridical correspondence to actual socks” is perfectly obvious common sense. :)

(This is all the same kinds of things I would say if you ask me what makes something a map of London.)

But what if I'm thinking of the content of your suitcase, and I don't know whether it is a sock or an apple or something else? Surely the part of the model (my brain) which represents/refers to the content of your suitcase does not in any way (structurally or otherwise) resemble a sock, even if the content of your suitcase is indeed identical to a sock.

Right, if I don’t know what’s in your suitcase, then there will be rather little veridical correspondence between my intuitive model of the inside of your suitcase, and the actual inside of your suitcase! :)

(The statement “my intuitive model of socks has by-and-large pretty good veridical correspondence to actual socks” does not mean I have omniscient knowledge of every sock on Earth, or that nothing about socks will ever surprise me, etc.!)

But Scott and [Yvain] are objects in the territory, not parts of a model, so the parts of the model which do represent Scott and Yvain require the existence of some sort of representation relation.

Oh sorry, I thought that was clear from context … when I say “Scott is one part of M”, obviously I mean something more like “[the part of my intuitive world-model that I would describe as Scott] is one part of M”. M is a model, i.e. data structure, stored in the cortex. So everything in M is a part of a model by definition.

comment by Signer · 2024-09-26T09:33:31.908Z · LW(p) · GW(p)

I still don't get this "only one thing in awareness" thing. There are multiple neurons in the cortex, and I can imagine two apples - in what sense can there be only one thing in awareness?

Or equivalently, it corresponds equally well to two different questions about the territory, with two different answers, and there’s just no fact of the matter about which is the real answer.

Obviously the real answer is the model which is more veridical^^. The latter, hindsight model is right not about the state of the world at t=0.1, but about what you later thought about the world at t=0.1.

Replies from: steve2152
comment by Steven Byrnes (steve2152) · 2024-09-26T13:07:17.623Z · LW(p) · GW(p)

I still don't get this "only one thing in awareness" thing. There are multiple neurons in the cortex, and I can imagine two apples - in what sense can there be only one thing in awareness?

One thought in awareness! Imagining two apples is a different thought from imagining one apple, right? They’re different generative models, arising in different situations, with different implications, different affordances, etc. Neither is a subset of the other. (I.e., there are things that I might do or infer in the context of one apple, that I would not do or infer in the context of two apples.)

I can have a song playing in my head while reading a legal document. That’s because those involve different parts of the cortex. In my terms, I would call that “one thought” involving both a song and a legal document. On the other hand, I can’t have two songs playing in my head simultaneously, nor can I be thinking about two unrelated legal documents simultaneously. Those involve the same parts of the cortex being asked to do two things that conflict. So instead, I’d have to flip back and forth.

There are multiple neurons in the cortex, but they’re not interchangeable. Again, I think autoassociative memory / attractor dynamics is a helpful analogy here. If I have a physical instantiation of a Hopfield network, I can’t query 100 of its stored patterns in parallel, right? I have to do it serially.
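(For concreteness, here’s a minimal Hopfield-network sketch—the standard textbook Hebbian construction, nothing specific to the brain. However many patterns are stored in the weights, the network’s single shared state vector settles into only one of them per query.)

```python
import numpy as np

rng = np.random.default_rng(0)

# Store 3 random ±1 patterns in a 64-unit Hopfield network (Hebbian rule).
n, k = 64, 3
patterns = rng.choice([-1.0, 1.0], size=(k, n))
W = (patterns.T @ patterns) / n
np.fill_diagonal(W, 0)

def recall(cue, steps=20):
    """Attractor dynamics: the single shared state vector settles into
    ONE stored pattern. Retrieving many patterns requires serial queries."""
    s = cue.copy()
    for _ in range(steps):
        s = np.sign(W @ s)
        s[s == 0] = 1.0
    return s

# Corrupt stored pattern 0, then let the network settle back to it.
noisy = patterns[0].copy()
noisy[rng.choice(n, size=8, replace=False)] *= -1
print(np.array_equal(recall(noisy), patterns[0]))
```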

I don’t pretend that I’m offering a concrete theory of exactly what data format a “generative model” is etc., such that song-in-head + legal-contract is a valid thought but legal-contract + unrelated-legal-contract is not a valid thought. …Not only that, but I’m opposed to anyone else offering such a theory either! We shouldn’t invent brain-like AGI until we figure out how to use it safely [LW · GW], and those kinds of gory details would be getting uncomfortably close, without corresponding safety benefits, IMO.

Replies from: Signer
comment by Signer · 2024-09-26T17:59:53.928Z · LW(p) · GW(p)

Imagining two apples is a different thought from imagining one apple, right?

I mean, is it? Different states of the whole cortex are different. And the cortex can't be in a state of imagining only one apple and, simultaneously, be in a state of imagining two apples, obviously. But it's tautological. What are we gaining from thinking about it in such terms? You can say the same thing about the whole brain itself, that it can only have one brain-state in a moment.

I guess there is a sense in which other parts of the brain can have more varied thoughts relative to what the cortex can handle, but, like you said, you can use half of the cortex's capacity, so why not define the song and the legal document as different thoughts?

As abstract elements of a provisional framework, cortex-level thoughts are fine; I just wonder what you are claiming about real constraints, aside from "there are limits on thoughts". Because, for example, you need other limits anyway: you can't think an arbitrarily complex thought even if it is intuitively cohesive. But yeah, enough gory details.

On the other hand, I can’t have two songs playing in my head simultaneously, nor can I be thinking about two unrelated legal documents simultaneously.

I can't either, but I don't see just from the architecture why it would be impossible in principle.

Again, I think autoassociative memory / attractor dynamics is a helpful analogy here. If I have a physical instantiation of a Hopfield network, I can’t query 100 of its stored patterns in parallel, right? I have to do it serially.

Yes, but you can theoretically encode many things in each pattern? Although if your parallel processes need different data, one of them will have to skip some responses... It would be better to have different networks, but I don't see the brain providing much isolation. Well, it seems to illustrate complications of parallel processing that may have played a role in humans usually staying serial.

Replies from: steve2152
comment by Steven Byrnes (steve2152) · 2024-09-26T18:24:18.545Z · LW(p) · GW(p)

You say “tautological”, I say “obvious”. You can’t parse a legal document and try to remember your friend’s name at the exact same moment. That’s all I’m saying! This is supposed to be very obvious common sense, not profound.

What are we gaining from thinking about it in such terms?

Consider the following fact:

FACT: Sometimes, I’m thinking about pencils. Other times, I’m not thinking about pencils.

Now imagine that there’s a predictive (a.k.a. self-supervised) learning algorithm which is tasked with predicting upcoming sensory inputs, by building generative models. The above fact is very important! If the predictive learning algorithm does not somehow incorporate that fact into its generative models, then those generative models will be worse at making predictions. For example, if I’m thinking about pencils, then I’m likelier to talk about pencils, and look at pencils, and grab a pencil, etc., compared to if I’m not thinking about pencils. So the predictive learning algorithm is incentivized (by its predictive loss function) to build a generative model that can represent the fact that any given concept might be active in the cortex at a certain time, or might not be.
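(A toy sketch of that incentive, with made-up numbers—and I'm cheating by handing one model the hidden state directly: a predictor that represents whether the "pencil" concept is currently active gets lower predictive loss than one that ignores that fact.)

```python
import math
import random

random.seed(0)

# Toy world: a hidden binary state ("thinking about pencils" or not)
# changes how likely pencil-related observations are.
P_PENCIL = {True: 0.8, False: 0.1}

def generate(n=5000):
    state, data = False, []
    for _ in range(n):
        if random.random() < 0.05:  # the hidden state occasionally flips
            state = not state
        data.append((state, random.random() < P_PENCIL[state]))
    return data

data = generate()

def avg_log_loss(prob_of_obs):
    total = 0.0
    for state, obs in data:
        p = prob_of_obs(state)
        total += -math.log(p if obs else 1.0 - p)
    return total / len(data)

# Model A ignores the fact that the concept can be active or not.
base_rate = sum(obs for _, obs in data) / len(data)
loss_ignores_state = avg_log_loss(lambda state: base_rate)

# Model B represents "is the pencil concept active right now?"
loss_tracks_state = avg_log_loss(lambda state: P_PENCIL[state])

print(loss_tracks_state < loss_ignores_state)
```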

Again, this is all supposed to sound very obvious, not profound.

You can say the same thing about the whole brain itself, that it can only have one brain-state in a moment.

Yes, it’s also useful for the predictive learning algorithm to build generative models that capture other aspects of the brain state, outside the cortex. Thus we wind up with intuitive concepts that represent the possibility that we can be in one mood or another, that we can be experiencing a certain physiological reaction, etc.