The Missing Math of Map-Making
post by johnswentworth
score: 33 (16 votes)
Consider the clever arguer from The Bottom Line [LW · GW]:
Now suppose there is a clever arguer, holding a sheet of paper, and they say to the owners of box A and box B: “Bid for my services, and whoever wins my services, I shall argue that their box contains the diamond, so that the box will receive a higher price.” So the box-owners bid, and box B’s owner bids higher, winning the services of the clever arguer.
The clever arguer begins to organize their thoughts. First, they write, “And therefore, box B contains the diamond!” at the bottom of their sheet of paper. Then, at the top of the paper, the clever arguer writes, “Box B shows a blue stamp,” and beneath it, “Box A is shiny,” and then, “Box B is lighter than box A,” and so on through many signs and portents; yet the clever arguer neglects all those signs which might argue in favor of box A.
This is a great example of a broken chain [LW · GW]: the chain of cause-and-effect between the actual contents of the boxes and the arguer’s conclusion is broken. With the chain broken, no amount of clever arguing can make the conclusion at the bottom of the paper more true.
Much of the sequences [LW · GW] can be summarized as “to determine the accuracy of a belief, look at the cause-and-effect process which produced that belief”. A few examples:
If the causal chain from territory to map is broken outright, then that’s a pretty obvious problem, but some of the links above provide more subtle examples too. In general, map-making processes should produce accurate maps to exactly the extent that they approximate Bayesian reasoning [LW · GW].
Point is: looking at the cause-and-effect processes which produce maps/beliefs is pretty central to rationality.
Given all that, it’s rather odd that we don’t have a nice mathematical theory of map-making processes [LW · GW] - cause-and-effect systems which produce accurate maps/beliefs from territories.
We have many of the pieces needed for such a theory lying around already. We have a solid theory of causality [LW · GW], so we know how to formalize “cause-and-effect processes”. Information theory lets us quantify map-territory correspondence. We even have an intuitive notion that accurate map-making processes should approximate Bayesian inference.
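To make the information-theoretic piece concrete, here is a minimal sketch (with invented probabilities) that models a map-making process as a noisy channel from territory to map. Mutual information then quantifies map-territory correspondence, and a “broken chain” like the clever arguer’s shows up as zero mutual information:

```python
import math
from collections import defaultdict

# Toy model: the map-making process as a noisy channel from territory to map.
# All probabilities here are invented for illustration.
p_territory = {"diamond_in_A": 0.5, "diamond_in_B": 0.5}
channel = {  # P(map report | territory state)
    "diamond_in_A": {"says_A": 0.9, "says_B": 0.1},
    "diamond_in_B": {"says_A": 0.2, "says_B": 0.8},
}
joint = {(t, m): p_territory[t] * channel[t][m]
         for t in p_territory for m in channel[t]}

def mutual_information(joint):
    """I(T; M) in bits: how much map-territory correspondence exists."""
    p_t, p_m = defaultdict(float), defaultdict(float)
    for (t, m), p in joint.items():
        p_t[t] += p
        p_m[m] += p
    return sum(p * math.log2(p / (p_t[t] * p_m[m]))
               for (t, m), p in joint.items() if p > 0)

# A broken chain: the clever arguer's report ignores the territory entirely,
# always concluding "says_B" regardless of where the diamond is.
broken = {(t, m): p_territory[t] * {"says_A": 0.0, "says_B": 1.0}[m]
          for t in p_territory for m in ("says_A", "says_B")}
```

With these numbers the noisy-but-causally-connected channel carries roughly 0.4 bits about the territory, while the broken chain carries exactly zero, no matter how cleverly the report is worded.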
Yet there are still some large chunks missing. For instance, suppose Google collects a bunch of photos from the streets of New York City, then produces a streetmap from it. The vast majority of the information in the photos is thrown away in the process - how do we model that mathematically? How do we say that the map is “accurate”, despite throwing away all that information? More generally, maps/beliefs tend to involve some abstraction - my beliefs are mostly about macroscopic objects (trees, chairs, etc) rather than atoms. What does it mean for a map to be “accurate” at an abstract level, and what properties should my map-making process have in order to produce accurate abstracted maps/beliefs?
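A toy sketch of the streetmap question (street names and attributes invented for illustration): the map-making process can discard almost everything in the territory while still answering the abstract queries the map exists to answer:

```python
# Hypothetical "territory": street segments carrying a photo's worth of detail.
# All names and attributes are invented for illustration.
territory = [
    {"from": "5th Ave", "to": "Broadway", "billboard": "soda ad",
     "parked_cars": 12, "pavement": "asphalt"},
    {"from": "Broadway", "to": "Wall St", "billboard": "movie ad",
     "parked_cars": 3, "pavement": "cobblestone"},
]

def make_map(territory):
    # The map-making process throws away almost all information...
    return {(seg["from"], seg["to"]) for seg in territory}

def connected(street_map, a, b):
    # ...yet the map still answers the abstract query it was made for.
    return (a, b) in street_map or (b, a) in street_map

street_map = make_map(territory)
```

Here “accuracy at the abstract level” cashes out as: for the queries the map is built to answer (connectivity), the map’s answers agree with the territory’s, even though billboards and parked cars are gone.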
Comments sorted by top scores.
comment by jessicata (jessica.liu.taylor)
· score: 15 (7 votes) · LW
What does it mean for a map to be “accurate” at an abstract level, and what properties should my map-making process have in order to produce accurate abstracted maps/beliefs?
The notion of a homomorphism in universal algebra and category theory is relevant here. Homomorphisms map from one structure (e.g. a group) to another, and must preserve structure. They can delete information (by mapping multiple different elements to the same element), but the structures that are represented in the structure-being-mapped-to must also exist in the structure-being-mapped-from.
Analogously: when drawing a topographical map, no claim is made that the topographical map represents all structure in the territory. Rather, the claim being made is that the topographical map (approximately) represents the topographic structure in the territory. The topographic map-making process deletes almost all information, but the topographic structure is preserved: for every topographic relation (e.g. some point being higher than some other point) represented in the topographic map, a corresponding topographic relation exists in the territory.
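A minimal sketch of this as an order homomorphism (elevations and band width invented for illustration): collapsing exact elevations into coarse contour bands deletes information, yet every “higher than” relation the map asserts also holds in the territory. The converse need not hold, since the map may collapse distinct heights into a single band:

```python
# Territory: exact elevations in meters (invented for illustration).
elevations = {"valley": 120.0, "ridge": 843.5, "summit": 1902.2}

def contour_band(height, band_width=500):
    # The map-making process: collapse exact heights into coarse bands.
    return int(height // band_width)

topo_map = {name: contour_band(h) for name, h in elevations.items()}

# Homomorphism property: whenever the map says A is higher than B,
# the territory agrees. (Not vice versa: equal bands hide real differences.)
for a in elevations:
    for b in elevations:
        if topo_map[a] > topo_map[b]:
            assert elevations[a] > elevations[b]
```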
comment by ChristianKl
· score: 9 (3 votes) · LW
Science and Sanity, which coined the phrase "The map is not the territory", contains a long discussion of what can be meant by saying a map is accurate, and of how abstraction works in this context.
I would recommend reading it if you are interested in the nature of maps.
comment by Dagon
· score: 3 (2 votes) · LW
I still find it difficult to understand how the focus on the causality of mapmaking is more helpful than examining the intent to summarize (which encompasses which information gets thrown away, based on the domain of prediction the map is created for) and the (pretty purely Bayesian) accuracy of predictions.
comment by johnswentworth
· score: 3 (2 votes) · LW
I think "intent to summarize" + "accuracy of predictions" is basically the whole story. What I want is a theory that can talk about both, at the same time.
Causality of mapmaking matters mainly because of the second piece: the map-making process is what determines how accurate the predictions will be. (The Bottom Line is an example which highlights this.)
comment by John_Maxwell (John_Maxwell_IV)
· score: 2 (1 votes) · LW
For instance, suppose Google collects a bunch of photos from the streets of New York City, then produces a streetmap from it. The vast majority of the information in the photos is thrown away in the process - how do we model that mathematically? How do we say that the map is “accurate”, despite throwing away all that information? More generally, maps/beliefs tend to involve some abstraction - my beliefs are mostly about macroscopic objects (trees, chairs, etc) rather than atoms. What does it mean for a map to be “accurate” at an abstract level, and what properties should my map-making process have in order to produce accurate abstracted maps/beliefs?
Representation learning might be worth looking into. The quality of a representation is typically measured using its reconstruction error, I think. However, there is some complexity here, I'd argue, because in real-world applications, some aspects of the reconstruction usually matter much more than others. I care about reconstructing the navigability of the streets, but not the advertisements on roadside billboards.
This actually presents a challenge [LW · GW] to standard rationalist tropes about the universality of truth, because my friend the advertising executive might care more about minimizing reconstruction error on roadside billboards, and select a representation scheme which is optimized according to that metric. As Russell and Norvig put it in Artificial Intelligence: A Modern Approach:
We should say up front that the enterprise of general ontological engineering has so far had only limited success. None of the top AI applications (as listed in Chapter 1) make use of a shared ontology—they all use special-purpose knowledge engineering. Social/political considerations can make it difficult for competing parties to agree on an ontology. As Tom Gruber (2004) says, “Every ontology is a treaty—a social agreement—among people with some common motive in sharing.” When competing concerns outweigh the motivation for sharing, there can be no common ontology.
(Emphasis mine. As far as I can tell, "ontology" is basically GOFAI talk for "representation".)
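The weighted-reconstruction-error point above can be sketched as follows (features, values, and weights all invented for illustration): two lossy representations of the same territory get ranked oppositely by agents who weight the features differently:

```python
# Hypothetical per-block feature vectors: (navigability, billboard_content).
# All values and weights are invented for illustration.
territory = [(1.0, 0.2), (0.0, 0.9), (1.0, 0.5)]

mean_nav = sum(t[0] for t in territory) / len(territory)
mean_ads = sum(t[1] for t in territory) / len(territory)

# Two lossy representations, each decoded back to full feature vectors:
# scheme A preserves navigability exactly, scheme B preserves billboards.
recon_a = [(t[0], mean_ads) for t in territory]
recon_b = [(mean_nav, t[1]) for t in territory]

def weighted_error(recon, weights):
    # Reconstruction error, weighted by which features the agent cares about.
    return sum(w * (x - y) ** 2
               for t, r in zip(territory, recon)
               for x, y, w in zip(t, r, weights))

navigator = (1.0, 0.0)   # cares only about street navigability
ad_exec = (0.0, 1.0)     # cares only about billboard content
```

Under the navigator's weights, scheme A is the better representation; under the ad executive's weights, scheme B wins - the same territory, with "best map" depending on the metric.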
comment by hex50
· score: 1 (1 votes) · LW
The problem might go all the way up to the notion of correspondence to reality itself. There's no simple way of stating that we're accessing reality without that statement itself constituting a representation.
Your mutual information is between representations passed from one part of your mind to other parts; likewise, what counts as "accurate" information about reality is really just accuracy relative to some idealized set of expectations from some other model that would take the photos as evidence.