Abstract Mathematical Concepts vs. Abstractions Over Real-World Systems

post by Thane Ruthenis · 2025-02-18T18:04:46.717Z · LW · GW · 10 comments

Contents

  Motivation
  Ideas
10 comments

Consider concepts such as "a vector", "a game-theoretic agent", or "a market". Intuitively, those are "purely theoretical" abstractions: they don't refer to any specific real-world system. Those abstractions would be useful even in universes very different from ours, and reasoning about them doesn't necessarily involve reasoning about our world.

Consider concepts such as "a tree", "my friend Alice", or "human governments". Intuitively, those are "real-world" abstractions. While "a tree" bundles together lots of different trees, and so doesn't refer to any specific tree, it still refers to a specific type of structure found on Earth, and shaped by Earth-in-particular's specific conditions. While tree-like structures can exist in other places in the multiverse, there's an intuitive sense that any such "tree" abstraction would "belong" to the region of the multiverse in which the corresponding trees grow.

Is there a way to formalize this, perhaps in the natural-abstraction framework [? · GW]? To separate the two categories, to find the True Name of "purely theoretical concepts"?


Motivation

Consider a superintelligent agent/optimization process. For it to have disastrous real-world consequences, some component of it would need to reason about the real world. It would need to track where in the world it's embedded, what input-output pathways there are, and how it can exploit these pathways in order to hack out of the proverbial box/cause other undesirable consequences.

If we could remove its ability to think about "unapproved" real-world concepts, and make it model itself as not part of the world, then we'd have something plausibly controllable. We'd be able to pose it well-defined problems (in math and engineering, up to whatever level of detail we can specify without exposing it to the real world – which is plenty) and it'd spit out solutions to them, without ever even thinking about causing real-world consequences. The idea of doing this would be literally outside its hypothesis space!

There are tons of loopholes and open problems here, but I think there's promise too.


Ideas

(I encourage you to think about the topic on your own before reading my attempts.)

 

Take 1: Perhaps this is about "referential closure". For concepts such as "vectors" or "agents", we can easily specify the list of formal axioms that would define the frameworks within which these concepts make sense. For things like "trees", however, we would have to refer to the real world directly: to the network of causes and effects entangled with our senses.
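For instance, essentially the whole framework within which "a vector" makes sense fits in a few lines – the standard vector-space axioms over a field K (standard textbook material, included here just for concreteness):

    \forall\, u, v, w \in V:\quad (u + v) + w = u + (v + w),\quad u + v = v + u,\quad u + 0 = u,\quad u + (-u) = 0
    \forall\, a, b \in K,\ u, v \in V:\quad a(bv) = (ab)v,\quad 1 \cdot v = v,\quad a(u + v) = au + av,\quad (a + b)v = av + bv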

... Except that we more or less can, nowadays, specify the mathematical axioms underlying the processes generating our universe (something something Poincaré group). To a sufficiently advanced superintelligence, there'd be no real difference.

Take 2: Perhaps the intuitions are false, and the difference is quantitative, not qualitative.

"Vectors" are concepts such that there's a simple list of axioms under which they're simple to describe/locate: they have low Kolmogorov complexity. By comparison, "trees" have a simple generator, but locating them within that generator's output (the quantum multiverse) takes very many bits.

I guess this is kind of plausible – indeed, it's probably the null hypothesis – but it doesn't feel satisfying.

Especially the pessimistic case – the idea that there's a smooth "continuum" between the two kinds of abstraction – doesn't make sense to me. I think there's a big jump between "a human" and "an agent", and I don't see what abstractions could sit between them. (An abstraction over {humans, human governments, human corporations}, which is nevertheless more specific than "an agent in general"? Empirically, humanity hasn't been making use of this abstraction – we don't have a term for it – so it's evidently not convergently useful.)

Take 3: Causality-based definitions. Perhaps "theoretical abstractions" are convergently useful abstractions which can't be changed by any process within our universe (i. e., within the net of causes and effects entangled with our senses)? "Trees" can be wiped out or modified, "vectors" can't be.

This doesn't really work, I think – neither of the two approaches I tried panned out.

Intuitively, it feels like there's something to the "causality" angle, but I haven't been able to find a useful approach here.

Take 4: Perhaps this is about reoccurrence.

Consider the "global ontology" of convergently useful concepts [LW · GW] defined over our universe. A concept such as "an Earthly tree" appears in it exactly once: as an abstraction over all of Earth's trees (which are abstractions over their corresponding bundles-of-atoms which have specific well-defined places, etc.). "An Earthly tree", specifically, doesn't reoccur anywhere else, at higher or lower or sideways abstraction levels.

Conversely, consider "vectors" or "markets". They never show up directly. Rather, they serve as "ingredients" in the makeup of many different "real-world" abstractions. "Markets" can model human behavior in a specific shop, or in the context of a country, and in relation to many different types of "goods" – or even the behavior of biological and purely physical [LW(p) · GW(p)] systems.

Similar for "agents" (animals, humans, corporations, governments), and even more obviously for "vectors".
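Here's a toy sketch of the "reoccurrence" criterion, just to make it concrete – the ontology, names, and threshold below are all made up for illustration:

    # Toy "global ontology": each modeled system maps to the concept-ingredients
    # used in modeling it. (Entirely made-up data.)
    ontology = {
        "Earth's forests":        {"Earthly tree", "ecosystem"},
        "a village shop":         {"market", "agent"},
        "a national economy":     {"market", "agent"},
        "an ant colony":          {"agent"},
        "bacterial chemotaxis":   {"agent", "vector"},
        "stock prices over time": {"vector", "market"},
    }

    def occurrences(concept: str) -> int:
        """How many distinct systems use this concept as an ingredient."""
        return sum(concept in ingredients for ingredients in ontology.values())

    def looks_theoretical(concept: str) -> bool:
        """Take 4's guess: 'pure' concepts reoccur; 'real-world' ones appear once."""
        return occurrences(concept) > 1

    for concept in ["Earthly tree", "market", "vector", "agent"]:
        kind = "theoretical" if looks_theoretical(concept) else "real-world"
        print(f"{concept}: {kind} ({occurrences(concept)} occurrence(s))")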

Potential counterarguments:


Take 4 seems fairly promising to me, overall. Can you spot any major issues with it? Alternatively, a way to more properly flesh it out/formalize it?

10 comments


comment by Towards_Keeperhood (Simon Skade) · 2025-02-18T18:55:17.855Z · LW(p) · GW(p)

I have a binary distinction that is a bit different from the one you're drawing here. (To be clear, one might still draw another distinction like yours on top of it, but this might be relevant to your thinking.) I'll make a quick attempt to explain it, though I'm not sure these notes will be sufficient. (Feel free to ask for further clarification – ideally with partial paraphrases and examples of the parts you're unsure about.)

I distinguish between objects and classes:

  • Objects are concrete individual things. E.g. "your monitor", "the meeting you had yesterday", "the German government".
  • A class is a predicate over objects. E.g. "monitor", "meeting", "government".

The relationship between classes and objects is basically like in programming. (In language we can instantiate objects from classes through indicators like "the", "a", "every", "zero", "one", ..., plural "-s" inflection, the prepended possessive "'s", and perhaps a few more. Though these often only instantiate objects when the phrase is in the subject position; in the object position some of those keywords function a bit differently. I'm still exploring the details.)

In language semantics the sentence "Sally is a doctor." is often translated to the logic representation "doctor(Sally)", where "doctor" is a predicate and "Sally" is an object / a variable in our logic. From the perspective of a computer it might look more like adding a statement "P_1432(x_5343)" to our pool of statements believed to be true.

We can likewise say "The person is a doctor" in which case "The person" indicates some object that needs to be inferred from the context, and then we again apply the doctor predicate to the object.

The important thing here is that "doctor" and "Sally"/"the person" have different types. In formal natural language semantics "doctor" has type <e,t> and "Sally" has type "e". (For people interested in learning about semantics, I'd recommend this excellent book draft.[1])
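To spell out the "like in programming" analogy, here's a minimal sketch (the class names and the toy extension of "doctor" are mine, purely for illustration):

    from typing import Callable

    class Entity:
        """Type e: a concrete individual object."""
        def __init__(self, name: str):
            self.name = name

    # Type <e,t>: a predicate, i.e. a function from entities to truth values.
    Predicate = Callable[[Entity], bool]

    sally = Entity("Sally")            # an object, of type e

    _doctors = {"Sally", "Bob"}        # toy extension of the "doctor" class
    def doctor(x: Entity) -> bool:     # a class/predicate, of type <e,t>
        return x.name in _doctors

    # "Sally is a doctor."  ->  doctor(Sally)
    print(doctor(sally))               # True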

There might still be some edge cases to my ontology here, and if you have doubts and find some I'd be interested in exploring those.

Whether there's another crisp distinction between abstract classes (like "market") and classes that are less far upstream from sensory perceptions (like "tree") is a separate question. I don't know whether there is, though my intuition would be leaning towards no.

  1. ^

    I only read chapters 5-8 so far. Will read the later ones soon. I think for the people familiar with CS the first 4 chapters can be safely skipped.

comment by Martín Soto (martinsq) · 2025-02-19T10:59:30.188Z · LW(p) · GW(p)

I don't see how Take 4 is anything other than simplicity (in the human/computational language). As you say, it's a priori unclear whether an agent is an instance of a human or the other way around. You say the important bit is that you are subtracting properties from a human to get an agent. But how shall we define subtraction here? In one formal language, the definition of human will indeed be a superset of that of agent... but in another one it will not. So you need to choose a language. And the natural way forward every time this comes up (many times) is to just "weigh by Turing computations in the real world" (instead of choosing a different and odd-looking Universal Turing Machine), that is, a simplicity prior.
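(For reference, the simplicity prior gestured at here is the usual Solomonoff-style weighting – my restatement, not Martín's wording:)

    P_U(C) \;\propto\; 2^{-K_U(C)}

where U is the chosen universal Turing machine; switching to a different U changes K_U only by an additive constant, so the choice of "language" matters only up to a bounded factor.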

Replies from: Thane Ruthenis
comment by Thane Ruthenis · 2025-02-19T15:32:51.407Z · LW(p) · GW(p)

I expect that it's strongly correlated with simplicity in practice, yes. However, the two definitions could diverge.

Consider trees, again. You can imagine a sequence of concepts/words such that (1) each subsequent concept is simple to define in terms of the preceding concepts, (2) one of the concepts in the sequence is "trees". (Indeed, that's basically how human minds work.)

Now consider some highly complicated mathematical concept, which is likewise simple to define in terms of the preceding concepts. It seems plausible that there are "purely mathematical" concepts like this such that their overall complexity (the complexity of the chain of concepts leading to them) is on par with the complexity of the "trees" concept.
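One way to cash out "the complexity of the chain of concepts leading to them" (my gloss, using the rough subadditivity of description length):

    K(C_n) \;\lesssim\; K(C_1) \;+\; \sum_{i=2}^{n} K(C_i \mid C_1, \dots, C_{i-1})

meaning a tower of individually simple definitional steps can reach concepts whose total complexity rivals that of "trees".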

So an agent that's allowed to reason about concepts which are simple to define in its language, and which can build arbitrarily tall "towers" of abstractions, can still stumble upon ways to reason about real-life concepts. By comparison, if Take 4 is correct, and we have a reference global ontology on hand[1], we could correctly forbid it from thinking about concepts instantiated within this universe without crippling its ability to think about complex theoretical concepts.

(The way this would correlate with simplicity is if there's some way to argue that the only concepts that "spontaneously reoccur" at different levels of organization of our universe are those that are very simple. Perhaps because moving up/down abstraction levels straight-up randomizes the concepts you're working with. That would mean the probability of encountering two copies of a concept is inversely proportional to its bitwise "length".)

  1. ^

    Step 2: Draw the rest of the owl.

Replies from: caleb-biddulph
comment by CBiddulph (caleb-biddulph) · 2025-02-20T06:28:07.503Z · LW(p) · GW(p)

I was about to comment with something similar to what Martín said. I think what you want is an AI that solves problems that can be fully specified with a low Kolmogorov complexity in some programming language. A crisp example of this sort of AI is AlphaProof, which gets a statement in Lean as input and has to output a series of steps that causes the prover to output "true" or "false." You could also try to specify problems in a more general programming language like Python.
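For concreteness, here's a trivial example of the kind of fully formally specified input meant here – my toy Lean statement, not anything from AlphaProof:

    -- The problem is pinned down entirely by the formal statement; the output
    -- (a proof term) is checked mechanically, with no reference to the real world.
    theorem add_comm_example (a b : Nat) : a + b = b + a :=
      Nat.add_comm a b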

A "highly complicated mathematical concept" is certainly possible. It's easy to construct hypothetical objects in math (or Python) that have arbitrarily large Kolmogorov complexity: for example, a list of random numbers with length 3^^^3.

Another example of a highly complicated mathematical object which might be "on par with the complexity of the 'trees' concept" is a neural network. In fact, if you have a neural network that looks at an image and tells you whether or not it's a tree with very high accuracy, you might say that this neural network formally encodes (part of) the concept of a tree. It's probably hard to do much better than a neural network if you hope to encode the "concept of a tree," and this requires a Python program a lot longer than the program that would specify a vector or a market.
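A back-of-the-envelope version of that gap (the numbers below are made up, but representative orders of magnitude):

    # A 2D vector with addition and scaling: a few hundred bytes of Python.
    vector_spec = '''
    class Vec2:
        def __init__(self, x, y): self.x, self.y = x, y
        def __add__(self, o): return Vec2(self.x + o.x, self.y + o.y)
        def scale(self, c): return Vec2(c * self.x, c * self.y)
    '''
    print(len(vector_spec.encode()), "bytes to specify a vector")

    # A hypothetical tree-recognizing network: say ~10 million float32 weights.
    tree_net_params = 10_000_000
    print(tree_net_params * 4, "bytes of weights to (partially) encode 'tree'")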

Eliezer's post on Occam's Razor [LW · GW] explains this better than I could - it's definitely worth a read.

comment by quetzal_rainbow · 2025-02-18T20:31:31.816Z · LW(p) · GW(p)

I think there is an abstraction between "human" and "agent": "animal". Or, maybe, "organic life". Biological systematization (meaning all ways to systematize: phylogenetic, morphological, functional, ecological) is a useful case study for abstraction "in the wild".

Replies from: Thane Ruthenis
comment by Thane Ruthenis · 2025-02-18T20:53:13.750Z · LW(p) · GW(p)

Animals are still pretty solidly in the "abstractions over real-life systems" category for me, though. What I'm looking for, under the "continuum" argument, are any practically useful concepts which don't clearly belong to either "theoretical concepts" or "real-life abstractions" according to my intuitions.

Biological systematization falls under "abstractions over real-life systems" for me as well, in the exact same way as "Earthly trees". Conversely, "systems generated by genetic selection algorithms" is clearly a "pure concept".

(You can sort of generate a continuum here, by gradually adding ever more details to the genetic algorithm until it exactly resembles the conditions of Earthly evolution... But I'm guessing Take 4 would still handle that: the resultant intermediary abstractions would likely either (1) show up in many places in the universe, on different abstraction levels, and clearly represent "pure" concepts, (2) show up in exactly one place in the universe, clearly corresponding to a specific type of real-life systems, (3) not show up at all.)

Replies from: Jonas Hallgren
comment by Jonas Hallgren · 2025-02-18T21:48:12.094Z · LW(p) · GW(p)

A random thought that I just had, coming from more mainstream theoretical CS/ML, specifically Geometric Deep Learning: it concerns inductive biases from the perspective of different geodesics.

Like, they talk about using structural invariants to design the inductive biases of different ML models, so if we're talking about general abstraction learning, my question is whether it even makes sense without taking the underlying inductive biases you have into account?

Like maybe the model of Natural Abstractions always has to filter through one inductive bias or another and there are different optimal choices for different domains? Some might be convergent but you gotta use the filter or something?

As stated, a random thought, but I felt I should share it. Here's a quick overarching link on GDL if you wanna check it out more: https://geometricdeeplearning.com

comment by Charlie Steiner · 2025-02-19T10:21:38.564Z · LW(p) · GW(p)

I don't think this has much direct application to alignment, because although you can build safe AI with it, it doesn't differentially get us towards the endgame of AI that's trying to do good things and not bad things. But it's still an interesting question.

It seems like the way you're thinking about this, there's some directed relations you care about (the main one being "this is like that, but with some extra details") between concepts, and something is "real"/"applied" if it's near the edge of this network - if it doesn't have many relations directed towards even-more-applied concepts. It seems like this is the sort of thing you could only ever learn by learning about the real world first - you can't start from a blank slate and only learn "the abstract stuff", because you only know which stuff is abstract by learning about its relationships to less abstract stuff.
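A quick sketch of how I read that criterion (the graph and threshold are made up; this isn't Charlie's formalism):

    # A directed edge A -> B means "B is like A, but with some extra details",
    # i.e. B is more applied/concrete than A.
    relations = {
        "vector": ["velocity of a car", "price change of a stock"],
        "agent": ["human", "ant colony", "corporation"],
        "human": ["my friend Alice"],
        "my friend Alice": [],
        "velocity of a car": [],
    }

    def looks_applied(concept: str, threshold: int = 1) -> bool:
        """'Real'/'applied' = few outgoing edges toward even-more-applied concepts."""
        return len(relations.get(concept, [])) <= threshold

    for concept in relations:
        print(concept, "->", "applied" if looks_applied(concept) else "abstract")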

Replies from: Thane Ruthenis
comment by Thane Ruthenis · 2025-02-19T15:41:32.272Z · LW(p) · GW(p)

It seems like this is the sort of thing you could only ever learn by learning about the real world first

Yep. The idea is to try and get a system that develops all practically useful "theoretical" abstractions, including those we haven't discovered yet, without developing desires about the real world. So we train some component of it on the real-world data, then somehow filter out "real-world" stuff, leaving only a purified superhuman abstract reasoning engine.

One of the nice-to-have properties here would be if we don't need to be able to interpret its world-model to filter out the concepts – if, in place of human understanding and judgement calls, we can blindly use some ground-truth-correct definition of what is and isn't a real-world concept.

comment by cubefox · 2025-02-18T22:59:41.049Z · LW(p) · GW(p)

I think abstract concepts could be distinguished with higher-order logic (= simple type theory). For example, the logical predicate "is spherical" (the property of being a sphere) applies to concrete objects. But the predicate "is a shape" applies to properties, like the property of being a sphere. And properties/concepts are abstract objects. So the shape concept is of a higher logical type than the sphere concept. Or take the "color" concept, the property of being a color. In its extension are not concrete objects, but other properties, like being red. Again, concrete objects can be red, but only properties (like redness), which are abstract objects, can be colors. A tomato is not a color, nor can any other concrete (physical or mental) object be a color. There is a type mismatch.

Formally: Let the type of concrete objects be e (for "entity"), and the type of the two truth values (TRUE and FALSE) be t (for "truth value"), and let functional types, which take an object of type a and return an object of type b, be designated with <a,b>. Then the type of "is a sphere" is <e,t>, and the type of "is a shape" is <<e,t>,t>. Only objects of type e are concrete, so objects of type <e,t> (properties) are abstract. Even if there weren't any physical spheres, no spherical things like planets or soccer balls, you could still talk about the abstract sphere: the sphere concept, the property of being spherical.
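The same typing discipline, sketched in a proof assistant (my rendering; the names are placeholders):

    -- Entities get a base type; properties are predicates over entities; "shape"
    -- is a predicate over properties, one type level up.
    axiom Entity : Type                   -- the type e of concrete objects
    def Property := Entity → Prop         -- the type <e,t>
    axiom sphere : Property               -- "is spherical": applies to entities
    axiom Shape  : Property → Prop        -- "is a shape": type <<e,t>,t>

    -- An entity can satisfy `sphere`, but `Shape` applied to an entity is a type
    -- error -- only properties, like `sphere`, can be shapes.
    example : Prop := Shape sphere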

Now the question is whether all the (intuitively) abstract objects can indeed, in principle, be formalized as being of some complex logical type. I think yes. Because: What else could they be? (I know a way of analyzing natural numbers, the prototypical examples of abstract objects, as complex logical types. Namely as numerical quantifiers. Though the analysis in that case is significantly more involved than in the "color" and "shape" examples.)