Kolmogorov Complexity Lays Bare the Soul

post by jakej (jake-jenks) · 2023-12-01T18:29:57.379Z · LW · GW · 8 comments

Kolmogorov complexity is a measure of the algorithmic complexity of a particular piece of information: the length of the shortest possible representation of that information in some fixed description language. For example, if I have a string of characters AAAAA, I can create a compressed representation 5A. The Kolmogorov complexity of AAAAA is then at most 2, because I can represent the same information with only two characters using the shorter version 5A. If no shorter representation exists, 5A is the Kolmogorov string of AAAAA, and the Kolmogorov complexity of AAAAA is exactly 2.

Kolmogorov complexity can also give us a sense of the relative complexity of different pieces of information. In this toy encoding, AAABB can only be compressed down to 4 characters, such as 3A2B, for a Kolmogorov complexity of 4, making it twice as complex as AAAAA.
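The toy scheme in these examples is run-length encoding. A minimal Python sketch, assuming a hypothetical helper named `run_length_encode` and input text that contains no digits, reproduces both encodings. The length it reports is only an upper bound under this one scheme; Kolmogorov complexity proper is defined relative to a universal description language, such as programs for a fixed computer.

```python
from itertools import groupby


def run_length_encode(text: str) -> str:
    """Encode each run of identical characters as <count><character>.

    Assumes the input contains no digits, so counts stay unambiguous.
    """
    return "".join(f"{len(list(run))}{char}" for char, run in groupby(text))


for sample in ("AAAAA", "AAABB"):
    encoded = run_length_encode(sample)
    print(sample, "->", encoded, "length", len(encoded))
```

Running it prints 5A with length 2 and 3A2B with length 4, matching the numbers in the text.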

Fractals look complex, and indeed, storing an image of a fractal can take up a lot of space. However, fractals are constructed by following simple rules, and a computer program to execute those instructions can be many orders of magnitude smaller than the stored image itself. This means that the Kolmogorov complexity of fractals can actually be quite low, despite appearances.
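As a rough illustration of that gap, here is a minimal Python sketch that draws a Sierpinski triangle as text. The choice of fractal and the function name `sierpinski` are assumptions made for this example, not something taken from the post. The rule is a single bitwise test per cell, yet the output quadruples in size each time `order` goes up by one.

```python
def sierpinski(order: int) -> str:
    """Render a Sierpinski triangle as text, 2**order rows by 2**order columns."""
    size = 2 ** order
    return "\n".join(
        # A cell is filled exactly when its row and column share no set bits.
        "".join("*" if (x & y) == 0 else " " for x in range(size))
        for y in range(size)
    )


print(sierpinski(3))  # small enough to view in a terminal
picture = sierpinski(8)
print(len(picture), "characters of output from a rule that fits in a few lines")
```

The program text stays the same size no matter how large a picture it is asked to produce, which is the sense in which the picture has low complexity.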

It occurred to me to ask - do I have a Kolmogorov complexity? My body and my mind both contain only a finite amount of information, so both could be described in exhaustive detail with only a finite amount of text. A brute-force description of every fiber of my being does not strike me as especially interesting. If I were able to discover my Kolmogorov string though, that does strike me as having some interesting properties.

Such a string would represent my purest and most distilled essence. It would contain every single detail of my existence - since even a single missing bit would disqualify it from being my Kolmogorov string by definition. It is also completely unique, being the shortest possible string. It is so lean that not a single redundant detail remains - otherwise, it would not be the shortest string.

So here we have a total representation of my entire being from which not a single iota can be added or removed; an utterly perfect representation.

One could examine this string and ask any question about my life and get the correct answer in exact detail. The details contained within it would be so exquisite that, even if I were to be utterly destroyed, it would be enough to construct an exact replica of me.

One’s Kolmogorov string is an ethereal thing, likely too slippery and ineffable to be truly captured in reality. But, in principle, it does exist and could be found with enough effort. Everything that I am - my memories, my scars, my skills - could be condensed down into its purest essence of information. Despite capturing everything about me, my Kolmogorov string would be a static, unfeeling thing of pure information. Only when that information is situated within a moving, dynamic body would it feel and think.

What is a soul, if not this?

It strikes me just how similar this is to the traditional conception of a soul. Here we have a theoretical object that can perfectly and uniquely capture one’s identity in an abstract and immaterial way. It totally defines us, and yet can never truly be measured.

It also seems to me to be fairly compatible with just about any philosophical background without too much effort. Certainly, an omniscient being would have no difficulty finding one’s Kolmogorov string. At the same time, no appeal to the supernatural is necessary to point out that such a string must almost certainly exist for every individual.

What else could these strings tell us? How big is my soul - what is left of me after you strip away all the fat? How much has my soul grown over my life? How would it change after a major life event? How does it compare to other souls? Would these souls contain similar structures, or would they all be fairly unique? How close can we really get to approximating one? Would someone really be dead if their Kolmogorov string were known?

 

https://sigil.substack.com/p/kolmogorov-complexity-lays-bare-the

8 comments

comment by Zack_M_Davis · 2023-12-02T13:51:22.902Z · LW(p) · GW(p)

Kolmogorov complexity has the counterintuitive property that an ensemble can be simpler than any one of its members. The shortest description of your soul isn't going to directly specify your soul; rather, it's going to be a description of our physical universe plus an "address" that points to you.

Replies from: Vladimir_Nesov, TAG
comment by Vladimir_Nesov · 2023-12-02T14:09:20.766Z · LW(p) · GW(p)

Quantum nondeterminism is going to make an address not much better than compressing the local content directly, searching for the thing rather than at a location. And to the extent laws of physics follow from the local content anyway (my mind holds memories of observing the world and physics textbooks), additionally specifying them does nothing. So it is unclear whether the salience of laws of physics in shortest descriptions is correct.

comment by TAG · 2023-12-02T17:44:58.296Z · LW(p) · GW(p)

Kolmogorov complexity has the counterintuitive property that an ensemble can be simpler than any one of its members

Yes, but that's usually when the ensemble is infinite.

comment by Viliam · 2023-12-03T14:36:54.091Z · LW(p) · GW(p)

It is also completely unique, being the shortest possible string. It is so lean that not a single redundant detail remains - otherwise, it would not be the shortest string.

I don't think this is necessarily true. (Though I am not sure about it.) I imagine there could be two different compression strategies that both happen to produce a result of the same length, but cannot be merged.

To use your example, "AAABB" can be compressed to either "3A2B" or "3ABB", both containing 4 characters. Knowing that "2B" and "BB" represent the same thing doesn't allow you to exploit this "redundancy" to further reduce it to one character.
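The tie can be checked with a small decoder. This is a minimal Python sketch, assuming a hypothetical `decode` function in which a digit repeats the character that follows it and any other character stands for itself; single-digit counts are assumed.

```python
def decode(description: str) -> str:
    """Expand a description in which a digit repeats the character after it.

    Characters without a digit in front stand for themselves.
    Assumes single-digit counts and no digits in the decoded text.
    """
    output = []
    count = None
    for symbol in description:
        if symbol.isdigit():
            count = int(symbol)
        else:
            output.append(symbol * (count if count is not None else 1))
            count = None
    return "".join(output)


for candidate in ("3A2B", "3ABB"):
    print(candidate, "->", decode(candidate), "length", len(candidate))
```

Both candidates decode to AAABB and both are four characters long, so under this decoder neither one is "the" shortest description.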

Also, some parts of my body and mind are more important than others -- the exact shape of all my hairs at this moment is a lot of data (not easy to compress, because there is a lot of randomness involved), and in a second the shape will be different anyway, and even if you cut my hair short it would still be "me" (at least I do not experience existential horror whenever I get a haircut). Also not sure if gut flora should be included.

I guess my point is that even relatively useless things can require many bits of information and you actually don't need them, some lossy compression would suffice, but if you overdo it, you get The Fly.

Replies from: jake-jenks
comment by jakej (jake-jenks) · 2023-12-04T00:00:26.993Z · LW(p) · GW(p)

I imagine there could be two different compression strategies that both happen to produce a result of the same length, but cannot be merged.

I think this is correct, but I think of this as being similar to chirality - multiple symmetric versions of the same essential information. I think it also probably depends on the description language you use, so maybe in one language something might have multiple versions, but in another it wouldn't?

Replies from: Viliam
comment by Viliam · 2023-12-04T09:25:43.185Z · LW(p) · GW(p)

Yes, if there is no deep underlying reason why the two minimal descriptions should be the same length, and it "just happened", I would assume that with a slightly different description language it would not happen.

Even the "3A2B" vs "3ABB" example would stop working if encoding a number used a different number of bits than encoding a character.
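A minimal Python sketch of that point, assuming a hypothetical cost model in which every digit costs `digit_bits` and every other character costs `char_bits` (the specific bit widths are made up for illustration):

```python
def description_cost(description: str, digit_bits: int, char_bits: int) -> int:
    """Total bits for a description when digits and characters cost differently."""
    return sum(digit_bits if s.isdigit() else char_bits for s in description)


for digit_bits, char_bits in ((8, 8), (4, 8)):
    costs = {d: description_cost(d, digit_bits, char_bits) for d in ("3A2B", "3ABB")}
    print(f"digit={digit_bits} bits, char={char_bits} bits:", costs)
```

With equal widths both descriptions cost 32 bits; with 4-bit digits, "3A2B" costs 24 bits and "3ABB" costs 28, so the tie disappears.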

comment by quiet_NaN · 2023-12-02T15:34:25.690Z · LW(p) · GW(p)

First off, the most important practical property of the Kolmogorov complexity is that it is not computable. This means that there is no general algorithm to determine what is the shortest computer program to generate a given output. (The crux is that you cannot run all programs below a certain length to see if any of them will generate that output, because some of these programs will run forever, while others might run for a non-computably long time and eventually print that output and terminate.)

This is the reason why we have myriads of heuristic compression algorithms instead of just using Kolmogorov compression, which would yield the best possible results. 
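What a heuristic compressor does give is an upper bound. The following is a minimal Python sketch using the standard-library `zlib` module; the two sample inputs and their sizes are arbitrary choices for illustration.

```python
import random
import zlib

repetitive = b"A" * 1000
# Seeded pseudo-random bytes: they look like noise to zlib, even though
# "this generator with seed 0" is itself a short description of them.
rng = random.Random(0)
noisy = bytes(rng.getrandbits(8) for _ in range(1000))

for label, data in (("repetitive", repetitive), ("noisy", noisy)):
    compressed = zlib.compress(data, 9)
    print(f"{label}: {len(data)} bytes -> {len(compressed)} bytes")
```

The repetitive input shrinks to a small fraction of its size while the pseudo-random input does not shrink at all. The second case also shows why the bound can be loose: a short description exists (the generator plus its seed), but a general-purpose compressor has no way of finding it.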

It is also completely unique, being the shortest possible string.

No, there can be more than one such string. 

Even leaving issues of quantum physics aside, macroscopic physical objects like humans are unlikely to be very compressible (information-wise, that is). The author might feel that the number of lead atoms in their 36 molar tooth is not part of their Kolmogorov string, but I would argue that it is certainly part of a complete description.

Humans are not the product of running some simple starting conditions in some deterministic cellular automaton, but shaped by a chaotic environment full of probabilistic interactions. Mess not math, if you will. 

In practice, that fine level of detail is not actually what I care about. Just like I listen to lossy compressed music, I would be fine with being uploaded into a somewhat lossy representation of myself where I don't have any lead atoms in my teeth.

Replies from: jake-jenks
comment by jakej (jake-jenks) · 2023-12-04T00:09:10.634Z · LW(p) · GW(p)

Even leaving issues of quantum physics aside, macroscopic physical objects like humans are unlikely to be very compressible (information-wise, that is). The author might feel that the number of lead atoms in their 36 molar tooth is not part of their Kolmogorov string, but I would argue that it is certainly part of a complete description.

I don't know, just how compressible are we? I agree that the lead in my 36 molar is a part of my description, but anomalies such as these are always going to be the hardest part of compression since noise is not compressible. So maybe a complete description would look more like "all of the usual teeth, with xyz lead anomalies".
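A description of that shape can be sketched as a shared template plus a list of differences. This is a minimal Python sketch; the function names, the tooth labels, and the values are all made up for illustration.

```python
def delta_encode(template: dict, individual: dict) -> dict:
    """Keep only the entries where the individual differs from the template.

    Assumes both dictionaries have the same keys.
    """
    return {key: value for key, value in individual.items() if template[key] != value}


def delta_decode(template: dict, delta: dict) -> dict:
    """Rebuild the full description from the template plus the anomalies."""
    return {**template, **delta}


# Hypothetical data: 32 teeth, one of which has an anomaly.
usual_teeth = {f"tooth_{i}": "no lead" for i in range(32)}
my_teeth = {**usual_teeth, "tooth_5": "some lead"}

anomalies = delta_encode(usual_teeth, my_teeth)
print(anomalies)
print(delta_decode(usual_teeth, anomalies) == my_teeth)
```

The per-person part of the description grows only with the number of anomalies, provided the template itself is shared or cheap to describe.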

In practice, that fine level of detail is not actually what I care about. Just like I listen to lossy compressed music, I would be fine with being uploaded into a somewhat lossy representation of myself where I don't have any lead atoms in my teeth.

The "noise" of lead atoms in your teeth is among the least important bits in your Kolmogorov string, and would be the first to be dropped if you decided to allow a lossy representation. This reminds me of overfitting, actually. The first things a model tries to learn are the actual useful bits, and then later on, when you train too long, it starts to memorize the random noise in the dataset.