What's the minimal additive constant for Kolmogorov Complexity that a programming language can achieve?
post by Noosphere89 (sharmake-farah) · 2023-12-20T15:36:50.968Z · LW · GW · 10 comments
This is a question post.
Contents
Answers 20 interstice 0 RogerDearnaley None 10 comments
This question came out of the posts below, which discuss how the additive-constant issue affects arguments about simplicity. It made me want to ask about Kolmogorov Complexity:
https://www.lesswrong.com/posts/uWdAKyHZMfoxDHcCC/simplicity-arguments-for-scheming-section-4-3-of-scheming#What_is__simplicity__ [LW · GW]
https://en.wikipedia.org/wiki/Kolmogorov_complexity#Invariance_theorem
What's the lowest additive constant that can be achieved by a programming language while still being Turing-Complete?
Or equivalently, what's the least overhead that you must have in order to describe an object like a computer program or string?
Bonus points for not only establishing a lower bound, but either finding a programming language that achieves the lower bound and showing the source, or actually creating a programming language that achieves the lower bound.
As a side question, I'd also like estimates of the overhead (the additive constant) for popular languages like Python, Java, C, C++, and Rust.
Answers
The constant is defined between pairs of languages, and it tells you "how many bits does it take to emulate language A in language B". So it doesn't make sense to talk about "the" constant of a language; it's always relative to whichever other language you are comparing it to.
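For reference, the invariance theorem linked in the question can be written roughly as below. The notation is an assumption for illustration, not from the post: K_A and K_B are complexities measured in languages A and B, and c_{A,B} is the length of an interpreter for B written in A.

```latex
K_A(x) \le K_B(x) + c_{A,B} \quad \text{for all strings } x
```

The constant depends on the pair (A, B) but not on x, which is why it only makes sense relative to a second language.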
↑ comment by Noosphere89 (sharmake-farah) · 2023-12-20T20:15:56.777Z · LW(p) · GW(p)
So the answer is "it depends on the languages involved".
I thought it was talking about the overhead to describe an object in an absolute sense, but it turns out the constant is related to the difficulty of language emulation.
Replies from: niplav↑ comment by niplav · 2023-12-21T18:49:29.000Z · LW(p) · GW(p)
Well, maybe you could create a graph that contains the two numbers for each pair of languages, and then use methods such as HodgeRank (implementation), the uncovered set, or the top cycle to derive a single number for each language, which would give you a simplicity comparison between languages. Ideas (with a little bit more detail) here [LW(p) · GW(p)] and here [LW(p) · GW(p)] ("Towards the Best Programming Language for Universal Induction").
Fun hypothesis: I suspect that doing this, or constructing a prior over programming languages that gets updated according to observations (a sort of two-level AIXI), collapses UDASSA into egoism, because it ends up favoring the programming language that says "my observation is the output of the empty program [LW(p) · GW(p)]".
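A minimal sketch of the pairwise-graph idea, under loud assumptions: the cost numbers are made up, `cost[i][j]` is a hypothetical "bits to emulate language j in language i", and the scoring is only the complete-graph, equal-weight special case of the least-squares ranking that HodgeRank is built on, not the full method.

```python
def least_squares_scores(cost):
    """Turn pairwise emulation costs into one score per language.

    cost[i][j] is a hypothetical number of bits needed to emulate
    language j inside language i. Lower score = "simpler" language.
    """
    n = len(cost)
    # flow[i][j] > 0 means j is costlier to reach from i than i is from j,
    # so j looks more complex than i. The flow is skew-symmetric.
    flow = [[cost[i][j] - cost[j][i] for j in range(n)] for i in range(n)]
    # On a complete graph with equal weights, the least-squares potentials
    # s with s[j] - s[i] ~ flow[i][j] and sum(s) == 0 are the column means.
    return [sum(flow[i][k] for i in range(n)) / n for k in range(n)]


# Made-up costs for three hypothetical languages 0, 1, 2.
cost = [
    [0, 10, 30],
    [4, 0, 25],
    [2, 3, 0],
]
scores = least_squares_scores(cost)
print(scores)
```

With these invented numbers, language 0 comes out simplest and language 2 most complex. Whether real emulation costs give a stable ranking is exactly the open question.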
Replies from: sharmake-farah↑ comment by Noosphere89 (sharmake-farah) · 2023-12-21T20:05:52.404Z · LW(p) · GW(p)
So does that mean you worked a little on the additive constant issue I talked about in the question?
Replies from: niplav↑ comment by niplav · 2023-12-22T15:51:31.150Z · LW(p) · GW(p)
"Worked" as in "I thought a bit and have ideas that were shot down by others, but some intuitions", yes—motivated by this podcast which contains a good explanation of the issues. I've been mainly motivated by philosophical problems with AIXI & Solomonoff induction, not by anything concrete, though. And it doesn't seem super important, so I haven't written any of it up.
Practical programming languages are generally designed to try to reduce the Kolmogorov complexity of most common tasks, while pretty much ignoring the additive constant for the language itself. This strongly encourages the additive constant for the language itself to be large, by adding a great many libraries useful for many types of tasks. For estimating the actual Kolmogorov complexity of a specific task for something like the Universal Prior, you're better off starting with one of the simplest Turing tarpits with a low initial additive constant, and digging yourself out of the tarpit to the minimal extent actually required for that one specific task.
10 comments
Comments sorted by top scores.
comment by cousin_it · 2023-12-20T15:47:49.105Z · LW(p) · GW(p)
I'm not sure the question makes sense.
For an analogy, consider the family of functions sin(x+c) for different values of c. Any of them is within an additive constant from any other (just take the constant equal to 2). But none of them has the "lowest additive constant", since they're all just shifted versions of each other.
Replies from: sharmake-farah↑ comment by Noosphere89 (sharmake-farah) · 2023-12-20T15:54:28.288Z · LW(p) · GW(p)
So the constant value c is 2, then, at least once we discard the random variable x. I just checked with a calculator, and sin(2+2) is roughly equal to -0.76, rounded to 2 decimal places, and sin(4) got the same value, so the lowest constant is 2, unless it's lower than that, and I suspect that it still holds for all values of the random variable x.
Replies from: cousin_it, cousin_it↑ comment by cousin_it · 2023-12-20T16:09:48.209Z · LW(p) · GW(p)
Ah true, in this case we can take the maximum constant to be 2.
Let's try a different example. Consider the functions |x-c| for different values of c. Then if you take any two such functions, |x-a| and |x-b|, their difference is bounded by a constant |a-b| for all x. But it's not clear how to say which function has the "lowest additive constant".
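Both analogies can be spot-checked numerically. This is only a sketch on a finite grid, with arbitrarily chosen shifts `a` and `b`; it illustrates the bounds rather than proving them.

```python
import math


def max_gap(f, g, xs):
    """Largest absolute difference between f and g over the sample points."""
    return max(abs(f(x) - g(x)) for x in xs)


# Sample points from -50 to 50; the shifts a and b are arbitrary choices.
xs = [i / 100 for i in range(-5000, 5001)]
a, b = 1.5, -4.0

# |x-a| and |x-b| never differ by more than |a-b|.
abs_gap = max_gap(lambda x: abs(x - a), lambda x: abs(x - b), xs)
# sin(x+a) and sin(x+b) never differ by more than 2.
sin_gap = max_gap(lambda x: math.sin(x + a), lambda x: math.sin(x + b), xs)

print(abs_gap, abs(a - b))
print(sin_gap)
```

In both families every pair is within a constant of each other, yet nothing in the check singles out one member as having the "lowest" constant.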
Replies from: sharmake-farah↑ comment by Noosphere89 (sharmake-farah) · 2023-12-20T16:21:07.406Z · LW(p) · GW(p)
Yeah, I'm really hoping Kolmogorov Complexity isn't like this. When you say it's not clear that there's a minimum/maximum constant for a function, does that mean it's known to lack a minimum or maximum, or that nobody can prove whether it has one?
I'd say the main difference is that it involves adding, not subtracting. I suspect there's a trivial minimum equal to 0, because a programming language can't subtract from the Kolmogorov Complexity: it's defined as the length of the shortest program in a Turing-Complete language that outputs a given object. The challenge is whether that minimum of 0 is actually attained, or whether the lower bound is higher than that for a programming language that is still Turing-Complete.
Replies from: cousin_it↑ comment by cousin_it · 2023-12-20T16:40:04.300Z · LW(p) · GW(p)
I think K-complexity is actually like that. For any given object you can define a Turing-complete language in which the empty program outputs that object. It could literally be "Python, except the empty program outputs this specific object".
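Here is a toy sketch of "Python, except the empty program outputs this specific object". The names `make_language` and `run` are hypothetical, and program output is modeled as captured stdout, which is an assumption made for illustration.

```python
import contextlib
import io


def make_language(special_output):
    """Build a toy language: Python, except the empty program is special.

    The returned run() maps program text to its output string.
    """
    def run(program):
        if program == "":
            # The zero-length program outputs the chosen object.
            return special_output
        # Everything else behaves like ordinary Python (never exec
        # untrusted input in real code).
        buffer = io.StringIO()
        with contextlib.redirect_stdout(buffer):
            exec(program, {})
        return buffer.getvalue()
    return run


run = make_language("some specific object")
print(run(""))            # the chosen object, from a zero-length program
print(run("print(1 + 1)"))
```

In this language the chosen object has a shortest program of length 0, while the language stays as expressive as Python. Repeating the construction for any other object gives a different language with the same property.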
Replies from: sharmake-farah, Viliam↑ comment by Noosphere89 (sharmake-farah) · 2023-12-20T16:44:10.740Z · LW(p) · GW(p)
So when you say that K-complexity is like the function that you described earlier, does that mean that there's provably no minimum, or does that mean that you can't prove that the minimal constant for Kolmogorov Complexity exists?
Replies from: cousin_it↑ comment by cousin_it · 2023-12-20T16:50:47.895Z · LW(p) · GW(p)
It's still possible that I'm misunderstanding the question, but if it means what I think it means, then the answer is "provably no minimum".
Replies from: sharmake-farah↑ comment by Noosphere89 (sharmake-farah) · 2023-12-20T16:58:19.446Z · LW(p) · GW(p)
I can accept this as an answer, though you'd have to show why Kolmogorov Complexity's additive constant lacks a minimum in more detail than you have in this comment thread.
↑ comment by Viliam · 2023-12-20T21:19:52.585Z · LW(p) · GW(p)
However, "Python, except the empty program outputs this specific object" would probably be more complex than "Python", in most programming languages. So I wonder whether it would be possible to define objective complexity as an eigenvector (not sure I am using the right word here) of relative complexities. As in: "simple" means "simple, when programmed in a simple language".