from Hacker News

Kolmogorov Complexity and Compression Distance (2023)

by rgbimbochamp on 3/30/24, 6:20 PM with 99 comments

  • by tromp on 3/30/24, 7:30 PM

    > let’s assume that there exists a universal language U

    Why not specify it?

    > That gives us the true language-agnostic definition of Kolmogorov Complexity as follows:

    Choosing the language of Turing Machines does not make the definition language agnostic.

    Aiming for the simplest definition of description complexity, I instead based my definitions on the older computational model of lambda calculus in [1].

    Unlike the assumed UTM above, the universal lambda machine is easy to describe in detail:

        (λ 1 1) (λ λ λ 1 (λ λ λ λ 3 (λ 5 (3 (λ 2 (3 (λ λ 3 (λ 1 2 3))) (4 (λ 4 (λ 3 1 (2 1)))))) (1 (2 (λ 1 2)) (λ 4 (λ 4 (λ 2 (1 4))) 5)))) (3 3) 2) (λ 1 ((λ 1 1) (λ 1 1)))
    
    Furthermore, it admits nearly identical definitions of several variants of descriptional complexity, namely

    1) plain complexity

    2) prefix complexity

    3) monotone complexity

    all of which have their application in Algorithmic Information Theory [2].

    [1] https://gist.github.com/tromp/86b3184f852f65bfb814e3ab0987d8...

    [2] https://homepages.cwi.nl/~paulv/kolmogorov.html

  • by arketyp on 3/30/24, 7:08 PM

    Richard von Mises (brother of the economist) formulated a definition of randomness as a sequence of data on which, were you a gambler, no betting strategy could make you money. This was before computability theory, and the idea was later developed by Kolmogorov and others into algorithmic complexity. The modern variation would be (per Wikipedia) "considering a finite sequence random (with respect to a class of computing systems) if any program that can generate the sequence is at least as long as the sequence itself".
  • by rhelz on 3/30/24, 7:32 PM

    I think it's bootless to try to define the "minimum possible" Kolmogorov complexity. Here's why:

    1. Note, Kolmogorov complexity is defined by the length of the shortest program which prints out the string. What counts is the number of instructions, not the complexity of those instructions.

    2. So say S is a very complex string. We can always construct a Turing machine which prints out S given a zero-length program: it just starts in a state which prints out S when you turn it on, and then halts.

    3. So there is no such thing as a Turing machine which prints out every string with a shorter program than every other Turing machine does, QED.

    That's the bad news. The good news is we don't even need one. For any string S, say that M and N are any two universal Turing machines. Then there is always some constant C, independent of S, for which KM(S) <= KN(S) + C, with C the length of the program required to emulate N on M.

    We are used to abstracting out constant sums and constant factors like this. The strings we are dealing with (as a species) are growing exponentially in length--that's why we went from 8-bit computers to 16-bit and beyond. So as the length of S goes to infinity, the difference between its complexity on any two machines becomes negligible.
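
    Spelled out, under the comment's setup: invariance runs both ways, so |KM(S) - KN(S)| <= C for a constant C (the larger of the two emulation constants) that does not depend on S; then |KM(S) - KN(S)| / |S| <= C / |S| -> 0 as |S| grows, and the machine-dependence washes out in relative terms.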

  • by anonzzzies on 3/30/24, 8:43 PM

    Kolmogorov complexity is a lovely subject and one of the more influential ones in my life.

    THE book https://link.springer.com/book/10.1007/978-0-387-49820-1 is absolutely a thing to read. It was for me 30 years ago, and it has aged well.

  • by wood_spirit on 3/30/24, 7:04 PM

    An excellent rabbit hole to dive into is the equivalence of compression and general AI. Every programmer should make a compressor (and, separately, a ray tracer)!

    See http://prize.hutter1.net/

  • by JDEW on 3/30/24, 7:05 PM

    > It has been demonstrated that KC(x) can be reasonably estimated by the number of bits required to encode x using a compressor C (such as gzip)

    Talk about a cliffhanger :)

    Using [0] you get 32B for Alice and 40B for Bob.

    [0] It has been demonstrated that KC(x) can be reasonably estimated by the number of bits required to encode x using a compressor C (such as gzip)
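
    A minimal sketch of this kind of estimate, using Python's gzip module. The two flip sequences below are hypothetical stand-ins (the article's exact strings aren't quoted here), and gzip's fixed header overhead dominates at these tiny lengths, so don't expect to reproduce the 32B/40B figures:

        import gzip

        # Hypothetical 20-flip sequences: one irregular, one all heads.
        alice = "THTHHTTHTHHHTTTHHTTH"
        bob = "HHHHHHHHHHHHHHHHHHHH"

        for name, seq in [("Alice", alice), ("Bob", bob)]:
            print(name, len(gzip.compress(seq.encode())), "bytes")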

  • by causal on 3/30/24, 6:43 PM

    Confused how the interesting number paradox proves KC cannot be computed.
  • by davesque on 3/30/24, 8:40 PM

    Something I've always noticed with the notion of Kolmogorov complexity is that the question of determining the lowest level of computation is problematic.

    For example, in the article, the author first defines the basic idea of KC. But then they correctly point out that the basic idea depends very much on the exact language that is chosen. So they describe how theorists have defined the notion of universal computation. But even this adjustment doesn't seem to escape the fact that we still depend on a system of mathematical symbols to describe the theory. And the notion of a Turing machine itself depends on other abstract concepts such as time and space, each with its own inherent, conceptual complexity. What sorts of minds (i.e. brains) are required to make sense of the theory, and what physical system is required for them to operate correctly? If the definition of KC includes a notion of how complex the Turing machine is that is required to compute a string, then the further down you go, the smaller the difference in complexity should be between any one string and another. After all, they all exist in the same universe!

    I guess it just goes to show how much the idea of KC lives in the realm of theory. As soon as you pose the question of complexity so abstractly, you invite in all kinds of theoretical considerations that make the meaning more slippery. That's why KC really doesn't deserve to be compared to Shannon entropy as it often is.

    But let me draw a comparison anyway, like I said you shouldn't! Because Alice from the article could also have made a strong argument against Bob by just pointing out that the Shannon entropy of his string was lower, which bears directly on the counts of heads and tails and the likelihood of seeing a particular count.
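
    One way to make that comparison concrete is the empirical (plug-in) Shannon entropy of each string; the sequences below are hypothetical stand-ins for the article's:

        import math
        from collections import Counter

        def empirical_entropy(seq):
            """Shannon entropy, in bits per symbol, of seq's empirical
            symbol distribution."""
            n = len(seq)
            return sum((c / n) * math.log2(n / c) for c in Counter(seq).values())

        print(empirical_entropy("THTHHTTHTHHHTTTHHTTH"))  # 1.0 (10 heads, 10 tails)
        print(empirical_entropy("HHHHHHHHHHHHHHHHHHHH"))  # 0.0 (all heads)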

  • by copx on 3/30/24, 7:22 PM

    >Bob claims that since the probability of getting both his and Alice’s sequence is the same (2^-20), it proves that there was no foul-play involved.

    ...and Bob is 100% right.

    >Bob credits his excellent luck. Alice is smart and cannot be easily convinced. She gets back at Bob by claiming that probability cannot be used in this context as it reveals no information regarding the randomness of the obtained sequences. One can take a quick glance at the obtained sequences and easily point out that Alice’s sequence is more random than Bob’s sequence.

    No, it is not. Given a perfectly random coin toss, Bob's sequence is indeed just as likely as Alice's sequence, and in no way "less random", because both sequences result from the same randomness with equal probability.

    A nice example of human intuition being at odds with probability math, though. Bob's result seems less likely but it really is not. Which reminds me that I actually had to write my own computer simulation of the Monty Hall Problem before I was willing to believe the correct answer. I think (most?) human brains have a bug in the "understanding probability" subroutine.
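
    For the curious, a minimal version of that kind of simulation (not the commenter's actual code) looks like this; the host always opens a goat door, so switching wins exactly when the first pick was wrong:

        import random

        def monty_hall(trials=100_000):
            """Estimate win rates for staying vs. switching."""
            stay = switch = 0
            for _ in range(trials):
                car = random.randrange(3)   # door hiding the car
                pick = random.randrange(3)  # contestant's first choice
                if pick == car:
                    stay += 1    # staying wins only if the first pick was right
                else:
                    switch += 1  # otherwise switching wins
            print(f"stay: {stay / trials:.3f}  switch: {switch / trials:.3f}")

        monty_hall()  # ~0.333 vs ~0.667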

  • by mojomark on 3/30/24, 7:28 PM

    I'm going to keep reading (because I love the KC topic), but I'd appreciate anyone confirming whether the following are errors in this article:

    1.) Conflating the terms "random" and "complex". After all, a set of "randomly" drawn sample permutations from an alphabet are all equally likely. However, their "complexity" may differ, which is basically the point of the article; yet "more random" and "less random" keep being used to refer to permutations with more or less "complexity", which I think is likely to perpetuate confusion on this topic.

    2.) From the article: "Moreover, a string cannot be compressed if its KC(x)≥|x|". Shouldn't the expression accompanying this statement be KC(x)=|x| ?

  • by pizza on 3/30/24, 7:06 PM

    I think maybe another way to put this is that Alice's number is in a typical set [0] of the distribution of bitstrings whereas Bob's might not be. Depending on the tolerance, the typical set can have near-total coverage of the distribution. Another way of making this about compression is that a random code that could encode typical-set strings well will probably suffer some overhead when encoding Bob's, but it will encode most strings close to optimally.

    [0] https://en.wikipedia.org/wiki/Typical_set
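
    One caveat: for a fair coin every n-bit sequence has probability 2^-n, so the entropy-based (weak) typical set contains all of them; the distinction here is better captured by strong (frequency) typicality. A sketch, with hypothetical stand-in sequences and illustrative parameters:

        def strongly_typical(seq, p_heads=0.5, eps=0.1):
            """Strong (letter) typicality: the empirical frequency of
            heads must lie within eps of the source probability."""
            freq = seq.count("H") / len(seq)
            return abs(freq - p_heads) <= eps

        print(strongly_typical("THTHHTTHTHHHTTTHHTTH"))  # True: 10/20 heads
        print(strongly_typical("HHHHHHHHHHHHHHHHHHHH"))  # False: 20/20 heads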

  • by alfanick on 3/30/24, 8:08 PM

    A side question: is this taught in the CS curricula you know? It was at my uni (a fairly good one, in a small European country), and that experience biases me: I assume every CS grad knows Kolmogorov complexity.
  • by yamrzou on 3/30/24, 6:56 PM

    Well, I reached the end of the article (interesting, btw), and I'm still not convinced why Bob can't claim that there was no foul play involved and that he got his result due to excellent luck.
  • by avmich on 3/30/24, 7:30 PM

    > Another thing to note is that Kolmogorov complexity of a string cannot be computed. There cannot exist a computer that will always guarantee the Kolmogorov complexity for all the strings.

    Sounds a bit puzzling. Surely for a particular programming language we can enumerate all programs, ordered by length etc., and check which is the shortest one producing the given string. So what's uncomputable here? For long strings that could take a long time, but - ?
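
    A sketch of that naive search, using Brainfuck as the "particular programming language" (an arbitrary choice) with a hard step budget. The budget is exactly where computability breaks: some enumerated programs never halt, and by the halting problem no computable step limit is safe for all of them, so the search may miss the true shortest program and can't certify minimality:

        import itertools

        def run_bf(code, step_limit):
            """Run a Brainfuck program (no input) under a step budget.
            Returns its output if it halts in time, else None."""
            stack, jump = [], {}
            for i, c in enumerate(code):
                if c == "[":
                    stack.append(i)
                elif c == "]":
                    if not stack:
                        return None  # unbalanced brackets
                    j = stack.pop()
                    jump[i], jump[j] = j, i
            if stack:
                return None
            tape, ptr, out, ip, steps = [0] * 64, 0, [], 0, 0
            while ip < len(code):
                steps += 1
                if steps > step_limit:
                    return None  # gave up; it may loop forever
                c = code[ip]
                if c == "+": tape[ptr] = (tape[ptr] + 1) % 256
                elif c == "-": tape[ptr] = (tape[ptr] - 1) % 256
                elif c == ">": ptr = (ptr + 1) % len(tape)
                elif c == "<": ptr = (ptr - 1) % len(tape)
                elif c == ".": out.append(chr(tape[ptr]))
                elif c == "[" and tape[ptr] == 0: ip = jump[ip]
                elif c == "]" and tape[ptr] != 0: ip = jump[ip]
                ip += 1
            return "".join(out)

        def naive_kc(target, step_limit=1000):
            """Length of the shortest program found that prints target.
            Programs that would halt only after step_limit steps are
            silently missed, so the result is not guaranteed minimal."""
            for n in itertools.count(1):
                for prog in itertools.product("+-<>.[]", repeat=n):
                    if run_bf("".join(prog), step_limit) == target:
                        return n

        print(naive_kc("\x01\x01"))  # 3, e.g. "+.."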

  • by robrenaud on 3/31/24, 4:23 PM

    If you want something like Kolmogorov complexity for molecules, check out assembly theory. I am a CS person, but there are interesting, related ideas here.

    https://en.m.wikipedia.org/wiki/Assembly_theory

  • by Ono-Sendai on 3/31/24, 12:31 AM

    Kolmogorov Complexity does not help with giving a universal measure of complexity or randomness: https://forwardscattering.org/page/0
  • by marius_k on 3/30/24, 7:10 PM

    Sometimes I wonder what the smallest program to generate human DNA would be. How many operations would it take, and how would that compare to the number of iterations evolution actually ran in the real world?