Complexity and Randomness revisited

I have posted before (post 1 and post 2) about order, complexity, and randomness. The image above shows the spectrum from organised order to random disorder, with structured complexity somewhere in between. The three textual examples below illustrate the same idea.

Regular Complex Random
AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA … It was the best of times, it was the worst of times, it was the age of wisdom, it was the age of foolishness, it was the epoch of belief, it was the epoch of incredulity, it was the season of Light, it was the season of Darkness, it was the spring of hope, it was the winter of despair, … ShrfT e6IJ5 eRU5s nNcat qnI8N m-cm5 seZ6v 5GeYc w2jpg Vp5Lx V4fR7 hhoc- 81ZHi 5qntn ErQ2- uv3UE MnFpy rLD0Y DI3GW p23UF FQwl1 BgP36 RK6Gb 6lpzR nV03H W5X3z 2f1u8 OgpXy tY-6H HkwEU s0xLN 9W8H …

These three examples, and many intermediate cases, can be distinguished by the amount of information they contain. The leading way of measuring that information is with Kolmogorov complexity. The Kolmogorov complexity of a block of text is the length of the shortest program producing that text. Kolmogorov complexity is difficult to calculate in practice, but an approximation is the size of the compressed file produced by good compression software, such as 7-Zip. The chart below shows the number of bytes (a byte is 8 bits) for a compressed version of A Tale of Two Cities, a block of the letter ‘A’ repeated to the same length, and a block of random characters of the same length:

The random characters are chosen to have 64 possible options, which need 6 bits to describe, so a compression to about 75% of the original size is as expected. The novel by Dickens compresses to 31% of its original size.

But does this chart show information? Grassberger notes that Kolmogorov complexity is essentially just a measure of randomness. On this definition, random-number generators would be the best source of new information – but that’s not what most people mean by “information.”

An improvement is to introduce an equivalence relation “means the same.” We write X ≈ Y if X and Y have the same meaning. In particular, versions of A Tale of Two Cities with different capitalisation have the same meaning. Likewise, all meaningless random sequences have the same meaning. The complexity of a block of text is then the length of the shortest program producing something with the same meaning as that text (i.e. the complexity of X is the length of the shortest program producing some Y with X ≈ Y).

In particular, the complexity of a specific block of random text is the length of the shortest program producing random text (my R program for random text is 263 bytes), and we can approximate the complexity of A Tale of Two Cities by compressing an uppercase version of the novel. This definition of complexity starts to look a lot more like what we normally mean by “information.” The novel contains a large amount of information, while random sequences or “AAAAA…” contain almost none:

Those who hold that information satisfies the rule ex nihilo nihil fit can thus be reassured that random-number generators cannot create new information out of nothing. However, if we combine random-number generators with a selection procedure which filters out anything that “means the same” as meaningless sequences, we can indeed create new information, as genetic algorithms and genetic programming have demonstrated – although Stuart Kauffman and others believe that the evolution of biological complexity also requires additional principles, like self-organisation.