Thinking about complexity

I was recently involved in a discussion on complexity. Complexity seems like a natural idea – “abababababababababababababababababababababababababababababab” is a simple sequence of letters, while “Whether ‘tis nobler in the mind to suffer the slings and arrows of outrageous fortune” is a complex one. Actually formalising this idea is a little tricky, however. As with some other concepts (time, for example), we recognise complexity when we see it, but actually defining complexity is difficult. One of the leading approaches is Kolmogorov complexity.

Roughly, Kolmogorov complexity measures the complexity of a sequence by the simplest program that can generate that sequence, which is a formal way of finding the simplest description of the sequence. For example, that first sequence is simple because it can be described as “ab”×30.

The simplest programming language I know is combinatory logic (much simpler even than Turing machines). Combinatory logic has all the theoretical power of any other language, but programs are composed only of brackets and the constants S and K (which I will treat as equivalent to 0 and 1). The brackets satisfy (x y) z = x y z. Execution proceeds by term rewriting, and there are just two execution rules:

S x y zx z (y z)K x yx

We can set up a version of Kolmogorov complexity based on combinatory logic. Through Gödel numbering, each program in combinatory logic is associated with a natural number. Execution of the programs gives finite or infinite sequences of S, K, and brackets. I will treat S as 0 and K as 1, ignoring the brackets. I will treat finite sequences as repeating.

Program #0 in the Gödel numbering is just the constant S, which terminates immediately and is equivalent to producing the sequence “0,0,0,…” Program #1 is just the constant K, equivalent to “1…” A slightly more complex program is #96, which is S S K K S, and executes as follows:

1. S S K K SS K (K K) S

2. S K (K K) SK S ((K K) S) = K S (K K S)

3. K S (K K S)S

Execution stops with S in this case, which is equivalent to producing the sequence “0…”

I will take the simplest program producing a sequence to be the first program in the Gödel numbering, with the complexity of the sequence being the number of bits in the Gödel number. The two sequences “0…” and “1…” therefore both have complexities of one bit. Repeating pairs have complexities of 2 or 3 bits, while repeating triples have complexity 4 to 6 (see below). Because programs may produce unproductive infinite loops, actual calculation of the complexity of a sequence is not always possible.

Sequence Program Complexity
010101010101010101010101010101 = 01… #3 2
101010101010101010101010101010 = 10… #4 3
010010010010010010010010010010 = 010… #8 4
001001001001001001001001001001 = 001… #9 4
100100100100100100100100100100 = 100… #13 4
011011011011011011011011011011 = 011… #15 4
101101101101101101101101101101 = 101… #25 5
110110110110110110110110110110 = 110… #49 6

Some more complex sequences (having non-trivial descriptions) are listed below. The highest complexity will be associated with random sequences – which implies that random-number generators are machines for creating complexity. In the images at the top of the page, the random pattern of mineral flecks in granite therefore form the most complex pattern. That may or may not be what we intended the word “complexity” to mean.

Sequence Program Complexity
010000100001000010000100001000 = 01000… #141 8
001010010100101001010010100101 = 00101… #153 8
010110101101011010110101101011 = 01011… #167 8
000110010001100100011001000110 = 00011001… #183 8
010000100010000100010000100010 = 010000100… #189 8
001000001000001000001000001000 = 001000… #198 8
000101001000101001000101001000 = 000101001… #201 8
001000010010000100100001001000 = 00100001… #216 8
001000100100010010001001000100 = 0010001… #233 8
010100101010010101001010100101 = 0101001… #251 8
010101101010101101010101101010 = 010101101… #275 9
001100011000110001100011000110 = 00110… #305 9
000010000001000000100000010000 = 0000100… #333 9
001110110011101100111011001110 = 00111011… #359 9
010110010110010110010110010110 = 010110… #369 9
001010001010001010001010001010 = 001010… #392 9
010100011010001010100011010001 = 010100011010001… #407 9
000001010000100000101000010000 = 0000010100001… #425 9
001000010001000010001000010001 = 001000010… #434 9
010001001000100100010010001001 = 0100010… #465 9
001001101001001001101001001001 = 001001101001… #471 9

Eight unsolved mathematical problems

Eight things to think about…

1) Are there there infinitely many twin prime pairs p and p+2? We have 3/5, 5/7, 11/13, 17/19, etc. Does that sequence go on forever?

2) Are there there infinitely many Sophie Germain prime pairs p and 2p+1? We have 2/5, 3/7, 5/11, 11/23, etc. Does that sequence go on forever?

3) Is every even number greater than 4 the sum of two odd primes? We have 6 = 3 + 3, 8 = 3 + 5, 10 = 3 + 7, 12 = 5 + 7, etc. Does that always work?

4) Are there infinitely many perfect numbers? We have 6 = 1 + 2 + 3, 28 = 1 + 2 + 4 + 7 + 14, 496, 8128, 33550336, etc. Does that sequence go on forever? And are there any odd numbers in the sequence?

5) Is π a normal number? That is, do the digits 0 to 9 occur equally often in 3.14159265358979323846264338327950288419716939937510582…, and does that also work for bases other than decimal?

6) Do the nontrivial zeros of the Riemann zeta function all have real part ½?

7) Does P = NP? This is perhaps the most important (and most famous) unsolved problem in computer science, and there is a million-dollar prize for solving it.

8) Do smooth solutions to the Navier–Stokes equations always exist? There is a million-dollar prize for this one too.

Visual modalities and data visualisation

In this post we will look at some visual modalities for data visualisation – size, colour, and shape. People use these all the time, but not always well. We will use the following matrix of numbers as an example, encoding the numbers in different ways (large values are highlighted in the image on the right):

    

Using shape to convey magnitude does not reveal the pattern here, since the triangles do not stand out from the diamonds and squares (the relationship between magnitude and shape is also rather arbitrary):

However, both size and brightness succeed in getting the message across (size works best):

    

On the other hand, varying only the hue or the saturation of colours does a poor job:

    

Combining hue and saturation variation with brightness variation, as in the various sequential ColorBrewer palettes, works very well – which is why this is recommended for colouring maps:

    

See also my three lenses on data visualisation post from last year and this paper by Cynthia Brewer on colour use guidelines.

A weak BICEP2?

Earlier this year, the team behind the Background Imaging of Cosmic Extragalactic Polarization instrument, or BICEP2 (photo below by “Amble”), reported the discovery of primordial gravitational waves – supporting the theory of cosmic inflation. It now seems that their observations could instead be explained by cosmic dust polarisation. In a recent Nature column, Princeton physicist Paul Steinhardt outlines some ideas for analysing the results of future experiments of this kind. He also makes the rather disturbing suggestion that inflationary theory may be unfalsifiable. If that is true, is it really science at all?

The Queen’s Birthday honours list for Australia

      

The Queen’s Birthday honours list for Australia is out. Appointments of scientists to the Order of Australia include (among others):

  • Dr Megan Clark, AC – “For eminent service to scientific research and development through fostering innovation, to science administration through strategic leadership roles, and to the development of public policy for technological sciences.”
  • Professor Marc Feldmann, AC, FAA – “For eminent service to medicine and to public health as an acclaimed researcher in the field of chronic immune disease, and through the development of innovative treatment therapies.”
  • Professor Richard Gibbs, AC – “For eminent service to science and academic medicine as a leading researcher, author and scholar, particularly in the field of genetics and human genome sequencing, and as a mentor of emerging scientists.”
  • Dr Ian Allison, AO, AAM – “For distinguished service to the environment as a glaciologist, to furthering international understanding of the science of the Antarctic region, and to climate research.”
  • Professor Nicholas Hoogenraad, AO – “For distinguished service to science education and technological development, particularly in the fields of biochemistry and molecular biology.”
  • Professor Philip Lake, AO – “For distinguished service to conservation and the environment as an ecologist and freshwater scientist, and to research and professional organisations.”
  • Professor Barry Ninham, AO – “For distinguished service to physical sciences through landmark theoretical and practical advances in colloids and surfaces, and as an academic, educator and mentor.”
  • Professor Ian Ritchie, AO – “For distinguished service to science in the field of chemistry and hydrometallurgy, as an academic and educator, and to fostering technical innovation in business and industry.”

Caenorhabditis elegans: a model organism

Caenorhabditis elegans (photo above by Bob Goldstein, diagram below by “KDS444”) is a transparent nematode worm, about 1 mm in length. It lives naturally in the soil, where it eats bacteria, but it is also quite happy to make its home in a Petri dish. A 1963 suggestion by Sydney Brenner led to C. elegans becoming the focal point of a vast collaborative effort to understand the worm in detail. Brenner shared the 2002 Nobel Prize in Physiology or Medicine for this work.

The cellular development of C. elegans has been mapped in detail, and its genome had been largely mapped by 1998. The diagram below shows the neural network of the worm, drawn using R, based on data from here (from this paper via this one). In this diagram, colour shows the centrality of neurons in the network. Other information on C. elegans is available at wormbase.org.

Because of the effort that has gone into understanding this humble worm as whole, rather than as just parts, a great deal has been learned about biology in general. Brenner was on to a good thing!

Three lovely scanning electron microscope (SEM) images


A wonderful image of pollen from sunflower, morning glory, hollyhock, lily, primrose, and castor bean plants (Dartmouth Electron Microscope Facility 2011, colourised by William Crochot).


Image of a strawberry by Annie Cavanagh and David McCarthy of the School of Pharmacy, University of London. This beautifully detailed image was created in 2010 by stitching several different SEM images together. I have mentioned this image before. It comes via via Wellcome Images, and more of the story is here.


A human human T-lymphocyte (white blood cell), colourised blue (NIAID 2010).