Pencil charts for visualising colours

As a result of a discussion with a photographer friend of mine, I’ve been thinking (not for the first time) about visualising the colour palette of images. Consider this sunset, for example (a picture I took in Adelaide 8 years ago):

The photograph is rich in yellow and orange. However, the apparent blue in the sky is actually grey, and the apparent grey of the sea is actually brown. If we postulate a standard set of 35 plausible pencil colours and map each pixel to the closest-matching pencil colour (with closeness measured in RGB space), we get this:
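For readers who want to try the mapping step, here is a minimal Python sketch of nearest-colour matching in RGB space. The four pencil colours and the function names are hypothetical stand-ins (the actual set of 35 pencils is not listed here), and I am assuming plain Euclidean distance as the measure of closeness.

```python
# Hypothetical four-pencil set; stand-ins for the 35 pencil colours.
PENCILS = {
    "orange": (255, 140, 0),
    "grey": (128, 128, 128),
    "brown": (139, 69, 19),
    "blue": (0, 0, 255),
}


def squared_rgb_distance(a, b):
    """Squared Euclidean distance between two (r, g, b) triples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def closest_pencil(pixel, pencils=PENCILS):
    """Return the name of the pencil whose colour is nearest to the pixel."""
    return min(pencils, key=lambda name: squared_rgb_distance(pixel, pencils[name]))


# A desaturated, slightly bluish sky pixel lands on grey rather than blue.
print(closest_pencil((120, 130, 140)))
```

Squared distance is enough here because only the ordering of the distances matters, not their actual values.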

Then we can visualise the colour palette of the image by showing the wear on the virtual pencils, as if each virtual pencil had been used to colour its corresponding pixels. A lot of orange, brown, and grey was used:
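The wear calculation can be sketched in a few lines of Python. The scaling rule is purely my assumption for illustration: the most-used pencil is worn down to half its length, and the others wear in proportion to their pixel counts. The function name pencil_wear is hypothetical.

```python
from collections import Counter


def pencil_wear(assigned_pencils, full_length=1.0):
    """Return the remaining length of each pencil, given one pencil name per pixel.

    Assumption: the most-used pencil loses half its length, and every other
    pencil loses length in proportion to how many pixels it coloured.
    """
    counts = Counter(assigned_pencils)
    if not counts:
        return {}
    heaviest = max(counts.values())
    return {
        name: full_length * (1 - 0.5 * n / heaviest)
        for name, n in counts.items()
    }


# Toy image of twelve pixels, already mapped to pencil names.
pixels = ["orange"] * 6 + ["brown"] * 3 + ["grey"] * 3
wear = pencil_wear(pixels)
print(wear)
```

Drawing each pencil at its remaining length, shortest for the most-used colour, then gives the chart.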

Conversely, this beach scene (photographed in Vanuatu in 2016) is rich in blues:

The warm light greys of the beach don’t quite find an exact match among the pencils, but the other colours match fairly well:

And here is the pencil visualisation:

If, rather than using a standard set of colours, we extract the pencil colours from the image itself (image quantisation), fewer pencils will, of course, be required:

The fit to the original image will be much closer as well:
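One common way to extract colours from the image itself is k-means clustering of the pixels. The Python sketch below is an assumption about method, not a record of what was used for the pictures here. The function name kmeans_palette and the toy pixel values are hypothetical.

```python
def kmeans_palette(pixels, k, iterations=10):
    """Cluster (r, g, b) pixels into at most k palette colours with plain k-means."""
    # Deterministic start: the first k distinct colours in the image.
    centres = list(dict.fromkeys(pixels))[:k]
    for _ in range(iterations):
        groups = [[] for _ in centres]
        for p in pixels:
            nearest = min(
                range(len(centres)),
                key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centres[i])),
            )
            groups[nearest].append(p)
        # Move each centre to the mean of its group; keep it if the group is empty.
        centres = [
            tuple(sum(ch) / len(g) for ch in zip(*g)) if g else c
            for g, c in zip(groups, centres)
        ]
    return [tuple(round(ch) for ch in c) for c in centres]


# Toy image: two orange-ish pixels and two blue-ish pixels.
pixels = [(250, 120, 0), (254, 124, 4), (10, 10, 200), (14, 6, 204)]
palette = kmeans_palette(pixels, k=2)
print(palette)
```

Each palette colour is the average of the pixels assigned to it, which is why the fit is closer than with a fixed pencil set.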

So this is a trick to remember for another day – pencil visualisations!


Colours in national flags

The infographic above shows the most common colour in various national flags, excluding white and black. For example, red is the most common colour in the US flag. If there are two or more equally common colours (as in BE = Belgium or FR = France), the country is given partial credit for both. Similar colours are grouped using k-means clustering in R.
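The partial-credit rule can be sketched as follows. This is in Python rather than R, and it assumes the colours have already been grouped into named labels. The function name colour_credit is hypothetical, and the input lists are simplified: one label per equal-sized stripe, not real pixel data.

```python
from collections import Counter


def colour_credit(flag_colours, ignored=("white", "black")):
    """Split one flag's single vote equally among its most common colours.

    flag_colours holds one colour label per equal-sized region (or per pixel);
    white and black are left out of the count.
    """
    counts = Counter(c for c in flag_colours if c not in ignored)
    if not counts:
        return {}
    top = max(counts.values())
    winners = [c for c, n in counts.items() if n == top]
    return {c: 1 / len(winners) for c in winners}


# Simplified stripes: France ties on blue and red, Belgium on yellow and red.
print(colour_credit(["blue", "white", "red"]))
print(colour_credit(["black", "yellow", "red"]))
```

Summing these credits over all countries gives the totals behind a ranking of the most popular flag colours.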

Overall, shades of red seem the most popular, followed by shades of blue. The set of flag image files I analysed wasn’t fantastic, however, and that may have skewed the results slightly.


Colour in literature

The chart below extends my previous colour analysis to an even more mixed collection of books. On the right are books with many descriptive passages involving colour, and thus a high frequency of colour words (calculated without excluding stop words this time). At the top of the chart are books with large colour vocabularies (counting colour words used twice or more). The dots show the most common colour word in each book.

Results are consistent with the fact that the most common colour words in English (in decreasing order of frequency) are black, white, red, green, blue, yellow, brown, grey, pink, orange, and purple. However, Anne of Green Gables and The Wonderful Wizard of Oz have “green” as the most common word for plot-related reasons, while The Blue Castle by L.M. Montgomery has, not surprisingly, “blue.” The Merry Adventures of Robin Hood by Howard Pyle has “scarlet,” some uses of which are as the name “Will Scarlet.” Twenty Thousand Leagues Under the Sea I have already discussed.
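The two measures on the chart, colour-word frequency and colour vocabulary, can be sketched in Python as below. The word list is just the eleven common colour words listed above, so it is an assumption and much shorter than a real analysis would need (it lacks "scarlet", for instance). The function name colour_stats and the sample sentence are hypothetical.

```python
import re
from collections import Counter

# Assumed word list: only the eleven most common English colour words.
COLOUR_WORDS = {
    "black", "white", "red", "green", "blue", "yellow",
    "brown", "grey", "pink", "orange", "purple",
}


def colour_stats(text, colour_words=COLOUR_WORDS):
    """Return colour-word frequency, vocabulary size, and the top colour word."""
    # Crude tokeniser: runs of letters, lower-cased; stop words are kept.
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(w for w in words if w in colour_words)
    return {
        "frequency": sum(counts.values()) / len(words) if words else 0.0,
        # Vocabulary counts only colour words used twice or more.
        "vocabulary": sum(1 for n in counts.values() if n >= 2),
        "most_common": counts.most_common(1)[0][0] if counts else None,
    }


sample = "The green door led to a green field under a blue sky."
stats = colour_stats(sample)
print(stats)
```

A simple word match like this cannot tell a colour from a name, which is how "Will Scarlet" ends up in the count for Robin Hood.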


Colour in children’s novels

Following up on the children’s literature theme again, here is an analysis of colour words in three quite different books:

About 0.57% of the words in Twenty Thousand Leagues Under the Sea (after excluding stop words) are colour words, with a wide variety being used (“the finback whale, yellowish brown, the swiftest of all cetaceans” and “Portuguese men-of-war that let their ultramarine tentacles drift in their wakes, medusas whose milky white or dainty pink parasols were festooned with azure tassels”):

In contrast, Five Go Adventuring Again has only about 0.25% colour words, mostly used in clichéd ways (“Anne went very red” and “her blue eyes glinting”). The one use of “scarlet” refers to “scarlet fever,” rather than to a colour:

The Wonderful Wizard of Oz mentions colour even more than the other two books, with about 1.21% colour words. Green and yellow are particularly common, given the storyline:


Computational anthropology – Sororities and colour rules

A recent blog post about US sororities published the above slide from Sigma Delta Tau, dating from 2013 and outlining a palette of acceptable dress colours.

Founded a century ago, Sigma Delta Tau is a historically Jewish sorority with an interest in philanthropy. Their slogan is “empowering women,” and in some way that I cannot possibly understand, this is achieved partly through extremely detailed guidelines on attire. However, the slide above does make a good case study in computational anthropology.

Whenever we have a set of OK/Not OK pronouncements like these, decision tree learning is a good tool for extracting the underlying pattern (I used the rpart package in R). For colours, we can perform analysis using hue, saturation, and value. In this case, the first restriction computed by the tool (and reinforced in the text of the slide) is “don’t go too light” – the sorority requires a colour value below about 79. The second restriction in the decision tree is “not too blue” – specifically a hue lower than about 182. Saturation is not identified as important in the decision tree analysis.
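The two splits can be written as a small Python function. The thresholds are the ones quoted above, but the scales are my assumption: hue in degrees (0 to 360) and value as a percentage (0 to 100). The function name is_acceptable is hypothetical, and this sketch only applies the learned rule; it does not reproduce the rpart fitting itself.

```python
import colorsys


def is_acceptable(r, g, b, max_value=79, max_hue=182):
    """Apply the two decision-tree splits to an 8-bit RGB colour."""
    h, _s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
    hue_degrees = h * 360    # assumed scale: 0 to 360
    value_percent = v * 100  # assumed scale: 0 to 100
    # Saturation is ignored, since the tree did not split on it.
    return value_percent < max_value and hue_degrees < max_hue


print(is_acceptable(128, 0, 0))      # dark red: low value, hue 0
print(is_acceptable(255, 255, 255))  # white: value 100, too light
print(is_acceptable(0, 0, 128))      # navy: hue 240, too blue
```

A rule like this summarises only the examples on the slide, so colours far from those examples may be classified unreliably.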

The diagram below highlights the acceptable colour region and the specific examples from the slide above. Of course, this only gives clarity to what the social rule is. It does not explain why the social rule exists, or what social goals the rule might achieve. For that, we must turn to traditional anthropology – although even here, social simulation can provide computational assistance.