Data visualisation is one of my favourite topics. We can examine data visualisation in terms of three “lenses.” The first lens is that of mathematical mapping. Typically we have a visual element that represents one or more real-world variables, and we want a clear mapping from one to the other – that is, a one-to-one mapping without too much distortion.
For example, Fisher’s Iris flower data set gives four measurements (sepal length, sepal width, petal length and petal width) for each of 150 flowers from three species of Iris. Multidimensional scaling can be used to map distances in the 4-dimensional data to distances in a 2-dimensional image (below). Flowers with similar measurements map to dots close together in the image, and flowers with wildly different measurements map to dots far apart. Some distortion is inevitable here, but the diagram clearly shows that Iris setosa flowers (se) are quite different from Iris virginica (vi) and Iris versicolor (ve), two species whose measurements sometimes overlap.
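For readers who would like to try this kind of mapping themselves, here is a minimal sketch of classical (Torgerson) multidimensional scaling in plain Python. It is only one variant of the technique and not necessarily the one behind the diagram. The six flowers are assumed to match the first two rows of each species in the commonly distributed copy of the data set, the function names are my own, and a simple power iteration stands in for a proper eigenvalue routine.

```python
import math

# Six flowers, two per species (assumed to match the commonly distributed data).
# Columns: sepal length, sepal width, petal length, petal width.
flowers = [
    (5.1, 3.5, 1.4, 0.2),  # se
    (4.9, 3.0, 1.4, 0.2),  # se
    (7.0, 3.2, 4.7, 1.4),  # ve
    (6.4, 3.2, 4.5, 1.5),  # ve
    (6.3, 3.3, 6.0, 2.5),  # vi
    (5.8, 2.7, 5.1, 1.9),  # vi
]
labels = ["se", "se", "ve", "ve", "vi", "vi"]


def double_centred_gram(points):
    """Convert squared pairwise distances into a centred inner-product matrix."""
    n = len(points)
    d2 = [[math.dist(p, q) ** 2 for q in points] for p in points]
    row_mean = [sum(row) / n for row in d2]
    grand_mean = sum(row_mean) / n
    return [[-0.5 * (d2[i][j] - row_mean[i] - row_mean[j] + grand_mean)
             for j in range(n)] for i in range(n)]


def top_eigenpair(matrix, iterations=1000):
    """Power iteration: approximate dominant eigenvalue and unit eigenvector."""
    n = len(matrix)
    vec = [float(i + 1) for i in range(n)]  # arbitrary non-constant start
    for _ in range(iterations):
        product = [sum(matrix[i][j] * vec[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in product))
        if norm == 0.0:
            break
        vec = [x / norm for x in product]
    product = [sum(matrix[i][j] * vec[j] for j in range(n)) for i in range(n)]
    value = sum(v * p for v, p in zip(vec, product))
    return value, vec


def classical_mds(points, dims=2):
    """Place each point in `dims` dimensions, roughly preserving distances."""
    gram = double_centred_gram(points)
    n = len(points)
    coords = [[] for _ in points]
    for _ in range(dims):
        value, vec = top_eigenpair(gram)
        scale = math.sqrt(max(value, 0.0))
        for i in range(n):
            coords[i].append(scale * vec[i])
        # Deflate: remove the component just found before looking for the next.
        gram = [[gram[i][j] - value * vec[i] * vec[j] for j in range(n)]
                for i in range(n)]
    return coords


coords = classical_mds(flowers)
for label, (x, y) in zip(labels, coords):
    print(f"{label}: ({x:.2f}, {y:.2f})")
```

With this sample, the two setosa flowers should land close together and well away from the other four, which is the same qualitative picture described above. The signs of the axes are arbitrary, so the plot may come out mirrored.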
In many visualisations, the mathematical mapping is defined by a colour scale, as in this map of fine particulate air pollution (credit: Aaron van Donkelaar, Dalhousie University).
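To make the idea of a colour scale as a mapping concrete, here is a small sketch of a sequential light-to-dark scale built by linear interpolation. The endpoint colours, the function names and the readings are all hypothetical choices of mine, not the scale used in the map.

```python
def lerp(a, b, t):
    """Linear interpolation between a and b for t in [0, 1]."""
    return a + (b - a) * t


def colour_for(value, vmin, vmax, low=(255, 255, 204), high=(128, 0, 38)):
    """Map a value in [vmin, vmax] to an RGB triple on a light-to-dark scale."""
    t = (value - vmin) / (vmax - vmin)
    t = min(1.0, max(0.0, t))  # clamp out-of-range values to the ends of the scale
    return tuple(round(lerp(lo, hi, t)) for lo, hi in zip(low, high))


levels = [0, 20, 40, 60, 80]  # hypothetical readings on an arbitrary scale
colours = [colour_for(v, 0, 80) for v in levels]
for level, colour in zip(levels, colours):
    print(level, colour)
```

Within the chosen range, distinct readings get distinct colours and darker always means higher, which is the kind of clear mapping we are after. The clamping step is where the mapping stops being one-to-one: every reading above the maximum gets the same darkest colour.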
The second lens is that of cognitive psychology. We build visualisations for people, not Martians, and people see some things more easily than others. A great deal has been written on this topic. To pick just one example, rainbow scales like the one above have been strongly criticised. They fail when turned into grey-scale, they may over-emphasise the transition at whichever value is coded yellow, and they can also confuse people with colour-blindness. It must be said, however, that the rainbow scale above (an improved version) combines hue and brightness in a way that actually makes good sense to people with most forms of colour-blindness (see the simulation below of what a person with red/green colour-blindness would see). Consequently, the rainbow scale above probably works better for this dataset than many of the alternative colour scales would. On the other hand, the nonlinearity in the colour scale is rather confusing.
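The grey-scale complaint is easy to check numerically. The sketch below builds a naive rainbow by sweeping hue from blue to red at full saturation (this is the textbook rainbow, not the improved scale used in the map) and estimates how bright each colour would look in grey-scale. The 0.299/0.587/0.114 weights are the usual rough luma approximation, and the function names are my own.

```python
import colorsys


def rainbow(t):
    """Naive HSV rainbow: t=0 is blue (hue 240 degrees), t=1 is red (hue 0)."""
    hue = (1.0 - t) * 240.0 / 360.0
    return colorsys.hsv_to_rgb(hue, 1.0, 1.0)


def luma(rgb):
    """Rough grey-scale brightness of an RGB colour with components in [0, 1]."""
    r, g, b = rgb
    return 0.299 * r + 0.587 * g + 0.114 * b


samples = [i / 8 for i in range(9)]
lumas = [luma(rainbow(t)) for t in samples]

rising = all(a <= b for a, b in zip(lumas, lumas[1:]))
falling = all(a >= b for a, b in zip(lumas, lumas[1:]))
is_monotonic = rising or falling
brightest_t = samples[lumas.index(max(lumas))]

for t, y in zip(samples, lumas):
    print(f"t={t:.3f}  luma={y:.3f}")
print("monotonic in brightness:", is_monotonic)
print("brightest at t =", brightest_t)
```

The brightness goes up and down along the scale rather than changing steadily, so a grey-scale copy cannot tell low values from high ones. The peak falls at yellow, which is why the eye tends to see a boundary there whether or not the data has one.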
The final lens is that of design. Visualisations that are mathematically and cognitively satisfactory may still be boring or ugly. Although the design lens is a little more subjective, there are many examples of good visualisation design out there. The tide prediction infographic by Kelvin Tow below (submitted to the 2013 Information is Beautiful Awards) is just one. The classic books by Edward Tufte also have a lot of good advice in this area.