All about fixed points

In mathematics, a fixed point of a function f is a value x such that f(x) = x. For example, Wolfram notes that cos(0.7390851332) = 0.7390851332. There need not be such a fixed point, of course, but it is interesting when there is.

Unique fixed points

The Banach fixed-point theorem applies to complete metric spaces – sets where there is a concept of distance and where convergent sequences have a limit.

If f is a function that shrinks things – if, say, the distance between f(x) and f(y) is at most 50%, or 90%, or 99.99999% of the distance between x and y – then the function f will not only have a fixed point, but that fixed point will be unique.

Intuitively, this is clear. You apply the function f to the whole set an infinite number of times, and everything shrinks down to a single point satisfying f(x) = x.

Comic 688 by xkcd, with the fixed point highlighted in pink

It follows that if you stand inside a country, city, or home holding (horizontally) a map of that country, city, or home, then exactly one point in the country, city, or home is exactly underneath the corresponding point on the map. More usefully, this theorem has been used to assign meaning to recursive type expressions in computer science.

Iterated function systems

Iterated function systems are one way of defining fractals. An iterated function system is a family of functions, each of which shrinks things (i.e. is contractive). This family of functions defines a Hutchinson operator on sets of points, and that operator is in turn contractive for a certain definition of distance between sets (see Hutchinson, 1981).

Consequently, that operator has a unique “fixed point,” which is in fact a set of points, like the famous Barnsley fern. The set of points can be (approximately) generated by an iterative process (see code example here):

Least fixed points

As another example of fixed points in action, the Kleene fixed-point theorem assigns meaning to recursively defined computer programs, like these:

  • x = x, a circular definition which is satisfied by infinitely many things, defines x = ⊥, the least defined of all those things
  • z = 0 ⊙ (1 + z), where “” means putting a number in front of a list, defines the infinite list z = 0, 1, 2, 3, 4, 5, …
  • g = function (x) { if (x = 0) then 1 else x * g(x − 1) }, defines g to be the factorial function

Computer programs are modelled by Scott-continuous functions on a partial order where ⊥ is the least element (⊥ means “no information” and is the meaning of any computer program stuck in an infinite loop). Scott-continuous functions may have more than one fixed point, but they have a unique least fixed point, which is the limit of the sequence ⊥ ⊑ f(⊥) ⊑ f(f(⊥)) ⊑ …

This sequence corresponds to what actually happens when the computer attempts to execute recursive definitions like the three in red above.

Brouwer’s fixed point theorem

Even more interesting is Brouwer’s fixed-point theorem. Any continuous function mapping a compact convex set to itself has at least one fixed point.

This is hard to prove in general, though easy to see in the case that the set is simply the interval [0, 1]. We have f(0) − 0 ≥ 0 and f(1) − 1 ≤ 0, so that f(x) − x = 0 for at least one x in the range 0 to 1:

An implication is that, when I stir my cup of coffee, at least one speck of liquid winds up exactly where it started.

Fixed points and dynamics

If the function f represents a state transition function, then a fixed point corresponds to a state that remains the same over time. For example, the logistic mapx (1 − x) has fixed points 0 and 0.75. However, these are both unstable – if you start out close to them, repeated application of the function takes you further away:

  • 0.000001, 0.000004, 0.000016, 0.000064, 0.000256, 0.001024, 0.00409, 0.016295, 0.064117, 0.240023, 0.729649, 0.789046, 0.66581, 0.890028, 0.391511, 0.952921, 0.179452, 0.588995, 0.96832, 0.122706, 0.430598, …
  • 0.749999, 0.750002, 0.749996, 0.750008, 0.749984, 0.750032, 0.749936, 0.750128, 0.749744, 0.750512, 0.748975, 0.752045, 0.745893, 0.758147, 0.733441, 0.782021, 0.681856, 0.867713, 0.459147, 0.993324, 0.026525, …

On the other hand, the logistic map 2.5 x (1 − x) has a stable fixed point (an attractor) at 0.6. Starting at, say, 0.5, repeatedly applying the function gets you closer and closer to 0.6:

  • 0.5, 0.625, 0.585938, 0.606537, 0.596625, 0.601659, 0.599164, 0.600416, 0.599791, 0.600104, 0.599948, 0.600026, 0.599987, 0.600007, 0.599997, 0.600002, 0.599999, 0.6, 0.6, 0.6, 0.6, …

Fixed points are everywhere! Not only that, but the concept is a fruitful one in a diverse collection of fields. For more on fixed points, see this book (available electronically here).

The Circle of Fifths

I have often tried to visualise the circle of fifths in a way that makes sense both musically and mathematically. Above (click to zoom) is my latest attempt.

There are 12 notes in an octave (7 white piano keys and 5 black piano keys), and the diagram shows these 12 piano keys wrapped into a circle. A fifth is a step of 7 semitones (7 piano keys, counting black ones, e.g. C→D♭→D→E♭→E→F→F♯→G). The coloured spiral in the chart shows the “circle of fifths” resulting from moving up a fifth 12 times (moving left to right, and hence moving anticlockwise).

The reason that this works is that 7 and 12 have no common factor – and therefore the first multiple of 7 that is also a multiple of 12 is 7 × 12. Therefore every time you move up a fifth you get a different note, returning to the starting note only when you have moved up 12 times. In the process, you have hit every other note exactly once.

Something is going on with the primes…

The chart below illustrates the Erdős–Kac theorem. This relates to the number of distinct prime factors of large numbers (integer sequence A001221 in the On-Line Encyclopedia):

Number No. of prime factors No. of distinct prime factors
1 0 0
2 (prime) 1 1
3 (prime) 1 1
4 = 2×2 2 1
5 (prime) 1 1
6 = 2×3 2 2
7 (prime) 1 1
8 = 2×2×2 3 1
9 = 3×3 2 1
10 = 2×5 2 2
11 (prime) 1 1
12 = 2×2×3 3 2
13 (prime) 1 1
14 = 2×7 2 2
15 = 3×5 2 2
16 = 2×2×2×2 4 1

The Erdős–Kac theorem says that, for large numbers n, the number of distinct prime factors of numbers near n approaches a normal distribution with mean and variance log(log(n)), where the logarithms are to the base e. That seems to be saying that prime numbers are (in some sense) randomly distributed, which is very odd indeed.

In the chart, the observed mean of 3.32 is close to log(log(109)) = 3.03, although the observed variance of 1.36 is smaller. The sample in the chart includes 17 numbers with 8 distinct factors, including 1,000,081,530 = 2×3×3×5×7×19×29×43×67 (9 factors, 8 of which are distinct).

The Erdős–Kac theorem led to an episode where, following the death of Paul Erdős in 1996, Carl Pomerance spoke about the theorem at a conference session in honour of Erdős in 1997. Quoting Albert Einstein (“God does not play dice with the universe”), Pomerance went on to say that he would like to think that Erdős and [Mark] Kac replied “Maybe so, but something is going on with the primes.” The quote is now widely misattributed to Erdős himself.

Recreational mathematics

The wolf, the goat, and the cabbages

Dancing alongside the more serious practitioners of mainstream mathematics are the purveyors of mathematical puzzles and problems. These go back at least as far as Diophantus (c. 200–284), the Alexandrian “father of algebra.” Alcuin of York (c. 735–804) produced a collection of problems that included the the wolf, the goat, and the cabbages (above); the three men who need to cross a river with their sisters; and problems similar to the bird puzzle published by Fibonacci a few centuries later. In more modern times, Martin Gardner (1914–2010) has done more than anyone else to popularise this offshoot of mathematics. It is often called “recreational mathematics,” because people do it for fun (in part because they are not told that it is mathematics).

Particularly popular in recent times have been Sudoku (which is really a network colouring problem in disguise) and the Rubik’s Cube (which illustrates many concepts of group theory, although it was not invented with that in mind). Sudoku puzzles have been printed in more than 600 newspapers worldwide, and more than 20 million copies of Sudoku books have been sold. The Rubik’s Cube has been even more popular: more than 350 million have been sold.

A Soma cube, assembled

Recreational puzzles may be based on networks, as in Hashi (“Bridges”). They may be based on fitting two-dimensional or three-dimensional shapes together, as in pentominoes or the Soma cube. They may be based on transformations, as in the Rubik’s Cube. They may even be based on arithmetic, as in Fibonacci’s problem of the birds, or the various barrel problems, which go back at least as far as the Middle Ages.

In one barrel problem, two men acquire an 8-gallon barrel of wine, which they wish to divide exactly in half. They have an empty 5-gallon barrel and an empty 3-gallon barrel to assist with this. How can this be done? It is impossible to accurately gauge how much wine is inside a barrel, so that all that the men can do is pour wine from one barrel to another, stopping when one barrel is empty, or the other is full [highlight to show solution → (8, 0, 0) → (3, 5, 0) → (3, 2, 3) → (6, 2, 0) → (6, 0, 2) → (1, 5, 2) → (1, 4, 3) → (4, 4, 0)]. There is a similar problem where the barrel sizes are 10, 7, and 3.

The barrels

Apart from being fun, puzzles of this kind have an educational benefit, training people to think. For this reason, Alcuin called his collection of problems Propositiones ad Acuendos Juvenes (Problems to Sharpen the Young). Problems like these may also benefit the elderly – the Alzheimer’s Association in the United States suggests that they may slow the onset of dementia. This is plausible, in that thinking hard boosts blood flow to the brain, and research supports the idea (playing board games and playing musical instruments are even better).

Mathematics and Art: Why can’t we be friends?

The figures of Geometry and Arithmetic by the Coëtivy Master, late 15th century (detail from Philosophy Presenting the Seven Liberal Arts to Boethius)

For most of history, mathematics and the visual arts have been friends. Art was not distinguished from what we now call “craft,” and mathematics – geometry and arithmetic – provided both a source of inspiration and a set of tools. Polykleitos, for example, in the 5th century BC, outlined a set of “ideal” proportions for use in sculpture, based on the square root of two (1.414…). Some later artists used the golden ratio (1.618…) instead.

Symmetry has also been an important part of art, as in the Navajo rug below, as well as a topic of investigation for mathematicians.

Navajo woollen rug, early 20th century (Honolulu Museum of Art)

The Renaissance saw the beginning of the modern idolisation of artists, with Giorgio Vasari’s The Lives of the Most Excellent Painters, Sculptors, and Architects. However, the friendship between mathematics and art became even closer. The theory of perspective was developed during 14th and 15th centuries, so that paintings of the time have one or more “vanishing points,” much like the photograph below.

Perspective in the Galerie des Batailles at Versailles (base image: 1890s Photochrom print, Library of Congress)

Along with the theory of perspective, there was in increasing interest in the mathematics of shape. In particular, the 13 solid shapes known as Archimedean polyhedra were rediscovered. Piero della Francesca rediscovered six, and other artists, such as Luca Pacioli rediscovered others (the last few were rediscovered by Johannes Kepler in the early 17th century). Perspective, polyhedra, and proportion also come together in the work of Albrecht Dürer. Illustrations of the Archimedean polyhedra by Leonardo da Vinci appear in Luca Pacioli’s book De Divina Proportione.

Illustration of a Cuboctahedron by Leonardo da Vinci for Luca Pacioli’s De Divina Proportione (1509)

Some modern artists have continued friendly relations with mathematics. The Dutch artist M. C. Escher (reminiscent of Dürer in some ways) sought inspirations in the diagrams of scientific publications, for example.

Tiling by M. C. Escher on the wall of a museum in Leeuwarden (photo: Bouwe Brouwer)

Today it is possible to follow in Escher’s footsteps by studying a Bachelor of Fine Arts / Bachelor of Science double degree at some institutions. There is also a renewed interest in the beauty of mathematical objects, whether three-dimensional (such as polyhedra) or two-dimensional (such as the Mandelbrot set). The role of the artist then becomes that of bringing out the beauty of the object through rendering, colouring, choice of materials, sculptural techniques, and the like.

View of the Mandelbrot set at −0.7435669 + 0.1314023 i with width 0.0022878 (image: Wolfgang Beyer)

Artistic techniques such as these (“must we call them “craft” or “graphic design”?) are also important in the field of data visualisation, and are recognised by the “Information is Beautiful” Awards. Speaking of which, this year’s awards are now open for submissions.

Sequences, R, and the Free Monoid

An important concept in computer science is the free monoid on a set A, which essentially consists of sequencesa1an⟩ of elements drawn from A. The key operations on the free monoid are:

  • a⟩, forming a singleton sequence from a single element of A
  • xy, concatenation of the sequences x and y, which satisfies the associative law: (xy)⊕z = x⊕(yz)
  • ⟨⟩, the empty sequence, which acts as an identity for concatenation: ⟨⟩⊕x = x⊕⟨⟩ = x

The free monoid satisfies the mathematical definition of a monoid, and is free in the sense of satisfying nothing else. There are many possible implementations of the free monoid, but they are all mathematically equivalent, which justifies calling it the free monoid.

In the R language, there are four main implementations of the free monoid: vectors, lists, dataframes (considered as sequences of rows), and strings (although for strings it’s difficult to tell where elements start and stop). The key operations are:

Vectors Lists Dataframes Strings
⟨⟩, empty c() list() data.frame(n=c()) ""
a⟩, singleton implicit (single values are 1-element vectors) list(a) data.frame(n=a) as.character(a)
xy, concatenation c(x,y) c(x,y) rbind(x,y) paste0(x,y)

An arbitrary monoid on a set A is a set B equipped with:

  • a function f from A to B
  • a binary operation xy, which again satisfies the associative law: (xy)⊗z = x⊗(yz)
  • an element e which acts as an identity for the binary operator: ex = xe = x

As an example, we might have A = {2, 3, 5, …} be the prime numbers, B = {1, 2, 3, 4, 5, …} be the positive whole numbers, f(n) = n be the obvious injection function, ⊗ be multiplication, and (of course) e = 1. Then B is a monoid on A.

A homomorphism from the free monoid to B is a function h which respects the monoid-on-A structure. That is:

  • h(⟨⟩) = e
  • h(⟨a⟩) = f(a)
  • h(xy) = h(x) ⊗ h(y)

As a matter of fact, these restrictions uniquely define the homomorphism from the free monoid to B to be the function which maps the sequence ⟨a1an⟩ to f(a1) ⊗ ⋯ ⊗ f(an).

In other words, simply specifying the monoid B with its function f from A to B and its binary operator ⊗ uniquely defines the homomorphism from the free monoid on A. Furthermore, this homomorphism logically splits into two parts:

  • Map: apply the function f to every element of the input sequence ⟨a1an
  • Reduce: combine the results of mapping using the binary operator, to give f(a1) ⊗ ⋯ ⊗ f(an)

The combination of map and reduce is inherently parallel, since the binary operator ⊗ is associative. If our input sequence is spread out over a hundred computers, each can apply map and reduce to its own segment. The hundred results can then be sent to a central computer where the final 99 ⊗ operations are performed. Among other organisations, Google has made heavy use of this MapReduce paradigm, which goes back to Lisp and APL.

R also provides support for the basic map and reduce operations (albeit with some inconsistencies):

Vectors Lists Dataframes Strings
Map with f sapply(v,f), purrr::map_dbl(v,f) and related operators, or simply f(v) for vectorized functions lapply(x,f) or purrr::map(x,f) Vector operations on columns, possibly with dplyr::mutate, dplyr::transmute, purrr::pmap, or mapply Not possible, unless strsplit or tokenisation is used
Reduce with ⊗ Reduce(g,v), purrr::reduce(v,g), or specific functions like sum, prod, and min purrr::reduce(x,g) Vector operations on columns, or specific functions like colSums, with purrr::reduce2(x,y,g) useful for two-column dataframes Not possible, unless strsplit or tokenisation is used

It can be seen that it is particularly the conceptual reduce operator on dataframes that is poorly supported by the R language. Nevertheless, the map and reduce operations are both powerful mechanisms for manipulating data.

For non-associative binary operators, purrr::reduce(x,g) and similar functions remain extremely useful, but they become inherently sequential.

For more about purrr, see

ASC 16: Tires

Tire on Bochum’s 2017 car (photo: Anthony Dekker)

In response to my Challenger strategy post, someone asked “what about tires?”

Yes, the rolling resistance of tires is a factor with solar cars. As tires rotate, the rubber in them flexes, and some energy gets lost this way. For a top Challenger class car, with good tires, about 85% of the solar energy is lost through aerodynamic drag, and about 15% through tires. That’s why my simple car model considered only aerodynamic drag.

The force required to overcome rolling resistance is g = 9.8 times the weight of the car times a tire-specific coefficient (ranging from 0.002 for an expensive solar car tire to 0.01 for a typical car tire). For a top Challenger class car, car and driver together will weigh around 250 kg. For a four-person Cruiser, it’s more like 800 kg – about triple. Bearing in mind that Cruisers often have tires with a higher coefficient of rolling resistance, this means that rolling resistance becomes quite significant with Cruisers. That is why Cruisers will sometimes turf out passengers to cut their energy use – even when running on a flat road. Climbing the mountains, overcoming gravity will come into play, and that uses up even more energy.

In ASC race news, I have updated my information page and teams list with news about day 1 of scrutineering.

ASC 13: Challenger strategy

This will be a lengthy post on race strategy. I will concentrate on Challenger-class cars, for which the main decision is at what speed to drive (there are also some tactical and psychological issues associated with overtaking, but I won’t get into them here).

The kind of analysis I’m doing assumes that teams have a good predictive model of how their car will perform under various conditions (collecting the data required to build such a model is yet another reason to do lots of testing). I’m using a very simplistic model of a hypothetical car – I only took half a day to build it (in R, of course); a real solar car strategy guy would take months and produce something much more detailed. I’m assuming that aerodynamic drag is the main consumer of energy. Remember that the drag force is:

F = ½ Cd A ρ v2

where Cd is the drag coefficient, A is the frontal area of the car, ρ is the density of air (about 1.2), and v is the speed of the car. The key thing here is that faster speeds burn up more energy.

I’m further assuming a solar model loosely based on the region of the American Solar Challenge, a perfectly flat road (no, that’s not realistic), a race start time of 8:00 AM, and a finish time of 5:00 PM with no rest breaks (no, that’s not realistic either). My hypothetical car has 4 square metres of solar panel with 25% efficiency, and a 5 kWh battery pack which is 80% full at the start of the day. In my first simulation, my car runs at constant speed until 5:00 PM or the battery dies, whichever comes first (solar cars are not intended to run with totally empty batteries). This chart shows what happens:

Initially, the graph is a straight line. The faster you drive, the further you go. But this only lasts up to 73.7 km/h, which takes you a distance of 663 km. Faster than that, and your battery dies before the end of the day, so that you go less far in total. In reality, of course, the driver would slow down before the battery died completely, but I’m not modelling that.

My second chart looks at varying the speed of the car. We start the day at one speed (horizontal axis), gradually speeding up or slowing down to reach a second speed at solar noon (vertical axis), and then slowly shifting back so as to end the day at the start speed:

In the chart, the distance travelled is shown by colour and number. It can be seen that, at least for my hypothetical car, the optimum is fairly forgiving: speeds between about 60 and 85 km/h are OK, as long as they average out to about 73 km/h, which means 653 km travelled. You will also notice that there is no incentive to break the speed limit – the rules have been carefully constructed to ensure that this is true.

In my third chart, I’m assuming a patch of cloud on the road ahead, between 200 km and 400 km from the start. We start the day at one speed (horizontal axis), switch to a second speed once inside the cloud (vertical axis), and then switch to a third speed when leaving the cloud (the third speed is chosen to be whatever works best with the remaining battery charge).

Two vertical stripes are noticeable on the left and right sides of the chart. At 20 km/h, the race day ends before you reach the cloud, so that the choice of “cloud speed” doesn’t actually matter. And at 100 km/h, your battery dies before you reach the cloud, so that the choice of “cloud speed” doesn’t matter either. The optimum is to run at a constant 70 km/h initially, and then to speed up to 75 km/h inside the cloud (so as to get back into sunshine that little bit sooner). If you run at an optimal 63 km/h after leaving the cloud, this lets you cover 619 km.

The final scenario changes the one above to have a moving cloud. This cloud is circular and once again 200 km in diameter, with a centre 300 km away from the start position. The edge of the cloud is initially 210 km from the road, but moving at a rapid 50 km/h towards and across it:

The optimum here is to outrun the cloud: to run at 85 km/h till the cloud is past, and then to drop back to whatever speed gives you a 73 km/h average. Not surprisingly, this gives essentially the same outcome as my first varying-speed example. A solution that is almost as good is to run at 70 km/h initially, and speed up to 80 km/h once inside the cloud. This gives you about an hour and 20 minutes inside the cloud. Slowing down to an optimal 70 km/h after leaving the cloud then lets you cover 642 km.

In real life, outrunning clouds and speeding up slightly through them can both be feasible options – if you have the right prediction software, and if know exactly what the weather is doing (which is the hard part). This is why the strategy guys in the chase car are always so busy!

Nuon’s WSC 2011 chase car (image credit)

ASC 9: Aerodynamics

For both the Challenger class and the Cruiser class, solar car racing is to a large extent about aerodynamic drag. That’s overwhelmingly what the hard-earned solar energy is being wasted on, and therefore it’s what teams need to concentrate on minimising. The drag force on a car is given by the equation:

F = ½ Cd A ρ v2

Breaking that down, v is the speed of the car, ρ is the density of air (about 1.2), A is the frontal area of the car, and Cd is the drag coefficient, a number which indicates how aerodynamic (and therefore, how energy-efficient) the shape of the car is. For Challengers, minimising Cd allows the speed v to be increased, while for Cruisers (for which the average v is essentially given), minimising Cd allows non-solar energy use to be minimised. Of course, minimising frontal area is important too (and that is the motivation behind asymmetric Challenger cars).

To give a feeling for the all-important Cd, here are some vehicles with values ranging from 0.19 to 0.57:

Drag coefficients for a selection of vehicles. Clockwise from top left: 0.57 – Hummer H2 (photo: Thomas Doerfer); 0.30 – Saab 92 (photo: “Liftarn”); 0.26 – BMW i8 (photo: “youkeys”); 0.19 – General Motors EV1 (photo: Rick Rowen)

Because Cd is so all-important, it is the one thing that solar car teams are really secretive about. Challengers generally have values under 0.1. With no need for practicality, they chase their way towards the impossible goal of Cd = 0, trying to come up with the perfect race car, which will slice through air like a hot knife through butter:

Nuon’s 2005 car, Nuna 3, with Cd = 0.07 (photo: Hans-Peter van Velthoven)

Cruisers, on the other hand, have to balance aerodynamics with practicality. Bochum’s early SolarWorld GT had Cd = 0.137:

Bochum’s 2011 car, SolarWorld GT, with Cd = 0.137 (photo: “SolarLabor”)

Eindhoven’s recent Stella Vie, with its sleek aerodynamic shape, does much better than that (but they won’t say how much better):

Eindhoven’s 2017 car, Stella Vie (photo: TU Eindhoven, Bart van Overbeeke)

I understand that Sunswift’s 2013–2015 car eVe had Cd = 0.16. Appalachian State (Sunergy) have stated that their newly-built ROSE has Cd = 0.17. PrISUm’s Penumbra has a higher value (Cd = 0.2), because of the blunt end which they chose for practicality reasons (although they did do a few clever things to reduce the impact of that blunt end). I’m not aware of the Cd values for other ASC cars.

Appalachian State’s beautiful ROSE, with Cd = 0.17 (image credit)

Mathematics in Action: Vehicle Identification Numbers

Motor vehicles have a 17-character Vehicle Identification Number or VIN on a metal plate like the one below, usually on the driver’s side dashboard, or on the driver’s side door jamb, or in front of the engine block:

A Vehicle Identification Number (VIN) plate (Photo: Michiel1972)

VINs offer an interesting example of check digit calculation. The central digit (or an X representing 10) is a check digit (calculated modulo 11) used to detect errors. Any letters in the rest of the VIN are decoded like this:

1 2 3 4 5 6 7 8 1 2 3 4 5 7 9 2 3 4 5 6 7 8 9

The check digit calculation involves decoding the VIN, and multiplying the resulting numbers by the weights shown in blue, giving the products in purple:

VIN L J C P C B L C X 1 1 0 0 0 2 3 7
Decoded 3 1 3 7 3 2 3 3 10 1 1 0 0 0 2 3 7
Weights 8 7 6 5 4 3 2 10 0 9 8 7 6 5 4 3 2
Product 24 7 18 35 12 6 6 30 0 9 8 0 0 0 8 9 14

These products are added up modulo 11 (meaning the sum is divided by 11 and the remainder taken). In this case, the sum is 186 = 10 = X (mod 11), which makes the VIN valid, because it matches the original central X. What about the VIN on your vehicle?