Mathematics in action: returning from a random walk


Three 2-dimensional random walks. All three start at the black circle and finish, after 100 steps, at a coloured square. Later steps are in darker colours. Considerable backtracking occurs.

We have discussed one-dimensional random walks, but it is possible to have random walks in more than one dimension. In two dimensions (above), we can go left, right, forward, and back. A random walk in two dimensions can be played as a kind of game (as can one-dimensional random walks). In three dimensions (below) we can also move vertically. Three-dimensional random walks are related to the motion of molecules in a gas or liquid.


Three 3-dimensional random walks. All three start at the black circle (in the centre of the cube) and finish, after 100 steps, at a coloured square. Later steps are in darker colours.
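To make the stepping rule concrete, here is a minimal Python sketch of a walk like those in the figures (using Python's random module in place of random.org, with function names and step counts of my own choosing): at each step one coordinate is picked at random and moved up or down by one unit.

    import random

    def random_walk(dimensions=2, steps=100):
        """Simple random walk: at each step, pick one coordinate at
        random and move it up or down by one unit."""
        position = [0] * dimensions
        path = [tuple(position)]
        for _ in range(steps):
            axis = random.randrange(dimensions)
            position[axis] += random.choice((1, -1))
            path.append(tuple(position))
        return path

    for d in (1, 2, 3):
        print(f"{d}-dimensional walk finished at {random_walk(d)[-1]}")

Plotting the successive positions in path reproduces pictures of the kind shown above.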

One very interesting question is whether a random walk ever returns to its starting point. In one dimension, the probability of being back at the starting point after exactly n ≥ 1 steps is 0 if n is odd, and C(n, n/2) / 2^n if n is even, where C(n, k) is the number of ways of choosing k items out of n, defined by C(n, k) = n! / (k! (n − k)!).

For large n, Stirling’s approximation says that n! is approximately sqrt(2πn) (n/e)^n. If we let m = n/2, some tedious algebra gives the probability of being back at the starting point after exactly n = 2m steps as 1/sqrt(πm) ≈ 0.56/sqrt(m). When I ran some experiments I actually got a factor of 0.55, which is pretty close. Given infinite time, the expected number of times we return to the starting point is then:
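That approximation is easy to test by simulation. The sketch below is a rough check using pseudo-random numbers (not the random.org data used for the post): it estimates the probability that a one-dimensional walk is back at its starting point after exactly 2m steps, and compares it with the exact C(2m, m) / 2^(2m) and with 1/sqrt(πm).

    import math
    import random

    def back_at_start(m, trials=20_000):
        """Estimate the probability that a 1-D walk of 2m steps of
        +1 or -1 ends exactly where it started."""
        hits = sum(
            sum(random.choice((1, -1)) for _ in range(2 * m)) == 0
            for _ in range(trials)
        )
        return hits / trials

    for m in (1, 4, 16, 64):
        exact = math.comb(2 * m, m) / 2 ** (2 * m)
        approx = 1 / math.sqrt(math.pi * m)
        print(f"m = {m:3d}: simulated {back_at_start(m):.4f}, "
              f"exact {exact:.4f}, approximation {approx:.4f}")

For small m the approximation is a little high, but it settles down quickly as m grows.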

0.56 (1 + 1/sqrt(2) + 1/sqrt(3) + 1/sqrt(4) + …) = ∞

This means that an eventual return to the starting point is certain. It may take a while, however – in 100 random walks, summarised in the histogram below, I once had to wait for 11452 steps for a return to the starting point.
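A sketch of the kind of experiment behind that histogram (pseudo-random numbers again, with the 100-walk sample size and the step limit chosen to match the scale of the post’s experiment): simulate one-dimensional walks and record how long each takes to first return to its starting point.

    import random

    def first_return_1d(max_steps=100_000):
        """Number of steps until a 1-D random walk first returns to
        its starting point, or None if it has not done so within
        max_steps."""
        position = 0
        for step in range(1, max_steps + 1):
            position += random.choice((1, -1))
            if position == 0:
                return step
        return None

    waits = [first_return_1d() for _ in range(100)]
    returned = [w for w in waits if w is not None]
    print(f"{len(returned)} of 100 walks returned; longest wait {max(returned)} steps")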

Random walks in two dimensions can be understood as two independent random walks in one dimension happening simultaneously: if we rotate the grid by 45°, each two-dimensional step moves each of the two diagonal coordinates by ±1, independently and with equal probability. The walk is back at its starting point after exactly n = 2m steps if both one-dimensional walks are back at theirs together. The probability is therefore the one above squared, i.e. 1/(πm) ≈ 0.318/m. Again, given infinite time, the expected number of times we return to the starting point is:

0.318 (1 + 1/2 + 1/3 + 1/4 + …) = ∞

This means that a return to the starting point is also theoretically certain, although it will take much, much longer than for the one-dimensional case. In a simple experiment, four random walks returned to the starting point in 6814, 2, 21876, and 38 steps respectively, but the fifth attempt took so long that I gave up. In three or more dimensions, the probability of being back at the start after 2m steps shrinks quickly enough (like 1/m^(3/2) in three dimensions) that the corresponding sum is finite, and a return to the starting point might never occur.
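The same experiment in two dimensions, with a step limit so that we can give up on a stubborn walk, might look like the following sketch.

    import random

    MOVES_2D = [(1, 0), (-1, 0), (0, 1), (0, -1)]

    def first_return_2d(max_steps=1_000_000):
        """Steps until a 2-D random walk first returns to its
        starting point, or None if it has not done so within
        max_steps."""
        x, y = 0, 0
        for step in range(1, max_steps + 1):
            dx, dy = random.choice(MOVES_2D)
            x, y = x + dx, y + dy
            if x == 0 and y == 0:
                return step
        return None

    for attempt in range(1, 6):
        result = first_return_2d()
        print(f"attempt {attempt}: {result if result is not None else 'gave up'}")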


Mathematics in action: Flipping coins

Our second post about probability is about flipping coins and random walks. Once again, I’ve used random numbers from www.random.org, but I’ve represented the coin flips as 1 for heads and −1 for tails. The expected mean is therefore 0 (I actually got 0.0034), and the expected variance is s − 0^2 = 1, where s = 1 is the mean of the squares of the numbers −1 and 1 (for the variance, I actually got 1.000088).
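As a quick check of those two numbers (with Python’s pseudo-random numbers standing in for random.org), the sketch below draws a large sample of ±1 flips and reports the sample mean and variance.

    import random

    flips = [random.choice((1, -1)) for _ in range(1_000_000)]
    mean = sum(flips) / len(flips)
    # Variance = mean of the squares minus the square of the mean;
    # every square is 1 here, so the expected value is 1 - 0^2 = 1.
    variance = sum(f * f for f in flips) / len(flips) - mean ** 2
    print(f"mean {mean:.4f} (expect 0), variance {variance:.6f} (expect 1)")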

We then consider a random walk where we repeatedly flip a coin and walk a block west if it’s tails and a block east if it’s heads. In particular, we consider doing so 144 times. How far would we expect to get that way? Well, on average, nowhere – we are adding 144 coin flips, and the mean displacement from the starting point will be 0. The coloured lines in the diagram above show ten example random walks (with time running vertically upwards). These finish up between 18 blocks west and 20 blocks east of the starting point, so the mean of 0 represents an average of outcomes where we wind up several blocks west or east.

Since we can add variances, the variance for the random walk will be the variance of a single step times the number of steps. Alternatively, the standard deviation will be the standard deviation of a single step times the square root of the number of steps. In this case, the expected standard deviation of the random walk is 1 × sqrt(144) = 12 (for the ten random walks in the diagram above, I actually got 14.55; for a larger sample, 12.086). The width of the bell curve in the diagram illustrates the theoretical standard deviation (the height of the bell curve is not meaningful).
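A rough sketch of that calculation: simulate many 144-step coin-flip walks (the sample size here is my own choice) and compare the sample mean and standard deviation of the final positions with the expected 0 and 12.

    import math
    import random

    def walk_endpoint(steps=144):
        """Final position of a walk built from +1/-1 coin-flip steps."""
        return sum(random.choice((1, -1)) for _ in range(steps))

    endpoints = [walk_endpoint() for _ in range(10_000)]
    mean = sum(endpoints) / len(endpoints)
    sd = math.sqrt(sum((e - mean) ** 2 for e in endpoints) / len(endpoints))
    print(f"mean {mean:.3f} (expect 0), standard deviation {sd:.3f} (expect 12)")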

The expected absolute value of the distance travelled depends on the mean value of half a bell curve: it is 12 × sqrt(2/π) = 9.5746 (I actually got 12.6; for a larger sample, 9.631). So, for our random walk, we can expect to wind up around 10 blocks from the starting point – sometimes more, sometimes less. Naturally, this is just a simple example – there’s a lot more interesting mathematics in the theory of random walks, especially when we consider travelling in more than one dimension.
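The same simulation can check the absolute distance: over many 144-step walks, the mean of the absolute final position should come out near 12 × sqrt(2/π) ≈ 9.57.

    import math
    import random

    endpoints = [sum(random.choice((1, -1)) for _ in range(144))
                 for _ in range(10_000)]
    mean_abs = sum(abs(e) for e in endpoints) / len(endpoints)
    print(f"mean |distance| {mean_abs:.3f}, "
          f"theory {12 * math.sqrt(2 / math.pi):.4f}")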


Mathematics in action: Probability


Dice

Today we begin a series of three posts on probability. We begin with some experiments on rolling dice. To save my wrists, I’ve used random numbers from www.random.org instead.

We begin with individual dice rolls. Rolling 1200 times, we expect each number to come up about 200 times, and the histogram below shows that that’s pretty much what happens. We expect the average (mean) of the numbers rolled to be 3.5, and for my 1200 rolls it was actually very close to that (3.495833). We expect the standard deviation of the numbers rolled to be sqrt(s − 3.5^2) = sqrt(35/12) = 1.707825, where s is the mean of the squares of the numbers (1, 4, 9, 16, 25, 36). The actual sample standard deviation was 1.704133. It is generally more convenient to work with the variance, however, which is the square of the standard deviation (2.904069, which is very close to 35/12 = 2.916667).
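The experiment is easy to reproduce with pseudo-random numbers in place of random.org, as in the sketch below, which rolls 1200 dice and reports the sample mean and variance alongside the theoretical 3.5 and 35/12.

    import random

    rolls = [random.randint(1, 6) for _ in range(1200)]
    mean = sum(rolls) / len(rolls)
    # Variance = mean of the squares minus the square of the mean.
    variance = sum(r * r for r in rolls) / len(rolls) - mean ** 2
    print(f"mean {mean:.4f} (expect 3.5), "
          f"variance {variance:.4f} (expect {35 / 12:.4f})")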


Histogram of 1200 individual dice rolls. The dark blue dots and line show the expected results.

Rolling pairs of dice, we expect a sum of 2 to come up, on average, once in every 36 rolls. We expect a sum of 3 to come up twice in every 36 rolls (as 1+2 and as 2+1), a sum of 4 three times (as 1+3, 2+2, and 3+1), and so on. The histogram below shows what actually happens.

We expect the average (mean) of the numbers rolled to be 7, and it was actually very close to that (7.019167). The great thing about adding random numbers is that the variance of the sum is the sum of the individual variances, so we expect the variance to be 70/12 = 5.833333. The actual sample variance for my 1200 rolls was 5.831993.
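A sketch of the pair-of-dice experiment: 1200 rolls of two dice, with the observed count for each sum compared against the expected count, and the sample mean and variance compared against 7 and 70/12.

    import random
    from collections import Counter

    sums = [random.randint(1, 6) + random.randint(1, 6) for _ in range(1200)]
    counts = Counter(sums)
    for total in range(2, 13):
        ways = 6 - abs(total - 7)  # ways of making this total with two dice
        print(f"{total:2d}: observed {counts.get(total, 0):3d}, "
              f"expected {1200 * ways / 36:.1f}")

    mean = sum(sums) / len(sums)
    variance = sum(s * s for s in sums) / len(sums) - mean ** 2
    print(f"mean {mean:.4f} (expect 7), "
          f"variance {variance:.4f} (expect {70 / 12:.4f})")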


Histogram of 1200 dice pair rolls. The dark blue dots and line show the expected results.

There is an amusing take on dice pair rolls in Asterix and the Soothsayer, where the soothsayer is trying to avoid predicting the result of a dice pair roll. Given the choice of numbers from I to XII, he guesses VII, which is of course the most likely outcome, and the one which actually comes up. (Which of the twelve, incidentally, is the least likely outcome?)

Rolling a larger number of dice, the expected outcome follows a bell curve. The expected mean for twenty dice is 20 × 3.5 = 70 (for my 800 rolls of twenty dice it was actually 70.12). The expected variance is 20 × 35 / 12 = 58.33333 (for my 800 rolls of twenty dice the sample variance was actually 60.42113). The standard deviation (the square root of the variance) determines the “width” of the bell curve. Similar bell curves occur whenever a result is composed of a large number of independent factors (of roughly equal weight) added together.
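Finally, a sketch of the twenty-dice experiment: 800 rolls of twenty dice, with the sample mean and variance compared against the expected 70 and 20 × 35/12 ≈ 58.33. The square root of the variance is the standard deviation that sets the width of the bell curve.

    import math
    import random

    totals = [sum(random.randint(1, 6) for _ in range(20)) for _ in range(800)]
    mean = sum(totals) / len(totals)
    variance = sum((t - mean) ** 2 for t in totals) / len(totals)
    sd = math.sqrt(variance)
    print(f"mean {mean:.2f} (expect 70), variance {variance:.2f} "
          f"(expect {20 * 35 / 12:.2f}), standard deviation {sd:.2f}")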


Histogram of 800 rolls of twenty dice. The dark blue bell curve shows the expected results.