Ada’s Program



Fragment of the Analytical Engine’s arithmetic/logic unit built by Babbage (photo: Science Museum London) and punched cards for operating it (photo: Karoly Lorentey)

Following on from my post about his Difference Engine, Charles Babbage’s Analytical Engine deserves some discussion. Only small pieces of the Analytical Engine were built. Indeed, Babbage’s ideas were so far ahead of his time that it could not be built with the technology available to him. Babbage was clearly either a true genius – or else he was a time-traveller from the future trying to recreate a modern computer.

It is not quite clear whether Babbage’s Analytical Engine was Turing complete. The kind of abstract computer developed independently by Alan Turing and Emil Post uses an arbitrarily long tape. Even more abstract models of computation use arbitrarily long integers to achieve the same effect. For example, the list (2, 3, 0, 1) can be encoded as the number 582 (1001000110 in binary). Modern computers use a sequence of numbered memory locations, accessed by indexing. The Analytical Engine could not do this. To quote the excellent analysis by Allan G. Bromley, “With hindsight we may note that in the Analytical Engine (at least until 1840) Babbage did not possess the variable-address concept; that is, there was no mechanism by which the machine could, as a result of a calculation, specify a particular variable in the store to be used as the operand for an instruction.”
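
For what it’s worth, one simple encoding that reproduces the 582 example is to write each list element k as a 1 followed by k zeros, and then read the concatenated bits as a binary number. Here is a minimal sketch in R (the encode helper is mine, purely for illustration; the encoding behind the example is not spelled out in the text):

encode <- function (xs) {
	# each element k becomes a 1 followed by k zeros; the concatenated
	# bit string is then read as a binary number
	bits <- unlist(lapply(xs, function (k) c(1, rep(0, k))))
	sum(bits * 2 ^ rev(seq_along(bits) - 1))
}

encode(c(2, 3, 0, 1))  # 582, i.e. 1001000110 in binary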


Ada King-Noel, the Countess of Lovelace (1836 portrait by Margaret Sarah Carpenter, cropped)

Babbage was not terribly good at explaining his ideas in writing, unfortunately. The best description is a 13-page summary of a lecture by Babbage, written in French by Luigi Federico Menabrea (later Prime Minister of Italy). This was translated into English in 1843 by Augusta Ada King-Noel (née Byron), the Countess of Lovelace.

Ada added 36 pages of detailed notes of her own. These include several insightful comments regarding the philosophy of computing, such as: “Again, it might act upon other things besides number, were objects found whose mutual fundamental relations could be expressed by those of the abstract science of operations, and which should be also susceptible of adaptations to the action of the operating notation and mechanism of the engine. Supposing, for instance, that the fundamental relations of pitched sounds in the science of harmony and of musical composition were susceptible of such expression and adaptations, the engine might compose elaborate and scientific pieces of music of any degree of complexity or extent. The Analytical Engine is an embodying of the science of operations, constructed with peculiar reference to abstract number as the subject of those operations.” (from Note A).

Also: “The Analytical Engine has no pretensions whatever to originate anything. It can do whatever we know how to order it to perform.” (from Note G).


The diagram from Note G, which shows what is essentially a computer program

Ada is sometimes described as the “first computer programmer,” based on the material in her Note G. This is clearly incorrect, since Charles Babbage had written several dozen programs for the Analytical Engine before 1840. Perhaps “first computer scientist” would be a better title. The program described in Ada’s Note G computes Bernoulli numbers. It does so using the fact that each Bernoulli number can be computed from its predecessors via the relationship:

0 = A₀ + A₁B₁ + A₃B₃ + A₅B₅ + … + B₂ₙ₋₁

Here each Aᵢ can be calculated as follows:

a <- function (n, i) {
	# The coefficient A_i in the relation above, for a given n
	if (i == 0) -0.5 * (2*n - 1) / (2*n + 1)  # A_0
	else if (i == 1) n  # A_1 (= 2n/2)
	else a(n, i-2) * (2*n + 2 - i) * (2*n + 1 - i) / (i * (i + 1))  # A_i in terms of A_(i-2)
}
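
To connect this recurrence to the numbers themselves, here is a small sketch that solves the relation above for its final term at each step, using a() for the coefficients (the bernoulli helper is mine, not part of Ada’s program; in Ada’s indexing the nth result is B₂ₙ₋₁, so the first three values are 1/6, −1/30, and 1/42):

bernoulli <- function (n.max) {
	b <- numeric(n.max)
	for (n in 1:n.max) {
		s <- a(n, 0)  # A_0
		if (n > 1) for (j in 1:(n - 1)) s <- s + a(n, 2*j - 1) * b[j]  # A_(2j-1) * B_(2j-1)
		b[n] <- -s  # solve 0 = A_0 + A_1 B_1 + ... + B_(2n-1) for the final term
	}
	b
}

bernoulli(3)  # approximately 0.1666667 -0.0333333 0.0238095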

Bromley notes that “the ‘user instruction set’ of the Analytical Engine seems nowhere to be clearly stated,” which makes it a little difficult to extract an actual program from Ada’s material. After fixing three small bugs, here is something that actually works (in the language R, and all done using numbered registers):

ada <- function (n.max) {
	b <- rep(0, n.max)  # result registers
	v <- c(1, 2, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)  # other registers
	
	while (v[3] <= n.max) {  # how is this loop done in the Analytical Engine?
		v[4] <- v[5] <- v[6] <- v[2] * v[3]
		v[4] <- v[4] - v[1]  # v[1] is always 1
		v[5] <- v[5] + v[1]
		v[11] <- v[4] / v[5]  # accidentally reversed in Ada’s diagram
		v[11] <- v[11] / v[2]  # v[2] is always 2
		v[13] <- v[13] - v[11]
		v[10] <- v[3] - v[1]
		
		if (v[10] > 0) {  # how is this conditional execution done in the Analytical Engine?
			v[7] <- v[2] + v[7]
			v[11] <- v[6] / v[7]
			v[12] <- b[1] * v[11]
			v[13] <- v[12] + v[13]
			v[10] <- v[10] - v[1]
		}

		while(v[10] > 0) {  # how is this loop done in the Analytical Engine?
			v[6] <- v[6] - v[1]
			v[7] <- v[1] + v[7]
			v[8] <- v[6] / v[7]
			v[11] <- v[8] * v[11]
			v[6] <- v[6] - v[1]
			v[7] <- v[1] + v[7]
			v[9] <- v[6] / v[7]
			v[11] <- v[9] * v[11]
			
			i <- v[3] - v[10]   # how is this indexing done in the Analytical Engine?
			v[12] <- b[i] * v[11]
			v[13] <- v[12] + v[13]
			v[10] <- v[10] - v[1]
		}

		n <- v[3]  # how is this indexing done in the Analytical Engine?
		b[n] <- b[n] - v[13]  # another apparent error in Ada's table at line 14 (negation is needed)

		v[3] <- v[1] + v[3]
		v[7] <- 0   # reset the register with a “variable card”
		v[13] <- 0  # a third apparent error in Ada's table (v[13] needs to be reset, not v[6])
	}
	b
}
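
As a quick check of that transcription (simply calling the function above):

ada(3)  # approximately 0.1666667 -0.0333333 0.0238095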

There are a number of questions about this. First, I am assuming that all registers are read non-destructively (Ada’s notes indicate that read-and-clear is also possible). Second, the results stored in b require indexing, which the Analytical Engine could not do. Third, Ada writes that “Operation 7 must either bring out a result equal to zero (if n = 1); or a result greater than zero, as in the present case; and the engine follows the one or the other of the two courses just explained, contingently on the one or the other result of Operation 7.” This implies that some kind of conditional branching was possible. But how?

A simple response is to “unroll” the loops, breaking the program down into instructions of just three kinds (a minimal interpreter sketch for these follows the list):

  • Set 1 567: place the number 567 in register #1
  • Do 2 + 3: add the contents of register #2 to the content of register #3 (and similarly for −, ×, and ÷)
  • Store 4: store a previously computed result in register #4
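
Before looking at the unrolled program itself, here is a minimal interpreter sketch for just these three instruction kinds (the run.cards helper and its parsing conventions are mine, not Babbage’s card format). If the listing below is saved to a file, something like run.cards(readLines("unrolled.txt"))[21:23] should return the first three Bernoulli numbers.

run.cards <- function (lines, maxreg = 23) {
	lines <- gsub("#.*$", "", lines)  # strip comments
	cards <- trimws(unlist(strsplit(lines, ",")))  # split lines like "Do 2 * 3, Store 4"
	cards <- cards[cards != ""]  # drop blanks
	v <- rep(0, maxreg)  # the registers
	result <- 0  # the most recent Do result
	for (card in cards) {
		w <- strsplit(card, " +")[[1]]
		if (w[1] == "Set") v[as.numeric(w[2])] <- as.numeric(w[3])
		else if (w[1] == "Store") v[as.numeric(w[2])] <- result
		else if (w[1] == "Do") result <- do.call(w[3], list(v[as.numeric(w[2])], v[as.numeric(w[4])]))
		else stop(paste0("Bad card: ", card))
	}
	v
}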

The following, rather lengthy, version of the program correctly computes the first three Bernoulli numbers:

Set 1 1, Set 2 2, Set 3 1

# First Bernoulli number
Do 2 * 3, Store 4, Store 5, Store 6
Do 4 - 1, Store 4
Do 5 + 1, Store 5
Do 4 / 5, Store 11
Do 11 / 2, Store 11
Do 13 - 11, Store 13
Do 3 - 1, Store 10

Do 1 + 3, Store 3
Do 21 - 13, Store 21,  # 21 done
Set 7 0, Set 13 0

# Second Bernoulli number
Do 2 * 3, Store 4, Store 5, Store 6
Do 4 - 1, Store 4
Do 5 + 1, Store 5
Do 4 / 5, Store 11
Do 11 / 2, Store 11
Do 13 - 11, Store 13
Do 3 - 1, Store 10

Do 2 + 7, Store 7
Do 6 / 7, Store 11
Do 21 * 11, Store 12,  # use 21
Do 12 + 13, Store 13
Do 10 - 1, Store 10

Do 1 + 3, Store 3
Do 22 - 13, Store 22,  # 22 done
Set 7 0, Set 13 0

# Third Bernoulli number
Do 2 * 3, Store 4, Store 5, Store 6
Do 4 - 1, Store 4
Do 5 + 1, Store 5
Do 4 / 5, Store 11
Do 11 / 2, Store 11
Do 13 - 11, Store 13
Do 3 - 1, Store 10

Do 2 + 7, Store 7
Do 6 / 7, Store 11
Do 21 * 11, Store 12,  # use 21
Do 12 + 13, Store 13
Do 10 - 1, Store 10

# Inner loop
Do 6 - 1, Store 6
Do 1 + 7, Store 7
Do 6 / 7, Store 8
Do 8 * 11, Store 11
Do 6 - 1, Store 6
Do 1 + 7, Store 7
Do 6 / 7, Store 9
Do 9 * 11, Store 11
Do 22 * 11, Store 12,  # use 22
Do 12 + 13, Store 13
Do 10 - 1, Store 10

Do 1 + 3, Store 3
Do 23 - 13, Store 23,  # 23 done
Set 7 0, Set 13 0

Can we do better than that? Bromley notes that “the mechanism by which the sequencing of operations is obtained is obscure.” Furthermore, driven by what was probably a correct intuition about code/data separation, Babbage separated operation and variable cards, and this would have played havoc with control flow (Bromley again: “I am not convinced that Babbage had clearly resolved even the representational difficulties that his separation of operation and variable cards implies”).

I’m resolving those issues by straying into what Babbage might have done had he seen the need. In particular:

  • I assume a conditional jump mechanism, with Ifzero 1 goto A jumping (somehow) to Label A if register #1 is zero (if operation and variable cards are reunited, this can be easily done by moving forward or back the required number of cards)
  • I assume an additional category of card, with its own card queue, with each such card specifying an output register, and with the operations:
    • Q (in Do, Store, Set, or Ifzero): access the register specified by the next card in the card queue
    • ResetQ: wind back the card queue to the start
    • StopifemptyQ: stop if all the cards in the card queue have been read

Yes, that’s all very speculative – but something like that is needed to make Ada’s loops work. In addition, the card queue (plus the associated output registers) performs the role of the tape in Turing/Post machines, or the memory in modern computers. Something like it is therefore needed.

And here is Ada’s program in that modified form. It works, loops and all! I tested it for the first 12 Bernoulli numbers, which are 0.1666667, −0.03333333, 0.02380952, −0.03333333, 0.07575758, −0.2531136, 1.166667, −7.092157, 54.97118, −529.1242, 6192.123, and −86580.25 (numerical errors do accumulate as the sequence is continued).

Set 1 1, Set 2 2, Set 3 1

Label A

Do 2 * 3, Store 4, Store 5, Store 6
Do 4 - 1, Store 4
Do 5 + 1, Store 5
Do 4 / 5, Store 11
Do 11 / 2, Store 11
Do 13 - 11, Store 13
Do 3 - 1, Store 10

Ifzero 10 goto B
Do 2 + 7, Store 7
Do 6 / 7, Store 11
Do Q * 11, Store 12  # Using 21
Do 12 + 13, Store 13
Do 10 - 1, Store 10

Label B

# Inner loop
Ifzero 10 goto C
Do 6 - 1, Store 6
Do 1 + 7, Store 7
Do 6 / 7, Store 8
Do 8 * 11, Store 11
Do 6 - 1, Store 6
Do 1 + 7, Store 7
Do 6 / 7, Store 9
Do 9 * 11, Store 11
Do Q * 11, Store 12  # Using 22, etc.
Do 12 + 13, Store 13
Do 10 - 1, Store 10
Ifzero 14 goto B  # Unconditional jump

Label C

Do 1 + 3, Store 3
Do 14 - 13, Store Q  # Using register 14 as constant zero
StopifemptyQ

Set 7 0, Set 13 0, ResetQ
Ifzero 14 goto A  # Unconditional jump

And for those interested, here is an emulator (in R) which will read and execute that program. For a slightly different approach, see the online emulator here.

read.program <- function (f) {
	p <- readLines(f)
	p <- gsub(" *#.*$", "", p)  # remove comments
	p <- gsub(" *, *", ",", p)  # remove spaces after commas
	p <- p[p != ""]  # remove blank lines
	p <- paste0(p, collapse=",")  # join up lines
	p <- gsub(",+", ",", p)  # remove duplicate commas
	strsplit(p, ",")[[1]]  # split by commas
}

do.op <- function (x, op, y) {
	if (op == "+") x + y
	else if (op == "-") x - y
	else if (op == "*") x * y
	else if (op == "/") x / y
	else stop(paste0("Bad op: ", op))
}

emulate <- function(program, maxreg) {
	set.inst <- "^Set (Q|[0-9]*) (Q|[0-9]*)$"
	store.inst <- "^Store (Q|[0-9]*)$"
	do.inst <- "^Do (Q|[0-9]*) ([^ ]) (Q|[0-9]*)$"
	label.inst <- "^Label ([0-9A-Za-z]*)$"
	ifzero.inst <- "^Ifzero ([0-9]*) goto ([0-9A-Za-z]*)$"

	v <- rep(0, maxreg)  # the registers
	op.result <- 0  # result of the most recent Do
	stopping <- FALSE
	pc <- 1  # index of the current card
	queue <- 21:maxreg  # the card queue of output registers (accessed via Q)
	qptr <- 1  # current position in the card queue
	
	while (pc <= length(program) && ! stopping) {
		p <- program[pc]
		if (grepl(set.inst, p)) {
			i <- gsub(set.inst, "\\1", p)
			j <- as.numeric(gsub(set.inst, "\\2", p))
			if (i == "Q") {
				i <- queue[qptr];
				qptr <- qptr + 1
			} else i <- as.numeric(i)
			v[i] <- j

		} else if (grepl(do.inst, p)) {
			i <- gsub(do.inst, "\\1", p)
			op <- gsub(do.inst, "\\2", p)
			j <- gsub(do.inst, "\\3", p)
			if (i == "Q") {
				i <- queue[qptr];
				qptr <- qptr + 1
			} else i <- as.numeric(i)
			if (j == "Q") {
				j <- queue[qptr];
				qptr <- qptr + 1
			} else j <- as.numeric(j)
			op.result <- do.op(v[i], op, v[j])

		} else if (grepl(store.inst, p)) {
			i <- gsub(store.inst, "\\1", p)
			if (i == "Q") {
				i <- queue[qptr];
				qptr <- qptr + 1
			} else i <- as.numeric(i)
			v[i] <- op.result

		} else if (grepl(ifzero.inst, p)) {
			i <- gsub(ifzero.inst, "\\1", p)
			if (i == "Q") {
				i <- queue[qptr];
				qptr <- qptr + 1
			} else i <- as.numeric(i)
			dest <- gsub(ifzero.inst, "\\2", p)
			j <- which(program == paste0("Label ", dest))
			if (v[i] == 0) pc <- j

		} else if (p == "StopifemptyQ") {
			if (qptr > length(queue)) stopping <- TRUE

		} else if (grepl(label.inst, p)) {
			# do nothing
			
		} else if (p == "ResetQ") {
			qptr <- 1
			
		} else stop(paste0("Bad instruction: ", p))
		pc <- pc + 1
	}
	v
}

emulate(program = read.program("ada.program.txt"), maxreg = 32)
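
To pull out just the results, one can assign the return value: with maxreg = 32 the card queue covers registers 21 to 32, and after the run those registers should hold the twelve Bernoulli numbers quoted above (a usage sketch, assuming the labelled program has been saved as ada.program.txt):

regs <- emulate(program = read.program("ada.program.txt"), maxreg = 32)
regs[21:32]  # 0.1666667, -0.03333333, 0.02380952, ..., -86580.25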

Update: If we take Ada’s program as specifying implicit zeroing of unused registers, we get this slightly fancier version (which also works):

Set 1 1, Set 2 2, Set 3 1

Label A

Do 2 * 3, Store 4, Store 5, Store 6
Do 4 - 1, Store 4
Do 5 + 1, Store 5
Do 4 / 5 clearing 4 and 5
Store 11
Do 11 / 2, Store 11
Do 13 - 11 clearing 11
Store 13
Do 3 - 1, Store 10

Ifzero 10 goto B

Do 2 + 7, Store 7
Do 6 / 7, Store 11
Do Q * 11, Store 12  # Using 21
Do 12 + 13 clearing 12
Store 13
Do 10 - 1, Store 10

Label B

Ifzero 10 goto C  # Inner loop test

Do 6 - 1, Store 6
Do 1 + 7, Store 7
Do 6 / 7, Store 8
Do 8 * 11 clearing 8
Store 11
Do 6 - 1, Store 6
Do 1 + 7, Store 7
Do 6 / 7, Store 9
Do 9 * 11 clearing 9
Store 11
Do Q * 11, Store 12  # Using 22, etc.
Do 12 + 13 clearing 12
Store 13
Do 10 - 1, Store 10
Goto B  # End of inner loop

Label C

Do 1 + 3, Store 3
Do 14 - 13 clearing 13
Store Q  # Using register 14 as constant zero
StopifemptyQ

Set 6 0, Set 7 0, Set 11 0, ResetQ
Goto A

Did the Difference Engine make a difference?

I have been reading a few steampunk novels lately – I have a great fondness for the genre. Charles Babbage’s planned “Difference Engine” and “Analytical Engine” always play a large part in the fictional universe of such books. However, as Francis Spufford has pointed out, this does rely on some counterfactual history.


Reconstructed “Difference Engine No. 2” in the Science Museum, London (photo: “Geni”)

Babbage never completed any of his major devices, although redesigned working difference engines were built by Per Georg Scheutz (1843), Martin Wiberg (1859), and George B. Grant (1876). With much fanfare, the Science Museum, London reconstructed Babbage’s “Difference Engine No. 2” between 1985 and 2002, making only essential fixes to the original design – and it works! However, the pinnacle of this kind of technology was probably the beautiful handheld Curta calculator, produced in Liechtenstein by Curt Herzstark from 1947.

The world’s first programmable electronic digital computer was in fact built four years before the Curta, in 1943, by English electrical engineer Tommy Flowers. The wartime secrecy associated with his work has kept this monumental achievement largely in the dark.


Colossus in action at Bletchley Park in 1943 (photo: National Archives)

The significance of the Colossus has also been obscured by a kind of “personality cult” built up around Alan Turing, much like the one built up around Babbage. Turing was one of a number of people who contributed to the design of the cryptographic “Bombe” at Bletchley Park, and Turing also did important theoretical work – although the fundamental result in Turing’s 1936 paper, “On Computable Numbers, with an Application to the Entscheidungsproblem,” was not actually new. This is revealed on the second page of the paper, where Turing admits: “In a recent paper Alonzo Church has introduced an idea of ‘effective calculability,’ which is equivalent to my ‘computability,’ but is very differently defined. Church also reaches similar conclusions about the Entscheidungsproblem. The proof of equivalence between ‘computability’ and ‘effective calculability’ is outlined in an appendix to the present paper.”

Turing’s life was more colourful than either Church’s or Flowers’s, however, and this may be why he is far more famous. In a similar way, Babbage lived a more colourful life than many of his contemporaries – a life that included his collaboration with the forward-thinking Countess of Lovelace.


1: Charles Babbage, 2: Augusta Ada King-Noel (née Byron, Countess of Lovelace), 3: Alonzo Church, 4: Alan Turing, 5: Tommy Flowers

The chart below puts the work of Babbage and Flowers in a historical context. Various devices are ranked according to their computational power in decimal digits calculated per second (from 1 up to 1,000,000,000,000,000). Because this varies so dramatically, a logarithmic vertical scale is used. The Colossus marks the beginning of a chain of “supercomputers,” often built for government use, with power doubling every 1.84 years (pink line). Starting with the Intel 4004 in 1971, there is also a chain of silicon chips, with power doubling every 1.74 years (blue line). At any given point in time, supercomputers are between 1,000 and 3,000 times more powerful than the chips, but the chips always catch up around 20 years later. The revolutionary PDP-8 of 1965 sits between the two chains.
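
As a rough check on that “catch up around 20 years later” claim (my arithmetic, not taken from the chart itself), a gap of 1,000 to 3,000 times, closed at one doubling every 1.74 years, corresponds to a time lag of about log₂(gap) × 1.74 years:

log2(c(1000, 3000)) * 1.74  # roughly 17 to 20 years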

One of the things that stand out on this chart is the gap between Babbage’s Difference Engine and the later digital computers – even the Colossus was around 280 times more powerful than the Difference Engine (carrying out a simpler task much more quickly). Steampunk fiction often suggests that steam power would have made the Difference Engine faster. However, it turns out that the mechanism jams if it is cranked too quickly. Complex mechanical calculating devices simply cannot operate that fast.


Morse telegraph key (photo: Hp.Baumeler)

In fact, Charles Babbage may actually have distracted people from the way forward. Samuel Morse’s improved telegraph was officially operational in 1844. It used electromechanical principles that were also used in the Colossus a century later. Electricity also has the advantage of travelling at the speed of light, along wires that can be made extremely thin. What might the world have been like had electromechanical computing developed earlier? The chart also shows the 1964 fluidic computer FLODAC. This was a fascinating idea that was abandoned after a successful proof of concept (although a 1975 film portrayed it as the future). What if that idea had been launched in Victorian Britain?


Church and Turing


Alonzo Church (left) and Alan Turing (right)

Alan Turing is no doubt the most well-known of all computer scientists. His Turing machine is justly famous, and the video below shows a really cool Lego version of it.

However, it is not widely known that the fundamental result in Turing’s 1936 paper (“On Computable Numbers, with an Application to the Entscheidungsproblem” – with errors later corrected in 1937: i, ii, iii) was not actually new. This is revealed on the second page of Turing’s paper, where Turing admits: “In a recent paper Alonzo Church has introduced an idea of ‘effective calculability,’ which is equivalent to my ‘computability,’ but is very differently defined. Church also reaches similar conclusions about the Entscheidungsproblem. The proof of equivalence between ‘computability’ and ‘effective calculability’ is outlined in an appendix to the present paper.”

It seems that Max Newman encouraged Turing to explain how his work related to the result that Alonzo Church had already obtained, lobbied the Secretary of the London Mathematical Society to still permit publication of Turing’s paper, and arranged for Turing to go to Princeton, where Church would be his PhD supervisor (Church also supervised 34 other students). It’s a terrible feeling for a student to be “gazumped” like that, and invaluable to have helpful people backing you up. Turing went on to have a colourful, but relatively short, life. Church lived to 92, but was staid in comparison.

Church’s notation, the lambda calculus, has been quite influential, spawning programming languages such as Lisp and Haskell, as well as more theoretical work. The fundamental feature of lambda calculus, the anonymous function written function (x) { e }, has been incorporated into many other programming languages as well. Consider the following code in R, for example:

i <- function (x) { x }
k <- function (x) { function (y) { x } }
s <- function (x) { function (y) { function (z) { x(z)(y(z)) } } }
y <- function (f) { (function (x) { f(x(x)) }) (function (x) { f(x(x)) }) }

Notice that functions can themselves be arguments to other functions, and in x(x), a function is applied to itself.

The object i is the identity function, so that i(3) reduces to 3. The expressions k(3)(4) and s(k)(k)(3) also reduce to 3. The object y is the Y combinator, with y(k(3)) reducing to 3, and y(i) generating infinite recursion. The following code uses y to generate the factorial function:

fact <- y (function (g) { function (x) { if (x == 0) 1 else x * g (x - 1) } })
fact (3)
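
As a quick check of the reductions described above (assuming the definitions of i, k, s, y, and fact given earlier – it is R’s lazy evaluation of arguments that stops the Y combinator from looping before it is needed):

i(3)  # 3
k(3)(4)  # 3
s(k)(k)(3)  # 3
y(k(3))  # 3
fact(3)  # 6
# y(i) recurses forever (R eventually stops with an infinite-recursion error)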

The appendix that Turing was forced to write was perhaps the most significant aspect of his 1936 paper. Showing that his definition of computability was equivalent to that of Church led to the Church–Turing thesis – the idea that that is the only kind of computability (at least on numbers). Independently of Turing, Emil Post produced another equivalent form of computability in 1936. Yet another form came out of Russia. Even Conway’s Game of Life turns out to capture the same kind of computability, though in an inherently parallel form.

See more on the Church–Turing thesis at plato.stanford.edu. There was also a series of interesting talks presented on Church and Turing in 2012, including this one by Philip Wadler.

While the lambda calculus was probably the version of computability with the largest impact on programming languages, the more practical Von Neumann architecture probably had the largest impact on hardware design. It is difficult to imagine what the world would look like without those two concepts.