Synthetic Populations

Last year I wrote a Virtual Synthetic Population tool for estimating anthropometric measurements. There are many applications for such measurements, but most nations in the world have not had detailed anthropometric surveys done. Estimation is therefore necessary in many cases, and I’ve used the tool in my consulting work.

The Virtual Synthetic Population tool (written in R, of course!) starts with a detailed sample of anthropometric measurements for a group of several thousand Americans. This sample is then transformed (using simulated annealing) to match available data for various nations (e.g. data collected by the WHO on height and BMI). Other anthropometric variables are then transformed using correlations within the dataset.
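
To give a flavour of the matching step, here is a minimal sketch in R using the built-in simulated annealing method of optim. It is not the tool's actual code: the starting sample is simulated, and the two-parameter shift-and-stretch transformation, the loss function, and its weighting are all assumptions made for illustration. The three targets are the 2007 WHO figures for Nauruan males quoted later in this post.

```r
set.seed(42)

# Simulated stand-in for the source sample's BMI values (NOT the real data)
bmi0 <- rlnorm(3000, meanlog = log(27), sdlog = 0.18)

# Targets: mean BMI, proportion with BMI >= 25, proportion with BMI >= 30
target_mean <- 31.7
target_p25  <- 0.821
target_p30  <- 0.557

# Hypothetical transformation: move the mean to par[1], stretch spread by par[2]
transform_bmi <- function(par, x) par[1] + par[2] * (x - mean(x))

# Squared mismatch between the transformed sample and the targets
# (the factor of 100 on the proportions is an arbitrary weighting)
loss <- function(par) {
  y <- transform_bmi(par, bmi0)
  (mean(y) - target_mean)^2 +
    100 * (mean(y >= 25) - target_p25)^2 +
    100 * (mean(y >= 30) - target_p30)^2
}

start <- c(mean(bmi0), 1)   # identity transformation
res <- optim(start, loss, method = "SANN", control = list(maxit = 5000))

bmi1 <- transform_bmi(res$par, bmi0)
print(c(mean = mean(bmi1), p25 = mean(bmi1 >= 25), p30 = mean(bmi1 >= 30)))
```

Simulated annealing suits this kind of problem because the proportion targets make the loss a step function, which gradient-based optimisers handle poorly.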

For example, bideltoid breadth (the maximum horizontal distance between the lateral margins of the upper arms on the deltoid muscles – see diagram below) can be predicted from the combination of height and BMI (specifically, linear regression on height and BMI explains 72% of the variance in bideltoid breadth).
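
A regression of this kind can be sketched in a few lines of R. The data here are simulated, and the coefficients and noise level are invented for illustration, so the resulting R² will not reproduce the 72% figure from the real sample.

```r
set.seed(1)
n <- 2000

# Simulated stand-in for the anthropometric sample (NOT the real data)
height <- rnorm(n, mean = 1.75, sd = 0.07)   # metres
bmi    <- rnorm(n, mean = 27, sd = 4)        # kg/m^2

# Hypothetical relationship, chosen only for illustration
bideltoid <- 10 + 15 * height + 0.6 * bmi + rnorm(n, sd = 1.8)   # cm

# Linear regression of bideltoid breadth on height and BMI
fit <- lm(bideltoid ~ height + bmi)
r2  <- summary(fit)$r.squared   # proportion of variance explained
print(coef(fit))
print(round(r2, 2))

# Predicted breadth for an individual with a given height and BMI
new  <- data.frame(height = 1.681, bmi = 31.7)
pred <- predict(fit, newdata = new)
print(pred)
```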

Given Australia’s geographical location, I’m particularly interested in anthropometric estimates for the countries of the Pacific. The island nation of Nauru makes a good case study:

The population of Nauru is only about 9500. Among other things, the citizens of Nauru are keen players of Australian rules football:

A 2007 WHO report indicates that Nauruan males aged 15+ had a mean height of 168.1 cm and a mean BMI of 31.7 kg/m². Of the roughly 3200 males in this age group, approximately 82.1% had a BMI of 25 or more, and 55.7% had a BMI of 30 or more. Using these statistics and the Virtual Synthetic Population tool, we can estimate the anthropometrics of these 3200 males. In particular, the diagram below shows the estimated height and bideltoid breadth for this population. Taller individuals can be found on the right of the chart, and heavier individuals at the top (each figure is to scale – click to zoom):
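
As a quick piece of arithmetic on the WHO figures above, the snippet below converts the percentages into approximate head counts and works out the weight of a man with exactly the mean height and the mean BMI. That weight is only indicative: it is not the population's mean weight, which depends on the joint distribution of height and BMI.

```r
n_males     <- 3200
mean_height <- 1.681   # metres
mean_bmi    <- 31.7    # kg/m^2

# Approximate head counts implied by the percentages
counts <- round(n_males * c(bmi_25_plus = 0.821, bmi_30_plus = 0.557))
print(counts)   # 2627 and 1782

# Weight of a man at the mean height with the mean BMI
weight_at_means <- mean_bmi * mean_height^2
print(round(weight_at_means, 1))   # 89.6 kg
```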

Given this synthetic population, we can summarise the distributions of the variables (for example, the 5% and 95% quantiles for bideltoid breadth are 44.8 cm and 61.3 cm respectively). Alternatively, the eight people below (chosen from the synthetic population) define an ellipse in height/BMI space that encloses 95% of the population – that is, 95% of the males aged 15+ fall within the extremes defined by these eight people. We can also use the entire synthetic population (or some random sample of it) within a computer simulation.
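
Both kinds of summary can be sketched in R. The population below is simulated (independent normal height and BMI, which is certainly too simple), and the way the eight boundary people are picked, as the individuals nearest to eight evenly spaced points on a 95% ellipse, is an assumption rather than a description of the tool.

```r
set.seed(7)
n <- 3200

# Simulated stand-in for the synthetic population (NOT the tool's output)
height <- rnorm(n, mean = 168.1, sd = 6.5)   # cm
bmi    <- rnorm(n, mean = 31.7, sd = 6)      # kg/m^2
pop    <- cbind(height, bmi)

# 5% and 95% quantiles of one variable
print(quantile(bmi, c(0.05, 0.95)))

# 95% ellipse in height/BMI space, assuming rough bivariate normality
mu <- colMeans(pop)
S  <- cov(pop)
r2 <- qchisq(0.95, df = 2)   # squared Mahalanobis radius of the ellipse
coverage <- mean(mahalanobis(pop, mu, S) <= r2)
print(coverage)

# Eight points evenly spaced (in angle) around the ellipse boundary
theta    <- seq(0, 2 * pi, length.out = 9)[-9]
unit     <- cbind(cos(theta), sin(theta))
boundary <- sweep(sqrt(r2) * unit %*% chol(S), 2, mu, "+")

# The population member closest to each boundary point
nearest <- apply(boundary, 1, function(b) which.min(mahalanobis(pop, b, S)))
eight   <- pop[nearest, ]
print(round(eight, 1))
```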

Cheerleaders in the news

The Internet appears to have melted down again over the above infographic from the University of Washington cheerleading team. Now I must admit to being bemused by the whole US cheerleading phenomenon, but nothing here seems particularly surprising. Cheerleading is a form of public entertainment combining dance and gymnastics. It relies partly on communicating facial expression to distant spectators (hence the lipstick and false eyelashes). America being what it is, there is often a big emphasis on being “family friendly,” rather than “sexy” (hence the restrictions on makeup and tattoos). Avoiding injury is also important (hence the restrictions on body piercing and jewellery).

Simple cheerleading stunt with “flyer” (top), “bases” (sides), and “back spot” (rear) – photo: “unoguy”

The more disturbing feature of the infographic, at least at first sight, is the requirement for bare midriffs in auditions. This is apparently a technique for circumventing laws against having weight restrictions, by allowing fitness to be judged visually. Different team roles (see above) call for different body types, with “flyers” needing to be lighter, while “bases” and “back spots” need to be stronger (see below). A healthy BMI is required in each case. Fitness restrictions are therefore not unreasonable.

I cannot help but think that the US has larger problems right now than the kind of eyeshadow that cheerleaders in Washington state wear. Cheerleaders themselves have far more serious problems than being asked to wear false eyelashes (injury rates, for example, and underpayment). And US universities have far more serious problems than whether their cheerleaders sport fake tans. So I’m still kind of confused as to why this infographic is big news.

For aficionados, the R code for the BMI diagram is:

bmif <- function (kg, m) { kg/m^2 }

# Inverse of bmif: weight (kg) at height h (m) for a given BMI b
invf <- function (b) {
	Vectorize (function (h) { b*h^2 })
}

# Shade the region to the right of the BMI = b curve
regionf <- function (b, clr) {
	f <- invf(b)
	hh <- seq(1.4,2.1,0.001)
	xx <- f(hh)
	hh <- c(hh, 2.1, 1.4, 1.4)
	xx <- c(xx, 200, 200, xx[1])
	polygon(xx, hh, col = clr)
}

# Draw the BMI = b curve as a grey line
linef <- function (b) {
	f <- invf(b)
	hh <- seq(1.4,2.1,0.001)
	xx <- f(hh)
	lines(xx, hh, lwd=2, col="grey40")
}

# Format a height in inches as feet and inches
inchf <- Vectorize(function (h) {
	ft <- h %/% 12
	h <- h %% 12
	if (h == 0) paste (ft, "'", sep="")
	else paste (ft, "' ", h, '"', sep="")
})
postscript(file="BMI_Cheer.eps", onefile=FALSE, horizontal=FALSE, width = 10, height = 6)
par(mar=c(4.3, 4.3, 2, 3)) # c(bottom, left, top, right)
plot(c(40,140), c(1.5,2), type="n", xlab="Weight (kg)", ylab="Height (m)", las=1, cex.axis=1.2, cex.lab=1.5)

inches <- 5*12 + seq(0,21,3)
axis(side=4, at=inches*0.0254, labels=inchf(inches), las=1, cex.axis=1.2)

pounds <- seq(100,300,50)
axis(side=3, at=pounds*0.45359237, labels=paste(pounds, " lb", sep=""), cex.axis=1.2)

regionf(18.5, "limegreen")
regionf(25, "darkorange")
regionf(30, "tomato")
for (i in seq(1.5,2,0.1)) { lines(c(0,300),c(i,i), col="grey80") }
for (i in seq(40,160,10)) { lines(c(i,i), c(0,3), col="grey80") }
labs <- c("Underweight\n(BMI < 18.5)", "Normal\n(BMI 18.5–25)", "Overweight\n(BMI 25–30)",  "Obese\n(BMI > 30)")
text(c(50, 82, 104.5, 125), rep(1.95, 4), labs, font=2, cex=0.8)

names <- c("Cheerleaders\noverall", "Bases", "Flyers", "Back spots")
heights <- c(160.2, 161.2, 155.8, 170.4)/100
weights <- c(57.2, 62.3, 50.4, 63.5)
points(weights, heights, pch=19, cex=1.2, lwd=3)
text(weights, heights, pos=c(2,4,1,3), cex=1.2, labels=names, offset=0.5)

dev.off() # close the PostScript device so that the file is written out
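
As a quick check on the plotted points, the same heights and weights can be run through the BMI formula. All four values fall inside the healthy 18.5–25 band, although the bases sit close to its upper edge. The vector name roles is introduced here only for labelling.

```r
bmif <- function (kg, m) { kg/m^2 }

roles   <- c("Cheerleaders overall", "Bases", "Flyers", "Back spots")
heights <- c(160.2, 161.2, 155.8, 170.4)/100   # metres
weights <- c(57.2, 62.3, 50.4, 63.5)           # kg

bmi <- setNames(round(bmif(weights, heights), 1), roles)
print(bmi)   # 22.3, 24.0, 20.8, 21.9

# TRUE if every role falls in the healthy range
print(all(bmi >= 18.5 & bmi < 25))
```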


BMI revisited

One of the more infamous datasets floating around the Internet is the set of body measurements of Playboy centrefolds. Wired magazine reported on this dataset in 2009 and again last year, although I found their analysis a little disappointing.

Playboy has been around since 1953, and has influenced attitudes to women in a number of (largely negative) ways in that time. However, the numeric data has a life of its own, as Wired points out (and, in a complex system, tracking one component over time sheds light on the system as a whole). Of particular interest, in the light of my previous post about BMI, is the way Playboy sells a certain “ideal” female body shape. Playboy influences and is influenced by the general Zeitgeist in this regard (e.g. the fashion industry), but is also presumably influenced by inherently biological male preferences.

I’ve taken the 2009 dataset used by Wired, added more recent data screen-scraped from Wikipedia, and done my own analysis (all in R, of course). The chart below shows the results. These numbers may, as Wired points out, not be entirely accurate – but even if they are not, Playboy is still selling the body shape which those numbers describe.

The first thing to note is that 52% of Playboy centrefolds had a BMI (body mass index) below the healthy green zone on the chart. Like the fashion industry, Playboy is selling an unhealthily underweight female body shape. However, the average fashion model has a BMI of 17.6 (blue line on the chart), and the mean BMI of Playboy centrefolds has always been higher than that. This may be a case of inherently biological male preferences moderating the Zeitgeist’s drive towards ever thinner models.

The black line on the chart shows the smoothed mean BMI values (using first-degree loess smoothing). There are some interesting temporal variations here. The mean BMI of Playboy centrefolds was stable at a healthy 19.4 up to early 1965, but then dropped steadily to an unhealthy 17.9 in early 1986. The mean BMI then increased again to 18.5 in early 2009, and then dropped sharply again to 18.0 in early 2016 (the last three trends are statistically significant at p = 0.00000000003%, 0.12%, and 0.53% respectively). The two minima show up again if we look at the lowest BMI values – those 16 or less. The table below shows that there were two of these around 1980, and four others after 2009. These changes may reflect a movement from the Twiggy generation to the Cindy Crawford generation to the Kate Moss generation. Whatever lies behind the numbers, however, it is disturbing to think that the 1970s pressures on women to become unhealthily underweight may have returned stronger than ever.
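
For anyone wanting to try this kind of smoothing, here is a minimal sketch in R. The monthly data are simulated around the turning points described above (the noise level is invented), the span of 0.3 is an assumption, and testing a trend by linear regression on one segment is just one plausible approach, not necessarily the one behind the p-values quoted.

```r
set.seed(3)

# Simulated monthly BMI values (NOT the real dataset), built around the
# turning points described in the text
year <- seq(1954, 2016, by = 1/12)
true_mean <- approx(c(1954, 1965, 1986, 2009, 2016),
                    c(19.4, 19.4, 17.9, 18.5, 18.0), xout = year)$y
bmi <- true_mean + rnorm(length(year), sd = 0.9)

# First-degree (locally linear) loess smoothing; the span is an assumption
fit    <- loess(bmi ~ year, degree = 1, span = 0.3)
smooth <- predict(fit)

plot(year, bmi, pch = 16, cex = 0.4, col = "grey60",
     xlab = "Year", ylab = "BMI")
lines(year, smooth, lwd = 2)

# One way to test a trend: linear regression over a single segment
seg   <- year >= 1965 & year <= 1986
trend <- summary(lm(bmi[seg] ~ year[seg]))$coefficients
slope <- trend[2, "Estimate"]    # BMI units per year
pval  <- trend[2, "Pr(>|t|)"]
print(c(slope = slope, p = pval))
```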

Month              Height   Weight   BMI
July, 1978         168 cm   43 kg    15.3
October, 1982      173 cm   48 kg    16.0
February, 2010     170 cm   46 kg    16.0
September, 2013    160 cm   39 kg    15.1
July, 2014         173 cm   48 kg    16.0
August, 2015       173 cm   45 kg    15.2
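
Recomputing BMI from the rounded heights and weights in the table gives values within 0.2 of those listed. My assumption is that the small differences arise because the listed BMIs were calculated from the unrounded measurements; the column names below are introduced only for this check.

```r
tab <- data.frame(
  month      = c("July 1978", "October 1982", "February 2010",
                 "September 2013", "July 2014", "August 2015"),
  height_cm  = c(168, 173, 170, 160, 173, 173),
  weight_kg  = c(43, 48, 46, 39, 48, 45),
  bmi_listed = c(15.3, 16.0, 16.0, 15.1, 16.0, 15.2)
)

# BMI = weight (kg) / height (m)^2, from the rounded table values
tab$bmi_recomputed <- round(tab$weight_kg / (tab$height_cm / 100)^2, 1)
tab$difference     <- round(tab$bmi_recomputed - tab$bmi_listed, 1)
print(tab)
```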

For comparison, the chart below shows the estimated BMI for some winners of the Miss World and Miss Universe beauty pageants (in the absence of weight data, BMI is estimated from body measurements, where these are available, using a regression equation derived from a standard anthropometric database). Here 57% of the women are underweight. The mean estimated BMI is roughly steady at 18.7 (just inside the healthy range) up until the year 2000, but from 2000 onwards there is a significant decline (p = 4%). More data would be useful here, but it does seem that (in spite of bans on underweight fashion models in some countries) there are indeed renewed pressures on women to become unhealthily underweight. Furthermore, the post-2009 downturn in the mean BMI of Playboy centrefolds seems to have been in reaction to a more global trend that had already begun a decade earlier.

The words of the fictional demon in C. S. Lewis’s 1942 book The Screwtape Letters still seem relevant today (although Lewis clearly did not foresee the current industry in silicone breast implants, which Wired also comments on):

It is the business of these great masters [of the Lowerarchy] to produce in every age a general misdirection of what may be called sexual ‘taste.’ This they do by working through the small circle of popular artists, dressmakers, actresses, and advertisers who determine the fashionable type. … The age of jazz has succeeded the age of the waltz, and we now teach men to like women whose bodies are scarcely distinguishable from those of boys. Since this is a kind of beauty even more transitory than most, we thus aggravate the female’s chronic horror of growing old … We have engineered a great increase in the licence which society allows to the representation of the apparent nude (not the real nude) in art, and its exhibition on the stage or the bathing beach. It is all a fake, of course; the figures in the popular art are falsely drawn … As a result we are more and more directing the desires of men to something which does not exist – making the rôle of the eye in sexuality more and more important and at the same time making its demands more and more impossible.

Magazine covers and BMI

The Internet is currently in a meltdown over a magazine cover featuring “fuller figured” model Ashley Graham, who is allegedly “unhealthy” (see image above).

Now, there are far more important things going on in the world at the moment than women on magazine covers, but it’s worth pointing out that (with a body mass index or BMI of about 25.1) Graham is in fact sitting pretty much right on the upper boundary of the healthy weight range (see diagram below). The storm in a teacup therefore illustrates to what extent an industry dominated by anorexia has forgotten what the normal female body actually looks like (the average fashion model has a BMI of 17.6, well into the unhealthily underweight range).