# Synthetic Populations

Last year I wrote a Virtual Synthetic Population tool for estimating anthropometric measurements. There are many applications for such measurements, but most nations in the world have not had detailed anthropometric surveys done. Estimation is therefore necessary in many cases, and I’ve used the tool in my consulting work.

The Virtual Synthetic Population tool (written in R, of course!) starts with a detailed sample of anthropometric measurements for a group of several thousand Americans. This sample is then transformed (using simulated annealing) to match available data for various nations (e.g. data collected by the WHO on height and BMI). Other anthropometric variables are then transformed using correlations within the dataset.
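To make the annealing step more concrete, here is a minimal Python sketch of the general idea. It is not the tool's R code, and it rests on several assumptions: the baseline sample is made up, the transformation is a simple shift of height plus a shift and rescale of BMI, and the names `make_baseline`, `transform`, `summarise`, `cost` and `anneal` are hypothetical. The target values are the WHO figures for Nauruan males quoted further down.

```python
import math
import random

# Targets: WHO figures for Nauruan males aged 15+ (quoted later in this post).
TARGET = {"mean_height": 168.1, "mean_bmi": 31.7, "p_bmi25": 0.821, "p_bmi30": 0.557}


def make_baseline(n, rng):
    """Made-up stand-in for the detailed baseline survey: (height_cm, bmi) pairs."""
    return [(rng.gauss(175.5, 7.0), math.exp(rng.gauss(math.log(27.5), 0.2)))
            for _ in range(n)]


def transform(sample, params):
    """Shift heights; shift and rescale BMI about the baseline mean BMI."""
    height_shift, bmi_shift, bmi_scale = params
    mean_bmi = sum(b for _, b in sample) / len(sample)
    return [(h + height_shift, mean_bmi + bmi_scale * (b - mean_bmi) + bmi_shift)
            for h, b in sample]


def summarise(sample):
    """Compute the four statistics that are compared against the targets."""
    n = len(sample)
    return {
        "mean_height": sum(h for h, _ in sample) / n,
        "mean_bmi": sum(b for _, b in sample) / n,
        "p_bmi25": sum(b >= 25 for _, b in sample) / n,
        "p_bmi30": sum(b >= 30 for _, b in sample) / n,
    }


def cost(stats):
    """Squared mismatch. Prevalences are scaled by 10, so a 10-point
    prevalence error counts as much as an error of 1 cm or 1 kg/m^2."""
    return ((stats["mean_height"] - TARGET["mean_height"]) ** 2
            + (stats["mean_bmi"] - TARGET["mean_bmi"]) ** 2
            + (10 * (stats["p_bmi25"] - TARGET["p_bmi25"])) ** 2
            + (10 * (stats["p_bmi30"] - TARGET["p_bmi30"])) ** 2)


def anneal(sample, rng, iterations=2000, start_temp=1.0, cooling=0.997):
    """Simulated annealing over (height_shift, bmi_shift, bmi_scale)."""
    params = (0.0, 0.0, 1.0)  # identity transformation
    current = cost(summarise(transform(sample, params)))
    start_cost = current
    best_params, best_cost = params, current
    temp = start_temp
    for _ in range(iterations):
        candidate = (params[0] + rng.gauss(0, 0.5),
                     params[1] + rng.gauss(0, 0.3),
                     max(0.1, params[2] + rng.gauss(0, 0.03)))
        cand_cost = cost(summarise(transform(sample, candidate)))
        # Metropolis rule: always accept improvements, sometimes accept worse moves.
        if cand_cost < current or rng.random() < math.exp((current - cand_cost) / temp):
            params, current = candidate, cand_cost
            if current < best_cost:
                best_params, best_cost = params, current
        temp *= cooling  # geometric cooling schedule
    return best_params, best_cost, start_cost


rng = random.Random(1)
baseline = make_baseline(300, rng)
best_params, best_cost, start_cost = anneal(baseline, rng)
fitted = summarise(transform(baseline, best_params))
print(round(start_cost, 1), round(best_cost, 3), fitted)
```

The step sizes, starting temperature and cooling rate are arbitrary choices for this sketch. A richer set of transformations would call for a different parameterisation and probably a different schedule.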

For example, bideltoid breadth can be predicted from the combination of height and BMI: a linear regression on height and BMI explains 72% of the variance in bideltoid breadth. (Bideltoid breadth is the maximum horizontal distance between the lateral margins of the upper arms on the deltoid muscles; see diagram below.)
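As an illustration of what such a regression involves, the Python sketch below fits bideltoid breadth on height and BMI by ordinary least squares and reports the share of variance explained (R²). The data and the coefficients used to generate them are invented, so the resulting R² has no connection to the 72% figure above, and `fit_two_predictors` is a hypothetical helper rather than part of the tool.

```python
import random


def fit_two_predictors(x1, x2, y):
    """Ordinary least squares for y = b0 + b1*x1 + b2*x2.

    The predictors are centred first, which reduces the normal equations
    to a 2x2 system that can be solved in closed form.
    """
    n = len(y)
    m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n
    s11 = sum((a - m1) ** 2 for a in x1)
    s22 = sum((b - m2) ** 2 for b in x2)
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    s1y = sum((a - m1) * (v - my) for a, v in zip(x1, y))
    s2y = sum((b - m2) * (v - my) for b, v in zip(x2, y))
    det = s11 * s22 - s12 ** 2
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    b0 = my - b1 * m1 - b2 * m2
    # R^2 = 1 - (residual sum of squares) / (total sum of squares)
    sse = sum((v - (b0 + b1 * a + b2 * b)) ** 2 for a, b, v in zip(x1, x2, y))
    sst = sum((v - my) ** 2 for v in y)
    return b0, b1, b2, 1.0 - sse / sst


# Invented data: heights in cm, BMI in kg/m^2, breadth in cm.
rng = random.Random(42)
heights = [rng.gauss(175.0, 7.0) for _ in range(1000)]
bmis = [rng.gauss(28.0, 5.0) for _ in range(1000)]
breadths = [10.0 + 0.15 * h + 0.5 * b + rng.gauss(0, 2.0)
            for h, b in zip(heights, bmis)]

b0, b1, b2, r2 = fit_two_predictors(heights, bmis, breadths)
print(f"breadth = {b0:.2f} + {b1:.3f}*height + {b2:.3f}*BMI, R^2 = {r2:.2f}")
```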

Given Australia’s geographical location, I’m particularly interested in anthropometric estimates for the countries of the Pacific. The island nation of Nauru makes a good case study:

The population of Nauru is only about 9500. Among other things, the citizens of Nauru are keen players of Australian rules football:

A 2007 WHO report indicates that Nauruan males aged 15+ had a mean height of 168.1 cm and a mean BMI of 31.7 kg/m². Of the roughly 3200 males in this age group, approximately 82.1% had a BMI of 25 or more, and 55.7% had a BMI of 30 or more. Using these statistics and the Virtual Synthetic Population tool, we can estimate the anthropometrics of these 3200 males. In particular, the diagram below shows the estimated height and bideltoid breadth for this population. Taller individuals can be found on the right of the chart, and heavier individuals at the top (each figure is to scale):

Given this synthetic population, we can summarise the distributions of the variables (for example, the 5% and 95% quantiles for bideltoid breadth are 44.8 cm and 61.3 cm respectively). Alternatively, the eight people below (chosen from the synthetic population) define an ellipse in height/BMI space that encloses 95% of the population – that is, 95% of the males aged 15+ fall within the extremes defined by these eight people. We can also use the entire synthetic population (or some random sample of it) within a computer simulation.
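The Python sketch below shows one way such summaries could be computed from a synthetic population; it is not necessarily how the tool does it. It assumes the ellipse is defined by Mahalanobis distance in height/BMI space, with its size set so that 95% of the population falls inside, and it picks the person nearest the ellipse boundary in each of eight evenly spaced directions. The population here is randomly generated with made-up spreads, and `quantile`, `whiten` and `boundary_people` are hypothetical names.

```python
import math
import random


def quantile(values, q):
    """Quantile with linear interpolation between order statistics."""
    ordered = sorted(values)
    pos = q * (len(ordered) - 1)
    lo = int(math.floor(pos))
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (pos - lo) * (ordered[hi] - ordered[lo])


def whiten(people):
    """Map (height, bmi) pairs to uncorrelated, unit-variance coordinates
    using a Cholesky factor of the 2x2 sample covariance matrix."""
    n = len(people)
    mh = sum(h for h, _ in people) / n
    mb = sum(b for _, b in people) / n
    shh = sum((h - mh) ** 2 for h, _ in people) / (n - 1)
    sbb = sum((b - mb) ** 2 for _, b in people) / (n - 1)
    shb = sum((h - mh) * (b - mb) for h, b in people) / (n - 1)
    l11 = math.sqrt(shh)
    l21 = shb / l11
    l22 = math.sqrt(sbb - l21 ** 2)
    whitened = []
    for h, b in people:
        z1 = (h - mh) / l11
        z2 = ((b - mb) - l21 * z1) / l22
        whitened.append((z1, z2))
    return whitened


def boundary_people(people, coverage=0.95, directions=8):
    """Return indices of the people nearest the coverage ellipse in each
    direction, plus the fraction of the population inside the ellipse."""
    z = whiten(people)
    d2 = [z1 * z1 + z2 * z2 for z1, z2 in z]  # squared Mahalanobis distances
    threshold = quantile(d2, coverage)
    radius = math.sqrt(threshold)
    chosen = []
    for k in range(directions):
        angle = 2 * math.pi * k / directions
        tx, ty = radius * math.cos(angle), radius * math.sin(angle)
        nearest = min(range(len(z)),
                      key=lambda i: (z[i][0] - tx) ** 2 + (z[i][1] - ty) ** 2)
        chosen.append(nearest)
    inside = sum(d <= threshold for d in d2) / len(d2)
    return chosen, inside


# Randomly generated stand-in population: (height_cm, bmi) pairs.
rng = random.Random(7)
population = [(rng.gauss(168.1, 7.0), rng.gauss(31.7, 6.5)) for _ in range(1000)]

height_q05 = quantile([h for h, _ in population], 0.05)
height_q95 = quantile([h for h, _ in population], 0.95)
extremes, fraction_inside = boundary_people(population)
print(round(height_q05, 1), round(height_q95, 1), fraction_inside)
for i in extremes:
    print(i, [round(v, 1) for v in population[i]])
```

Using the empirical 95% quantile of the squared distances, rather than a chi-squared cutoff, avoids assuming that height and BMI are jointly normal.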