Why maths is (almost) everything - The Maths of Life and Death by Kit Yates
Wet Wet Wet may have sung that Love is All Around, but in all truth it’s mathematics that really is all around us. In The Maths of Life and Death, Kit Yates tells the true stories of life-changing events in which the application, or misapplication, of mathematics has played a critical role. Accessible, fascinating and in some cases bordering on terrifying, these stories will bring a new appreciation for mathematics and give you some tools and rules for using mathematics to make better decisions in your own life.
In a piece especially written for Foyles, Kit Yates helps us to get to the heart of the question...
Do you know what I mean?
The chances are that each year the most common number of readers for the books at your local library is zero.
It’s a surprising statistic. Many people, when first reading this statement, will come away with the impression that more library books go unborrowed than borrowed each year. But before you go demanding that your hard-earned tax be taken away from local libraries and given to some other worthier cause, let me explain.
If I’d wanted to be really mischievous I could have said “the average number of times the books at your local library are borrowed is zero”. This would have been stretching the truth further, but not quite breaking it. The reason this statement is legitimate is that there are at least three different measures of a data set which we could refer to as the average: the mean, the median and the mode.
The average we are most familiar with is the mean. To find the mean, we add up all the values in a data set and divide by the number of values there are. To find the mean number of times a book is borrowed from the library we would add up the total number of times books were borrowed and divide by the total number of titles offered. If by ‘average’ I had meant the mean, then there’s no getting around it, something would be wrong with our library provision, as a mean of zero implies that the total number of books borrowed is also zero. But this isn’t the average I was talking about.
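The recipe for the mean described above can be sketched in a few lines of Python. The borrow counts here are invented purely for illustration; they are not real library data.

```python
# Hypothetical yearly borrow counts for five titles (made-up numbers).
borrow_counts = [0, 0, 3, 12, 85]

# Mean: total number of borrowings divided by the number of titles.
mean_borrows = sum(borrow_counts) / len(borrow_counts)
print(mean_borrows)  # 20.0

# The mean of non-negative counts is zero only if every count is zero.
never_borrowed = [0, 0, 0, 0, 0]
mean_unused = sum(never_borrowed) / len(never_borrowed)
print(mean_unused)  # 0.0
```

Note that two of the five made-up titles were never borrowed, yet the mean is still 20, because a single popular title dominates the total.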
The mean gives us a good estimate of the overall frequency of borrowing from the library, but it isn’t always the best representation of a data set. In the 1980s the ‘average UK family’ was found to have 2.4 children. Obviously, no-one knows a family with 2.4 children (although I’m sure we all know families in which one of the parents might account for the 0.4). Another, initially surprising, example is the old riddle ‘What is the probability that the next person you meet when walking down the street will have more than the average number of legs?’ The answer is ‘Almost certain’. The very few people who have no legs or one leg are responsible for a small reduction in the mean so that everyone with two legs has more than the average. Clearly it would be ridiculous to assume that the mean correctly characterises any individual in the population.
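The legs riddle is easy to check with a small calculation. The sketch below assumes an imaginary street of 1,000 people, of whom two have one leg and one has none; these proportions are made up for illustration only.

```python
# Hypothetical street of 1,000 people (made-up proportions).
legs = [2] * 997 + [1] * 2 + [0] * 1

# Mean number of legs across everyone on the street.
mean_legs = sum(legs) / len(legs)

# Share of people whose leg count is strictly above the mean.
share_above_mean = sum(1 for n in legs if n > mean_legs) / len(legs)

print(mean_legs)         # 1.996
print(share_above_mean)  # 0.997
```

Under these assumptions the mean sits just below two, so everyone with two legs, which is almost everybody, is above average.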
Although you’ve probably heard the ‘number of legs’ riddle before, perhaps it would surprise you to know that despite having a life expectancy of 78.8 years, which is about four years less than that of British females (at 83 years), the majority of British males will live longer than the overall population life expectancy of 81. At first this statement seems contradictory, but in fact it is due to a discrepancy in the statistics we use to summarise the data. The small, but significant, number of people who die young brings down the mean age of death (the typically quoted life expectancy, in which everyone’s age at death is added together and then divided by the total number of people). Surprisingly, these early deaths take the mean well below another oft-quoted average – the median (the age that falls exactly in the middle; as many people die before this age as after). To find the median of a data set, arrange all the values in order, from smallest to largest, and select the middle value. The median age of death for UK males is 82, meaning that half of them will be at least this age when they die. In this case, the summary statistic typically presented – the mean age at death of 78.8 years – is a particularly misleading descriptor of the population.
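To see how a handful of early deaths can drag the mean below the median, here is a minimal sketch using Python's standard `statistics` module. The nine ages are invented for illustration and are not drawn from UK mortality data.

```python
from statistics import mean, median

# Hypothetical ages at death, with a few early deaths (made-up numbers).
ages_at_death = [2, 45, 70, 80, 82, 84, 86, 88, 90]

mean_age = mean(ages_at_death)      # pulled down by the early deaths
median_age = median(ages_at_death)  # middle value once the ages are sorted

# How many people in this toy data set outlive the mean age at death?
outlive_mean = sum(1 for age in ages_at_death if age > mean_age)

print(median_age)                        # 82
print(round(mean_age, 1))                # 69.7
print(outlive_mean, len(ages_at_death))  # 7 9
```

In this toy example seven of the nine people live longer than the mean age at death, which mirrors the way a majority can outlive a quoted life expectancy.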
The bell curve, or normal distribution, which can be used to characterise many everyday data sets, from heights to IQ scores, is a beautifully symmetrical curve in which half of the data lies on one side of the mean and half on the other. This implies that the mean and the median – the middle-most data value – tend to coincide for characteristics that follow this distribution. Because we are familiar with the idea that this prominent curve can describe real-life information, many of us assume that the mean is a good marker of the ‘middle’ of a data set. It surprises us when we come across distributions in which the mean is skewed away from the median. The distribution of ages at death for British males, displayed below, is clearly far from symmetrical. We typically refer to such distributions as ‘skewed’.
The age-dependence of the number of deaths per year for males in Great Britain follows a skewed distribution. The mean age at death is just under 79, while the median age is 82.
The median gives us a better idea of what is typical than the mean does for skewed data sets, including those on our life expectancy. For the same reason, the median is often used when presenting data on average income. The high wages of the very well-off individuals in our societies tend to distort the mean. The median gives us a better idea than the mean of what to expect of a ‘typical’ individual’s income.
Although the median is superior in some cases, it’s not a silver bullet either. It could be argued that the income of high earners or the deaths of younger people should not be neglected, as they are as valid as any other data points in the set. The statistic we choose to use should depend on the context of the point we are trying to get across.
I’ve often come across articles which report the psychological phenomenon of illusory superiority, more commonly known as the better-than-average effect. They usually start by reporting a study that claims to have found that most people rate themselves as above-average drivers or better-than-average lovers. The articles’ authors then go on to poke fun at these people because, as they reason, the maths says that, by definition, half of all people must be below average. But that is only true if, by average, you mean the median. Clearly, if you’re taking the mean as your definition of average, it’s possible for almost everyone to have an above-average number of legs or for the majority of people to live longer than average.
The same is true if we consider our definition of average to be the mode. The mode is the most popular value in a data set. It would tell us that the most common number of children of a UK family is not 2.4, but the more acceptable value of two. It even makes sense when the values in the data set are not numeric. You could use the mode to answer more nuanced questions, about the UK’s average pet for example. The mode gives you the sensible answer ‘dog’ rather than the chimera you might expect by using the mean.
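A short sketch shows the mode at work on both numeric and non-numeric data, using `statistics.mode` from Python's standard library. Both lists are hypothetical survey answers made up for this example.

```python
from statistics import mode

# Hypothetical survey answers (made-up data, not real UK statistics).
children_per_family = [0, 1, 1, 2, 2, 2, 2, 3, 3, 5]
pets = ["dog", "cat", "dog", "fish", "cat", "dog"]

# The mode is simply the value that occurs most often.
typical_children = mode(children_per_family)
typical_pet = mode(pets)

# The mean, by contrast, need not match any real family.
mean_children = sum(children_per_family) / len(children_per_family)

print(typical_children)  # 2
print(typical_pet)       # dog
print(mean_children)     # 2.1
```

The mode works on the list of pets because it only needs to count occurrences, whereas a mean of "dog", "cat" and "fish" is not defined at all.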
I was thinking of the mode when I came up with the slightly contrived statement “the average number of times the books at your local library are borrowed is zero”. Of all the books a library lends out, some will have been borrowed thousands or tens of thousands of times. Others will have been borrowed hundreds of times and some only tens. A few will not have been borrowed at all. There will be one number of borrowings that is more common than the others. It’s unlikely that this will be some high number like 23,542. What are the chances that more than one book is borrowed exactly 23,542 times? As it happens, although there aren’t huge numbers of books that don’t get borrowed at all, even the very few books that remain unborrowed each year are enough to ensure that zero is the most frequent number of times a book is borrowed. In this case the mode clearly isn’t a very useful descriptor of the data set – unless of course you want to catch people’s attention with what looks like a surprising statistic.
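The library claim can be reproduced with a toy catalogue. The sketch below assumes thirteen titles with invented borrow counts, in which only three titles go unborrowed while every other title has its own distinct count.

```python
from collections import Counter

# Hypothetical borrow counts for thirteen titles (made-up numbers).
borrow_counts = [0, 0, 0, 1, 4, 7, 12, 19, 33, 58, 140, 960, 23542]

# Tally how many titles share each borrow count, then take the top one.
tally = Counter(borrow_counts)
most_common_count, titles_with_it = tally.most_common(1)[0]

# Share of titles that were borrowed at least once.
borrowed_share = sum(1 for c in borrow_counts if c > 0) / len(borrow_counts)

print(most_common_count, titles_with_it)  # 0 3
print(round(borrowed_share, 2))           # 0.77
```

Even though roughly three quarters of these made-up titles were borrowed, zero is still the mode, because no other borrow count is shared by more than one title.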
None of mean, median or mode is correct in any objective sense. The different averages are simply useful in different contexts and for describing distinct aspects of real-world phenomena: from the median being employed to clean up digital photos or prevent life-endangering false alarms in intensive care units to ‘regression to the mean’ giving us a false impression of how effective alternative therapies are. These and many other practical applications of mathematics, alongside the often life-or-death experiences that accompany them, are the essence of The Maths of Life and Death. The stories in the book should empower anyone, no matter their mathematical background, to take the power of mathematics into their own hands, because sometimes maths really is a matter of life and death. Reading the book will guarantee that next time, when you’re wowing someone with a statistic you’ve read about what happens on average, you’ll be confident of exactly what you mean.
Kit Yates is a Senior Lecturer in mathematical biology at the University of Bath. His job consists of taking real-world phenomena and uncovering the mathematical truths that lie behind them. He extracts the common patterns that underlie these processes and communicates them. He works in applications as diverse as embryonic disease, the patterns on eggshells and the devastating swarming of locust plagues - teasing out the mathematical connections in the process.