The Past and Perils of Statistical Analysis

Over the past six months, I have increasingly used statistical analysis to extract usable insights from project datasets. When I later had to discuss these findings in the relevant report, I became keenly aware of the limitations and potential pitfalls of statistical analysis in producing an objective evaluation. This work made me curious about the origins of statistical techniques, and about whether their unaddressed limitations and risks have influenced outcomes before.


The roots of statistical analysis trace back to ancient civilisations including Mesopotamia and Egypt, where rudimentary forms of data collection and analysis emerged for agricultural and administrative purposes. Modern statistical methods began to take shape in the 17th century with the work of John Graunt, often regarded as the founder of demography, who analysed mortality data in England to understand patterns of disease. Over time, the field expanded significantly through the contributions of figures such as Carl Friedrich Gauss (often called the ‘Prince of Mathematicians’), who claimed to have developed the method of least squares, and Adolphe Quetelet, who was among the first to apply probability and statistics outside astronomy, notably to the study of human populations. Statistical analysis found early application in astronomy, economics, and demography, laying the groundwork for its widespread use in countless fields today.


Statistical analysis has evolved into a general approach to modern questions through a combination of theoretical advances, technological innovation, and practical application across disciplines. The refinement of statistical theory and methods in the early 20th century provided a solid framework for analysing complex datasets and drawing reliable conclusions. The invention of computers in the mid-20th century then revolutionised data processing, enabling statisticians to tackle larger datasets and develop far more sophisticated models. The interdisciplinary nature of modern research further propelled the adoption of statistical analysis, as researchers across fields recognised its usefulness in uncovering patterns, making predictions, and informing decisions.


Statistical analysis techniques offer a powerful toolset for understanding the world around us, but that does not mean they come without risk. A notable early example of mistakes arising from misinterpreted statistics is the case of Florence Nightingale’s polar area diagrams in the mid-19th century. Nightingale presented diagrams showing the causes of mortality among soldiers during the Crimean War, diagrams that would later be considered icons of graphic design and data visualisation. While her diagrams highlighted the significance of preventable diseases, they also contained errors due to misinterpretation of the data, leading to exaggerated claims about the impact of hospital conditions relative to battlefield injuries. Another example is the reception of Gregor Mendel’s pea plant experiments in the 19th century: his groundbreaking work on genetics was initially overlooked, in part because the statistical patterns in his data were misinterpreted. His discovery of ‘Mendelian inheritance’, which later earned him recognition as the ‘father’ of modern genetics, was disputed on the basis of ill-informed judgments and assumptions about his experimental parameters, and his results were dismissed as simply “too good to be true”.

Health Analysis

As we’ve just seen, statistical analysis can lead to unintended outcomes, or even worsen situations, within healthcare when it is not applied or interpreted correctly. One example is overreliance on a single summary metric, such as the mean, which may obscure important variation within a population. For instance, if a hospital focuses solely on reducing the average length of stay, it might discharge patients prematurely, leading to higher readmission rates or poorer patient outcomes. It would be more prudent to consider the wider context of each individual patient and to recognise the limits of the statistical model.
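To illustrate how a mean can hide what is happening within a patient population, here is a minimal Python sketch. The lengths of stay are invented for illustration, and the `summarise` helper and its 14-day threshold are assumptions of this example rather than any clinical standard.

```python
from statistics import mean, median

# Hypothetical lengths of stay in days for ten patients (illustrative only).
STAYS = [2, 2, 3, 3, 3, 4, 4, 5, 21, 28]


def summarise(stays, long_stay_days=14):
    """Return several views of the same data, not just the mean."""
    long_stays = [s for s in stays if s > long_stay_days]
    return {
        "mean": mean(stays),
        "median": median(stays),
        "longest": max(stays),
        # Fraction of patients staying longer than the assumed threshold.
        "share_long": len(long_stays) / len(stays),
    }


summary = summarise(STAYS)
for name, value in summary.items():
    print(f"{name}: {value}")
```

With these made-up figures the mean is 7.5 days while the median is 3.5 days. Two unusually long stays pull the mean upwards, so trying to lower it by discharging typical patients sooner would target the wrong group.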

Additionally, in clinical trials and research studies, misinterpretation or selective reporting of statistical results can lead to ineffective or even harmful treatments being adopted for reasons other than their clinical results. Overall, while statistical analysis is a valuable tool in healthcare, it’s essential to approach it with caution and to consider the broader context in order to avoid unintended consequences.
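One way to see why selective reporting is risky is to work out how often chance alone produces a "significant" result when many outcomes are tested. The sketch below is a simplified illustration: it assumes the tests are independent, that the treatment has no real effect on any outcome, and that the conventional 0.05 significance threshold is used. The function name is hypothetical.

```python
def chance_of_false_positive(n_outcomes, alpha=0.05):
    """Probability that at least one of n independent tests of a true
    null hypothesis falls below the significance threshold alpha."""
    return 1 - (1 - alpha) ** n_outcomes


for n in (1, 5, 20):
    print(f"{n} outcomes tested: {chance_of_false_positive(n):.0%}")
```

Under these assumptions, testing 20 outcomes gives roughly a 64% chance that at least one looks significant by chance. If only that outcome is reported, a treatment with no effect can appear to work.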


Overcoming these consequences depends largely on improving statistical literacy: educating people and teaching them to critically evaluate information that is presented with statistical data. Understanding fundamental concepts like probability, correlation, and variability empowers individuals to assess information critically, understand risks, and make evidence-based choices in healthcare and in other aspects of a data-driven society.