A note on robust statistical methods

Posted on 23 October 2017 in Pietro's Data Bulletin

TL;DR: Conventional statistical methods like the t-test or the ANOVA F test can perform very poorly if the data does not meet the assumptions of normality and homoscedasticity. So-called robust statistical methods have been developed which perform well even when these assumptions are violated (and we should use them).

In neuroscience, common statistical methods to compare groups or test associations include the omnipresent t-test, the ANOVA F test, Pearson correlation, and least squares regression. And while most people are aware that these methods rely on certain assumptions, most prominently normality and homoscedasticity (equal variances), much less widespread is the awareness that violating these assumptions can strongly impact the Type I error rate (the probability of a false positive) and statistical power.

In the last few decades, a range of statistical techniques known as robust methods have been developed to deal with skewness, outliers, heteroscedasticity, and curvature in the regression line, which represent the main violations of the common assumptions. In general, these methods perform well when the distributions are well-behaved, and they continue to control Type I error rates and provide higher power than conventional methods as we depart from the standard assumptions.

While that is certainly good news, robust methods are generally not part of the standard statistical training of a neuroscientist, and finding the right method for your problem can be a daunting task. We could read the 608-page Introduction to Robust Estimation and Hypothesis Testing by Rand Wilcox (which, from the parts I have read, seems like a great read, by the way), but we won’t. And Wilcox knows that. So, earlier this year, he joined forces with Guillaume Rousselet to write an introduction to robust methods which can reasonably fit into our schedules: ‘A Guide to Robust Statistical Methods in Neuroscience’ [1].

A good chunk of the guide is dedicated to showing how and why conventional methods fail when dealing with skewed distributions, outliers, heteroscedasticity, and curvature. A few simulations are provided which show how skewness can dramatically increase false positives and destroy power, and how this can be avoided by employing robust methods which test the median or the trimmed mean.
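To get a feel for the kind of simulation described above, here is a minimal sketch using only the Python standard library. It is not the guide's own code: the lognormal distribution, the sample size of 20, the 20% trimming, and the hardcoded critical values are all assumptions chosen for illustration. It repeatedly draws skewed samples for which the null hypothesis is true, then counts how often a one-sample t-test and a one-sample trimmed-mean test (the Tukey-McLaughlin statistic, with a Winsorized variance) reject at the nominal 5% level.

```python
import math
import random
from statistics import NormalDist, mean

# Two-sided 5% critical values of Student's t (standard table values).
T_CRIT_DF19 = 2.093  # classic t-test with n = 20
T_CRIT_DF11 = 2.201  # trimmed-mean test: df = n - 2g - 1 with n = 20, g = 4


def t_stat(x, mu0):
    """One-sample Student t statistic for H0: mean == mu0."""
    n = len(x)
    m = mean(x)
    var = sum((v - m) ** 2 for v in x) / (n - 1)
    return (m - mu0) / math.sqrt(var / n)


def trimmed_t_stat(x, mu0, trim=0.2):
    """Tukey-McLaughlin statistic for H0: trimmed mean == mu0."""
    n = len(x)
    g = int(trim * n)                 # number of values trimmed per tail
    xs = sorted(x)
    core = xs[g:n - g]
    # Winsorize: replace each trimmed value by the nearest retained one.
    wins = [xs[g]] * g + core + [xs[n - g - 1]] * g
    wm = mean(wins)
    wvar = sum((v - wm) ** 2 for v in wins) / (n - 1)
    return (1 - 2 * trim) * math.sqrt(n) * (mean(core) - mu0) / math.sqrt(wvar)


def lognormal_trimmed_mean(trim=0.2):
    """Population trimmed mean of a lognormal(0, 1) distribution."""
    nd = NormalDist()
    z = nd.inv_cdf(1 - trim)
    mass = nd.cdf(z - 1) - nd.cdf(-z - 1)
    return math.exp(0.5) * mass / (1 - 2 * trim)


def rejection_rate(stat, mu0, crit, n=20, n_sim=5000, seed=1):
    """Fraction of simulated lognormal samples for which |stat| > crit."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sim):
        x = [rng.lognormvariate(0.0, 1.0) for _ in range(n)]
        if abs(stat(x, mu0)) > crit:
            hits += 1
    return hits / n_sim


# The null hypothesis is true in both cases, so every rejection is a false positive.
t_rate = rejection_rate(t_stat, math.exp(0.5), T_CRIT_DF19)
trim_rate = rejection_rate(trimmed_t_stat, lognormal_trimmed_mean(), T_CRIT_DF11)
print(f"t-test false positive rate:       {t_rate:.3f}")
print(f"trimmed-mean false positive rate: {trim_rate:.3f}")
```

Note that the two tests target different quantities (the mean versus the 20% trimmed mean), so each is given its own true null value. The exact rates depend on the seed and on the number of simulated samples; swapping `lognormvariate` for `gauss` is a quick way to check that both tests stay close to 5% when the data really are normal.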

While conventional statistics does provide you with ways to deal with violations of the classical assumptions, these strategies often turn out to be unsatisfactory. Testing the assumptions is perhaps the most obvious strategy, but there is no guarantee that such a test has enough power to detect deviations that can have sizeable consequences. Other approaches like discarding outliers before comparing means or transforming data are also flawed, and even rank-based and permutation methods do not always perform well.

While no single method dominates in all circumstances, a few general guidelines are presented in the last section of the article. Notably, always use a method which allows for heteroscedasticity, plot your data in an informative way (i.e. never a bar plot), and consider comparing multiple quantiles instead of only the central tendency.
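As a rough illustration of what comparing multiple quantiles can look like, the sketch below computes the differences between the nine deciles of two groups and attaches a percentile bootstrap interval to each. It is a simplification under stated assumptions: it uses the plain sample quantiles from Python's standard library, and the simulated groups, the number of bootstrap samples, and the function names are hypothetical choices, not taken from the guide.

```python
import random
from statistics import quantiles


def decile_differences(a, b):
    """Differences between the nine sample deciles of a and b (a minus b)."""
    return [qa - qb for qa, qb in zip(quantiles(a, n=10), quantiles(b, n=10))]


def bootstrap_cis(a, b, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for each of the nine decile differences."""
    rng = random.Random(seed)
    boots = [
        decile_differences(rng.choices(a, k=len(a)), rng.choices(b, k=len(b)))
        for _ in range(n_boot)
    ]
    lo = int(n_boot * alpha / 2)   # index of the lower percentile
    hi = n_boot - lo - 1           # index of the upper percentile
    cis = []
    for k in range(9):
        col = sorted(row[k] for row in boots)
        cis.append((col[lo], col[hi]))
    return cis


# Two simulated groups with the same centre but different spread.
rng = random.Random(42)
group_a = [rng.gauss(0.0, 1.0) for _ in range(200)]
group_b = [rng.gauss(0.0, 3.0) for _ in range(200)]

diffs = decile_differences(group_a, group_b)
cis = bootstrap_cis(group_a, group_b)
for k, (d, (low, high)) in enumerate(zip(diffs, cis), start=1):
    print(f"decile {k}: difference {d:+.2f}, 95% interval [{low:+.2f}, {high:+.2f}]")
```

In this made-up example the two groups share the same centre, so a comparison of central tendency alone would find little, while the outer deciles differ in opposite directions. The intervals are not corrected for the nine comparisons being made.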

To leave us no excuse for not adopting more robust methods, there is R code available for all functions. The functions written by Wilcox and described throughout his books are available as a single text file on his webpage and on GitHub. A subset of these functions, which covers a number of robust statistical methods and has the advantage of being more thoroughly documented, is maintained by Patrick Mair and available as the CRAN package WRS2. And if you’re thinking ‘wait, but this code is written in R, I don’t use R’: neither do I. I just installed rpy2 and call the functions from Python; it's very easy to set up.


[1] Wilcox, Rand R., and Guillaume A. Rousselet. "A guide to robust statistical methods in neuroscience." bioRxiv (2017): 151811.