Paul McKeigue research group

Usher Institute of Population Health Sciences and Informatics, University of Edinburgh


Paul McKeigue

Marco Colombo

Athina Spiliopoulou

Joe Mellor



A platform for calculating genotypic predictors of binary and quantitative phenotypes


A program to impute SNP genotypes from ultra-low coverage whole genome sequencing data


A program to model admixture using ancestry-informative markers

Sample size calculator

An online sample size calculator is available for calculating the sample size required to learn a classifier from a high-dimensional biomarker panel. A guide to using the online calculator is here.
The statistical methods are described in this paper.


McKeigue P. Quantifying performance of a diagnostic test as the expected information for discrimination: relation to the C-statistic. Statistical Methods for Medical Research 2018, in press. This paper proposes that the expected information for discrimination (expected weight of evidence) should supplant the C-statistic (area under the ROC curve) for quantifying the performance of a diagnostic test or risk predictor, and for evaluating the incremental contribution of a new biomarker. An R package, wevid, to calculate and plot the distributions of weights of evidence is available here (it will be submitted to CRAN eventually). A tutorial on using the package and interpreting the results is on this page.
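As a rough illustration of the quantities discussed in the paper, the sketch below computes weights of evidence, a crude sample estimate of the expected information for discrimination, and the C-statistic for comparison. It is written in plain Python and is not the wevid implementation: the function names, the toy data, and the choice to take simple sample means within cases and controls are all assumptions made for illustration. The weight of evidence is taken here as the log (base 2) of the ratio of posterior odds to prior odds, and the expected information for discrimination as the average of the mean weight of evidence favouring case status among cases and the mean weight favouring control status among controls.

```python
import math


def weights_of_evidence(posterior_probs, prior_prob):
    """Weight of evidence (in bits) favouring case over control.

    W = log2(posterior odds / prior odds).
    Assumes all probabilities lie strictly between 0 and 1.
    """
    prior_odds = prior_prob / (1.0 - prior_prob)
    return [math.log2((p / (1.0 - p)) / prior_odds) for p in posterior_probs]


def expected_information(weights, is_case):
    """Crude sample estimate of the expected information for discrimination.

    Averages the mean of W among cases and the mean of -W among controls.
    """
    in_cases = [w for w, y in zip(weights, is_case) if y]
    in_controls = [-w for w, y in zip(weights, is_case) if not y]
    mean_cases = sum(in_cases) / len(in_cases)
    mean_controls = sum(in_controls) / len(in_controls)
    return 0.5 * (mean_cases + mean_controls)


def c_statistic(scores, is_case):
    """Probability that a random case scores higher than a random control.

    Ties count as one half (the Mann-Whitney formulation of the AUC).
    """
    cases = [s for s, y in zip(scores, is_case) if y]
    controls = [s for s, y in zip(scores, is_case) if not y]
    wins = 0.0
    for a in cases:
        for b in controls:
            if a > b:
                wins += 1.0
            elif a == b:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Toy data (made up for illustration): predicted probabilities and true status.
posterior = [0.9, 0.8, 0.2, 0.1]
status = [1, 1, 0, 0]
prior = 0.5  # assumed prior probability of being a case

w = weights_of_evidence(posterior, prior)
lam = expected_information(w, status)
c_stat = c_statistic(posterior, status)
print(f"expected information: {lam:.3f} bits, C-statistic: {c_stat:.3f}")
```

Unlike the C-statistic, which depends only on the ranking of cases relative to controls, the weights of evidence depend on how far each predicted probability moves from the prior, so two predictors with the same C-statistic can differ in expected information.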


Biomedical Data Science

See the lecture notes and the lab material for the Biomedical Data Science course taught in the academic year 2017/2018 by Marco Colombo as part of the MSc in Operational Research (with Data Science) and the MSc in Statistics.

Recent publications