A platform for calculating genotypic predictors of binary and quantitative phenotypes

A program to impute SNP genotypes from ultra-low coverage whole genome sequencing data

A program to model admixture using ancestry-informative markers

An online sample size calculator is available for calculating the sample size required to learn to classify with a high-dimensional biomarker panel. A guide to using the calculator is also available, and the statistical methods are described in a separate paper.

McKeigue P. Quantifying performance of a diagnostic test as the expected information for discrimination: Relation to the C-statistic. *Statistical Methods for Medical Research* 2018, in press.

This paper proposes that the expected information for discrimination (expected weight of evidence) should supplant the C-statistic (area under the ROC curve) for quantifying the performance of a diagnostic test or risk predictor, and for evaluating the incremental contribution of a new biomarker. The wevid R package is available to calculate and plot the distributions of weights of evidence. A tutorial on using the package and interpreting the results is also available.
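To make the two quantities concrete, the following is a minimal Python sketch, not the wevid package's implementation. It assumes that the weight of evidence for an individual is the base-2 logarithm of the ratio of posterior odds to prior odds, and that the expected information for discrimination is estimated as the average of the mean weight of evidence favouring the correct assignment in cases and in controls. The function names and the toy data are hypothetical, chosen only for illustration.

```python
import math


def weight_of_evidence(prior_p, posterior_p):
    """Weight of evidence (in bits) favouring case over control status.

    Assumed definition: log2 of (posterior odds / prior odds).
    """
    prior_odds = prior_p / (1.0 - prior_p)
    posterior_odds = posterior_p / (1.0 - posterior_p)
    return math.log2(posterior_odds / prior_odds)


def expected_information(weights, labels):
    """Average of the mean weight of evidence favouring the correct
    assignment in cases (label 1) and in controls (label 0)."""
    cases = [w for w, y in zip(weights, labels) if y == 1]
    controls = [w for w, y in zip(weights, labels) if y == 0]
    mean_cases = sum(cases) / len(cases)
    mean_controls = sum(controls) / len(controls)
    # In controls, evidence favouring the correct assignment is -W.
    return 0.5 * (mean_cases - mean_controls)


def c_statistic(scores, labels):
    """Proportion of case-control pairs in which the case has the higher
    score, counting ties as one half (area under the ROC curve)."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(
        1.0 if a > b else 0.5 if a == b else 0.0
        for a in cases
        for b in controls
    )
    return wins / (len(cases) * len(controls))


# Toy data (hypothetical): prior probability 0.5, four individuals.
prior = 0.5
posteriors = [0.8, 0.8, 0.2, 0.2]
labels = [1, 1, 0, 0]

weights = [weight_of_evidence(prior, p) for p in posteriors]
info_bits = expected_information(weights, labels)
auc = c_statistic(posteriors, labels)
```

In this toy example the C-statistic is at its maximum because every case is ranked above every control, whereas the expected information for discrimination would continue to increase if the posterior probabilities moved closer to 0 and 1. This is one way to see why the two measures can rank predictors differently.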

See the lecture notes and the lab material for the Biomedical Data Science course taught in the academic year 2017/2018 by Marco Colombo as part of the MSc in Operational Research (with Data Science) and the MSc in Statistics.