Marco Colombo


About me

I am currently working as an independent consultant, specialising in biomedical data analysis, biomarker discovery, and the development of related software.

Between October 2009 and November 2018 I was based in the Centre for Population Health Sciences, now part of the Usher Institute of Population Health Sciences and Informatics, as a Research Fellow and (from June 2016) as a Senior Research Fellow in Paul McKeigue's group. Before that, I was a PhD student (from October 2003) and then a Postdoctoral Research Assistant (from April 2007) within the Edinburgh Research Group in Optimization at the School of Mathematics.

Research interests

Biomarker discovery and prediction of diabetes complications

November 2014 to now: I have been working with Helen Colhoun on data from the SDRNT1BIO cohort, a large cohort of patients with type 1 diabetes linked to electronic health data and genotypes, in which proteins, metabolites, tryptic peptides and glycans are measured in a subset of samples. One of the studies examined the relationship between N-glycans and progression of renal disease (Diabetes Care 2018). We also investigated the relationship of persistent C-peptide secretion with the genetic architecture of type 1 diabetes (BMC Medicine 2019), and the contemporary rates of progression of renal disease and their predictors (Diabetologia 2020a). Our work on biomarkers of renal progression has produced its first results (Diabetologia 2019b, Diabetologia 2020b, Pediatric Diabetes 2020) and other work is ongoing.

March 2012 to December 2015: I worked on the SUMMIT project, a European research consortium dedicated to diabetes complications, focusing in particular on non-genetic biomarkers, data mining and in-silico modelling. The complications we principally looked at were cardiovascular disease (Diabetologia 2015, Atherosclerosis 2018) and progression of diabetic kidney disease (Kidney International 2015, Diabetologia 2019a). During the project we also tackled the problem of in-silico identification of unknown metabolites (Journal of Chromatography B 2017).


February 2016 to now: I am one of the core developers of GENOSCORES, a platform that provides a framework for calculating genotypic predictors of binary and quantitative phenotypes from publicly available summary results of genome-wide association studies of multiple phenotypes, -omic measurements and gene expression. It has been used to explore pleiotropy in the genetic determinants of male pattern baldness (Nature Communications 2017), to understand the relationship between persistent C-peptide secretion in type 1 diabetes and the genetic architecture of diabetes (BMC Medicine 2019), and to provide a novel approach to integrating genetics and neuroimaging in the exploration of white matter microstructure (Translational Psychiatry 2020).

December 2015 to November 2018: I was involved in the MATURA project, a consortium working on rheumatoid arthritis. The first results concerned genome-wide association studies of response to methotrexate (The Pharmacogenomics Journal 2018a) and tumour necrosis factor inhibitor therapy (The Pharmacogenomics Journal 2018b), prediction of response from genome-wide SNP data (Genetic Epidemiology 2018), validity of a 2-component imaging-based disease activity score (Rheumatology 2019), and the discovery of potentially interesting new associations (Annals of the Rheumatic Diseases 2019).

January 2014 to December 2016: I worked on the Stratified Medicine Scotland Innovation Centre project PROMISERA, a study on early rheumatoid arthritis. Led by Athina Spiliopoulou, we developed a novel approach to imputation of ultralow coverage sequence data (Genetics 2017), implemented in GeneImp, which relies on the existence of a very large reference panel to avoid modelling recombination explicitly. This leads to much faster imputation of this type of data with minimal loss of quality.

October 2009 to February 2012: I worked on the genetic epidemiology software admixmap with Paul McKeigue. I extended it to work correctly on the X chromosome, and we used this to better understand the genetic determinants of sarcoidosis in African-American populations (Genes and Immunity 2011). Afterwards, I implemented a computationally efficient factorization that makes it possible to exploit pedigree data in hours rather than weeks (Genetic Epidemiology 2013).

Large-scale optimization and structure exploitation

My research on the theory and implementation of Interior Point Methods for linear and quadratic programming concentrated particularly on the study of search directions and warm-start approaches.

April 2007 to September 2009: I was employed on an EPSRC-funded project with Andreas Grothey, in which we continued the investigation of warm-start strategies for interior point methods in the context of stochastic programming. This led us to consider a multi-step approach, in which the number of intermediate problems can be more than one (ERGO 09-007), and a decomposition-like strategy, in which we generate and warm-start the subproblems rooted at the second-stage nodes (COAP 2013). I also designed the stochastic programming extension for the structure-conveying modelling language SML (Mathematical Programming Computation 2009).

October 2003 to May 2007: During my PhD in optimization, under the supervision of Jacek Gondzio, I improved the implementation of corrector directions in the HOPDM interior point solver (COAP 2008), before moving to work with the structure-exploiting parallel solver OOPS. I implemented an SMPS interface for OOPS, HOPDM and CPLEX (extended to LP_SOLVE and GLPK), which allows a stochastic programming problem to be solved by formulating the corresponding deterministic equivalent problem. This implementation allows a problem instance to be solved with a warm start, by first solving a problem based on a reduced scenario tree (Mathematical Programming 2011).

2003-2020 © marco