Caroline Colijn

Methods and challenges in understanding transmission from pathogen genetic data

We present a Bayesian approach to inferring transmission trees from genetic sequence data of a pathogen circulating in an outbreak. We do this through the lens of a phylogenetic tree, annotating the tree with person-to-person transmission events. We adapt the approach to use multiple input trees simultaneously and to share parameters between them. This reduces the number of parameters inferred per tree, makes use of distinct mini-outbreaks (each of which is too small to support inference on its own), and provides a way to cope with phylogenetic uncertainty. We then apply machine learning to connect demographic, clinical and epidemiological data to our predictions of who infected whom. Finally, we outline open challenges in the area which, if met, would substantially impact genomic epidemiology.