Shijia Wang

Advanced Monte Carlo methods and applications. 

Monte Carlo methods have emerged as standard tools for Bayesian statistical inference in sophisticated models. Sequential Monte Carlo (SMC) and Markov chain Monte Carlo (MCMC) are the two main classes of methods for sampling from high-dimensional probability distributions. This thesis develops methodologies within these classes to address problems in different research areas.

Phylogenetic tree reconstruction is a central task in evolutionary biology. Traditional MCMC methods may suffer from the curse of dimensionality and the local-trap problem. Firstly, we introduce a new combinatorial SMC method with a novel and efficient proposal distribution. We also explore combining SMC with a Gibbs sampler to jointly estimate the phylogenetic trees and evolutionary parameters. Secondly, we propose an ``embarrassingly parallel'' method for Bayesian phylogenetic inference, annealed SMC, which builds on recent advances in the SMC literature such as adaptive determination of annealing parameters.

The next application is in genome-wide association studies. Linear mixed models (LMMs) are powerful methods for controlling confounding caused by population structure. We develop a Bayesian hierarchical model to jointly estimate the LMM parameters and the genetic similarity matrix from genetic sequences and phenotypes, and an SMC method to jointly approximate the posterior distributions of the LMM and the phylogeny.

We also consider parameter estimation for nonlinear differential equations (DEs). These equations often contain unknown parameters of scientific interest, which have to be estimated from noisy measurements of the dynamic system. We develop a fully Bayesian framework for nonlinear DE systems. A flexible nonparametric function is used to represent the dynamic process, so that expensive numerical solvers can be avoided. We derive an SMC method to sample from multi-modal DE posterior distributions.

In addition, we consider Bayesian computing problems related to importance sampling and misclassification in multinomial data. Lastly, motivated by a personalized recommender system with dynamic preference changes, we develop a new hidden Markov model (HMM) and propose an efficient online SMC algorithm for it by hybridizing SMC with the EM algorithm.