Mixture models for spatio-temporal multi-state processes

Farouk Nathoo successfully defended his Ph.D. thesis entitled "Mixture models for spatio-temporal multi-state processes." Consider pine weevils infesting a forest. After an initial attack, the infestation spreads to neighboring trees, and then farther and farther to other trees. Yet some trees appear to be resistant to attack. Can the spread of the infestation and the "immune" trees be identified? Surprisingly, this is similar to the problem of patients who are treated in hospitals and become infected after their stay. Patients who were treated in the same hospital a short time apart are more likely to have the same subsequent infection histories. Nathoo's thesis examines ways to study these types of phenomena.

Studies of recurring infection or chronic disease often collect longitudinal data on the disease status of subjects. Multi-state transitional models are commonly used to describe how such longitudinal data develop. In this setting, we model a stochastic process that at any point in time occupies one of a discrete set of states, and interest centers on the transitions between states. For example, states may refer to the number of recurrences of an event or the stage of a disease.
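
As a minimal illustration of the multi-state idea (not the models developed in the thesis), the sketch below estimates first-order transition probabilities from discrete-time state histories by counting observed transitions. The function name, the integer state coding, and the toy histories are assumptions made for this example only.

```python
from collections import Counter


def estimate_transition_matrix(sequences, n_states):
    """Estimate first-order transition probabilities by counting transitions.

    Each sequence is one subject's history of states coded 0..n_states-1,
    observed at equally spaced times (an assumption of this sketch).
    """
    counts = Counter()
    for seq in sequences:
        for prev, curr in zip(seq, seq[1:]):
            counts[(prev, curr)] += 1

    matrix = []
    for a in range(n_states):
        row_total = sum(counts[(a, b)] for b in range(n_states))
        if row_total == 0:
            # No transitions out of state a were observed in the data.
            matrix.append([0.0] * n_states)
        else:
            matrix.append([counts[(a, b)] / row_total for b in range(n_states)])
    return matrix


# Hypothetical histories: 0 = disease-free, 1 = diseased.
histories = [[0, 0, 1, 1], [0, 1, 1, 1], [0, 0, 0, 1]]
P = estimate_transition_matrix(histories, 2)
print(P)
```

Row a of the result holds the estimated probabilities of moving from state a to each state in one time step. This simple count-based estimate ignores covariates, random effects, and spatial correlation.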

Geographic referencing of data collected in longitudinal studies is increasingly common as scientific databases are linked with geographic information systems (GIS). This has created a need for statistical methods that address the resulting spatial-longitudinal structure of the data. In this thesis, we develop hierarchical mixed multi-state models for the analysis of such longitudinal data when the processes corresponding to different subjects may be correlated spatially over a region. Methodological developments have been strongly driven by studies in forestry and spatial epidemiology.

Motivated by an application in forest ecology studying pine weevil infestations, the second chapter develops methods for handling mixtures of populations for spatial discrete-time two-state processes. The two-state discrete-time transitional model, often used for studying chronic conditions in human populations, is extended to settings where subjects are spatially arranged. A mixed spatially correlated mover-stayer model is developed. Here, clustering of infection is modeled by a spatially correlated random effect reflecting the density or closeness of the individuals under study. Analysis is carried out using maximum likelihood with a Monte Carlo EM algorithm for implementation and also using a fully Bayesian analysis.
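
To make the mover-stayer idea concrete, here is a deliberately simplified, non-spatial sketch. It assumes every subject starts uninfected, that infection is absorbing, and that a "mover" becomes infected in each period with a constant probability. The spatially correlated random effect described above is omitted, and all names and parameter values are hypothetical.

```python
def mover_likelihood(history, p_infect):
    """Likelihood of a 0/1 history for a mover (0 = uninfected, 1 = infected).

    Infection is treated as absorbing, an assumption of this sketch.
    """
    lik = 1.0
    for prev, curr in zip(history, history[1:]):
        if prev == 0:
            lik *= p_infect if curr == 1 else 1.0 - p_infect
        elif curr == 0:
            return 0.0  # recovery is impossible under the absorbing assumption
    return lik


def stayer_posterior(history, stayer_prob, p_infect):
    """Posterior probability that a subject is a stayer, given its history.

    A stayer never becomes infected, so any observed infection rules it out.
    """
    if any(state == 1 for state in history):
        return 0.0
    mover_part = (1.0 - stayer_prob) * mover_likelihood(history, p_infect)
    total = stayer_prob + mover_part
    return stayer_prob / total if total > 0 else 0.0


# Hypothetical values: 20% stayers, movers infected with probability 0.5 per period.
print(stayer_posterior([0, 0, 0], stayer_prob=0.2, p_infect=0.5))
print(stayer_posterior([0, 1, 1], stayer_prob=0.2, p_infect=0.5))
```

The longer a subject remains uninfected, the more the posterior weight shifts toward the stayer class, which is the intuition behind identifying apparently resistant trees.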

The third chapter presents continuous-time spatial multi-state models. Here, joint modelling of both the spatial correlation and the correlation between different transition rates is required, and a multivariate spatial approach is employed. A proportional intensities frailty model is developed in which baseline intensity functions are modeled using both parametric Weibull forms and flexible representations based on cubic B-splines. The methodology is applied to a study of revascularization intervention in Quebec examining readmission and mortality rates over a four-year period.
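
The general shape of a proportional intensities model with a Weibull baseline can be sketched as follows. This is a generic textbook form under assumed notation, not the thesis's exact specification: the baseline is scale * shape * t^(shape - 1), and covariates and a region-level frailty act multiplicatively through an exponential link. The parameter names and values are hypothetical.

```python
import math


def weibull_baseline(t, shape, scale):
    """Weibull baseline intensity: scale * shape * t**(shape - 1), for t > 0."""
    return scale * shape * t ** (shape - 1)


def transition_intensity(t, shape, scale, covariates, coefs, frailty):
    """Proportional intensity: baseline(t) * exp(x'beta + frailty).

    `frailty` stands in for a random effect shared by subjects in one region.
    """
    linear = sum(x * b for x, b in zip(covariates, coefs)) + frailty
    return weibull_baseline(t, shape, scale) * math.exp(linear)


# Hypothetical comparison of two regions that differ only in their frailty.
low = transition_intensity(2.0, 1.5, 0.1, [1.0], [0.3], frailty=-0.2)
high = transition_intensity(2.0, 1.5, 0.1, [1.0], [0.3], frailty=0.4)
print(high / low)  # equals exp(0.6) at every time t
```

Because the frailty enters multiplicatively, the ratio of intensities between two regions does not depend on time, which is what "proportional" refers to.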

In the fourth and final chapter, we return to the two-state discrete-time setting. An extension of the mixed mover-stayer model is motivated and developed within the Bayesian framework. A multivariate conditional autoregressive (MCAR) model is incorporated, providing flexible joint correlation structures. Also developed is a test for the number of mixture components, quantifying the existence of a hidden subgroup of "stayers" within the population. Posterior summarization is based on a Metropolis-Hastings sampler, and methods for assessing model goodness of fit are based on posterior predictive comparisons.

This type of interdisciplinary work is a hallmark of our program in Applied Statistics at Simon Fraser University. For more information, please contact Farouk Nathoo (nathoo@stat.sfu.ca) or his supervisor Charmaine Dean (dean@stat.sfu.ca), Department of Statistics and Actuarial Science.