Probabilistic analysis of gene family evolution -- gene duplications and sequence evolution
Stockholm Bioinformatics Center Seminars
Wednesday 01 October 2008
to 17:00 at
Lars Arvestad (SBC/CSC)
Probabilistic and Bayesian methods have gained popularity in phylogenetics in recent years. We present a probabilistic gene evolution model, PrIME-GSR, based on a birth-death process in which a gene tree evolves "inside" a species tree. The model is the basis for MCMC-based algorithms for probabilistic approaches to orthology analysis, tree reconciliation studies, and gene tree inference.
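As a rough illustration of the birth-death component alone, the sketch below simulates how many copies of a gene survive along a single species-tree edge when each copy duplicates and is lost at constant rates. It is not the PrIME-GSR implementation: the function name, parameters, and rate values are hypothetical, and the sketch omits the sequence-evolution and relaxed-clock parts of the model.

```python
import random


def simulate_gene_copies(birth_rate, death_rate, edge_length, rng, start_copies=1):
    """Gillespie simulation of a linear birth-death process on one edge.

    Hypothetical helper for illustration only. Each gene copy duplicates
    at `birth_rate` and is lost at `death_rate` (events per unit time).
    Returns the number of copies present at the end of the edge.
    """
    copies = start_copies
    elapsed = 0.0
    while copies > 0:
        total_rate = copies * (birth_rate + death_rate)
        if total_rate == 0.0:
            break  # no events can occur
        # Waiting time to the next event is exponentially distributed.
        elapsed += rng.expovariate(total_rate)
        if elapsed >= edge_length:
            break  # the next event would fall past the end of the edge
        # Choose duplication or loss in proportion to the two rates.
        if rng.random() < birth_rate / (birth_rate + death_rate):
            copies += 1
        else:
            copies -= 1
    return copies


# Example: copy counts at the end of an edge for a few replicates
# (rates and edge length are made-up values).
rng = random.Random(2008)
counts = [simulate_gene_copies(0.4, 0.3, 1.5, rng) for _ in range(5)]
print(counts)
```

In a full gene-tree-in-species-tree simulation, the copies surviving at the end of an edge would each continue independently along both child edges of the species tree at a speciation.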
We believe this progress represents the "next generation" of phylogenetic analysis. It allows us to pose the question: what is the most probable gene tree explaining a set of sequences and respecting a known species tree? To date, model development in phylogenetics has concentrated on sequence evolution, leaving other types of data to be analyzed later in separate steps. We argue that joint analysis of data is desirable, and that a model integrating a species tree into the phylogenetic analysis is an important step forward.
Based on our model, we have implemented a Bayesian analysis tool. Our implementation is sound and we demonstrate its utility for genome-wide gene-family analysis by applying it to recently presented yeast data. We validate PrIME-GSR by comparing its results with previous analyses of this data that take advantage of gene order information. The results demonstrate the value of a relaxed molecular clock and also suggest that synteny prediction can mislead gene tree estimation.