Date of Award

Spring 1-1-2017

Document Type


Degree Name

Doctor of Philosophy (PhD)

First Advisor

David D. Pollock

Second Advisor

David W. Stock

Third Advisor

Katerina Kechris

Fourth Advisor

Nolan C. Kane

Fifth Advisor

Andrew Martin


Transposable elements (TEs) are a large component of many eukaryotic genomes, and the evolution of TEs is closely connected to that of their hosts. Accurate inference of TE evolutionary relationships is essential to understanding the biology and evolution of TE families and the role they play in genome evolution. Additionally, the sheer abundance of TEs makes them a useful model system for studying genomic processes such as mutation and recombination, and that utility likewise depends on accurate evolutionary inference.

In this dissertation, I describe novel computational methods for evolutionary inference in TEs, applying them primarily to the Alu family of primate retroelements. A major task in TE evolutionary study is the classification of elements of a family into subfamilies. I developed the AnTE algorithm, a Bayesian approach to subfamily classification that, in contrast to previous deterministic methods, allows for probabilistic subfamily classification, an important advance given the high uncertainty inherent in subfamily assignment. I use AnTE to provide a more complete picture of the evolutionary history of Alu elements than previous analyses have provided, especially regarding the role of gene conversion. This work suggests that the current Alu subfamily classifications found in widely used databases such as RepeatMasker and RepBase provide a misleading account of Alu evolutionary relationships.
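
To illustrate the difference between deterministic and probabilistic subfamily classification, the following is a minimal sketch only. It is not the AnTE algorithm. It assumes a toy model in which every site of an element independently matches its subfamily consensus with probability 1 - mu; the function name `subfamily_posteriors`, the subfamily labels `Sub1` and `Sub2`, the sequences, and the value of `mu` are all hypothetical.

```python
import math


def subfamily_posteriors(element, consensuses, priors, mu=0.05):
    """Posterior probability of each subfamily for one aligned element.

    Toy assumption (not from the dissertation): each site independently
    matches the consensus with probability 1 - mu, and otherwise is one of
    the three other bases with probability mu / 3 each.
    """
    log_post = {}
    for name, consensus in consensuses.items():
        if len(consensus) != len(element):
            raise ValueError("sequences must be aligned to the same length")
        mismatches = sum(a != b for a, b in zip(element, consensus))
        matches = len(consensus) - mismatches
        log_like = matches * math.log(1 - mu) + mismatches * math.log(mu / 3)
        log_post[name] = math.log(priors[name]) + log_like

    # Normalize in log space to avoid underflow on long sequences.
    peak = max(log_post.values())
    total = sum(math.exp(v - peak) for v in log_post.values())
    return {name: math.exp(v - peak) / total for name, v in log_post.items()}


# Hypothetical consensus sequences differing at the last two sites.
consensuses = {"Sub1": "ACGTACGTAC", "Sub2": "ACGTACGTTT"}
priors = {"Sub1": 0.5, "Sub2": 0.5}

# This element carries one diagnostic base from each subfamily.
ambiguous = subfamily_posteriors("ACGTACGTAT", consensuses, priors)

# This element matches the Sub1 consensus exactly.
clear = subfamily_posteriors("ACGTACGTAC", consensuses, priors)
```

In this sketch the first element is one mismatch away from both consensuses, so the two posteriors are equal, whereas a deterministic nearest-consensus rule would have to pick one subfamily arbitrarily. The second element is assigned to `Sub1` with high posterior probability.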

Building on the AnTE research, I developed a Bayesian phylogenetics approach to the detection and characterization of gene conversion events among TEs in a genome. I use this approach to identify a burst of interlocus gene conversion among Alu elements in the gorilla genome, occurring at much higher rates than on any other branch of the great ape phylogeny. The abnormally high Alu gene conversion rates in gorilla appear to be driven by the binding of PRDM9, a rapidly evolving protein that targets DNA sequence motifs for double-strand breaks during meiosis, to Alu elements. These findings indicate one evolutionary pathway for rapid gene conversion in a TE family, and the conversion events identified provide a rich dataset for understanding the dynamics of gene conversion in primates.