Date of Award

Spring 1-1-2012

Document Type


Degree Name

Doctor of Philosophy (PhD)


Computer Science

First Advisor

Rob Knight

Second Advisor

Robin Dowell-Deen

Third Advisor

Kenneth Anderson

Fourth Advisor

Diana Nemergut

Fifth Advisor

Henry Tufo


Improvements in sequencing technologies have shifted the foundations of biology, ecology, and health. Traditionally, these sciences have dealt with small amounts of data that could be analyzed using simple methods and computational tools. Today, they are confronted with massive numbers of sequences across thousands of samples. These sequences represent the DNA of microorganisms that inhabit diverse environments, from soils and oceans to the human body. Additionally, recent studies are moving from simple snapshots to spatial and temporal datasets that track the distribution of these microbial guests. These larger studies reveal a lack of computational methods and resources, which researchers must work around to understand the intrinsic patterns in their new sequence-based studies. In this dissertation, I present new computational tools, methods, and visualizations that allow microbiologists to make sense of these massive studies, along with results concerning human health that can be obtained from microbial ecology studies. I also present a cloud computing method for combining these larger studies, which has already produced potentially important health insights into the temporal development of infants. Finally, I describe a new software tool that allows microbial ecology researchers to design and statistically power future studies based on previously published studies. These novel components not only demonstrate the future of microbial computational biology, but also show the kind of medical and ecological advances we can achieve by combining computational tools with new sequencing technologies.