Spring 2019

Dylan Taatjes

Robert Parson

Robin Dowell


Transcription is a key biological process in which RNA is synthesized from a DNA template. A range of proteins is involved in this process, and understanding the role that each plays leads to a better understanding of transcription as a whole. Nascent sequencing is a technique for examining RNA that is being actively transcribed by RNA Polymerase II (PolII) in cells. It is a powerful technique, but it produces a tremendous amount of data that can be difficult to summarize effectively. This research focused on optimizing existing computational techniques for examining this large volume of data. Methods for Pause Index analysis, Metagene analysis, and Differential Expression analysis were examined and optimized using a novel dataset in which the TAF1 subunit of the TFIID complex was removed from cells. The computational optimizations were successful: they yielded significant speedups in computation and resolved issues inherent in the existing tools for the analyses considered.

