Date of Award

Spring 1-1-2011

Document Type


Degree Name

Doctor of Philosophy (PhD)


Chemistry & Biochemistry

First Advisor

Natalie G. Ahn

Second Advisor

William M. Old

Third Advisor

Xuedong Liu


Shotgun proteomics is an analytical method used to identify proteins in complex mixtures such as whole-cell lysates. The method combines proteolysis, fractionation techniques, and high-resolution mass spectrometry to maximize the number of proteins that are detected and quantified. Knowing the identity and abundance of proteins in a cell provides insight into how cells function and how they respond to external stimuli. Our lab uses proteomics to further the understanding of signaling networks and how these are dysregulated during melanoma progression.

Computer algorithms are essential to shotgun proteomics because they match hundreds of thousands of spectra to the peptide sequences from which they came. The most productive peptide identification methods search databases of protein sequences, looking for the best peptide-spectrum matches, but these methods can be plagued by false positives and false negatives. I designed and implemented MSPlus, software that increases sensitivity and specificity in peptide identification by using physicochemical filters and consensus scoring between multiple database search programs, approaches that are now in common use.
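
As an illustration of the consensus idea only, the following minimal Python sketch keeps a spectrum's peptide assignment when at least two search programs agree on it. It is not the MSPlus implementation: the function name `consensus_matches`, the `min_agree` threshold, the engine labels, and the toy peptide sequences are all assumptions made for this example, and real consensus scoring would also weigh each program's match scores.

```python
from collections import Counter

def consensus_matches(engine_results, min_agree=2):
    """Keep spectra for which at least `min_agree` engines report the same peptide.

    `engine_results` maps a (hypothetical) engine name to a dict of
    spectrum ID -> top-ranked peptide sequence.
    """
    votes = {}
    for results in engine_results.values():
        for spectrum_id, peptide in results.items():
            votes.setdefault(spectrum_id, Counter())[peptide] += 1

    consensus = {}
    for spectrum_id, counter in votes.items():
        # Most frequently reported peptide for this spectrum and its vote count.
        peptide, count = counter.most_common(1)[0]
        if count >= min_agree:
            consensus[spectrum_id] = peptide
    return consensus

# Toy data (assumed): three engines, three spectra.
engine_results = {
    "engine_a": {"s1": "PEPTIDEK", "s2": "ELVISK", "s3": "SAMPLER"},
    "engine_b": {"s1": "PEPTIDEK", "s2": "LIVESK"},
    "engine_c": {"s1": "PEPTIDEK", "s2": "ELVISK", "s3": "EXAMPLER"},
}
accepted = consensus_matches(engine_results)
print(accepted)
```

Here spectra `s1` and `s2` are accepted because at least two engines agree on a sequence, while `s3` is rejected because each engine proposes a different peptide.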

After peptides are identified, they must be mapped back to the proteins from which they derive, a non-trivial task in the human proteome with its extensive alternative splicing, gene duplication, and post-translational processing. I designed and implemented IsoformResolver, software that accurately and efficiently infers proteins from peptides using a pre-calculated, peptide-centric reformatted protein database. Proteins are reported in the context of protein groups, a concise representation that allows experimentalists to see the most likely proteins in the context of all possible proteins for which there is mass spectrometry evidence. This novel representation minimizes the protein volatility inherent to the more common protein-centric output.
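
The peptide-centric lookup and grouping described above can be pictured with a small Python sketch. This is a simplified illustration under stated assumptions, not IsoformResolver's algorithm: the helper names `build_peptide_index` and `group_proteins`, the protein labels, and the peptide sequences are hypothetical, and proteins are grouped here only when they share exactly the same set of identified peptides.

```python
from collections import defaultdict

def build_peptide_index(protein_peptides):
    """Invert protein -> peptides into a peptide -> proteins lookup (built once)."""
    index = defaultdict(set)
    for protein, peptides in protein_peptides.items():
        for peptide in peptides:
            index[peptide].add(protein)
    return index

def group_proteins(identified_peptides, index):
    """Group proteins that are supported by exactly the same identified peptides."""
    evidence = defaultdict(set)
    for peptide in identified_peptides:
        for protein in index.get(peptide, ()):
            evidence[protein].add(peptide)

    groups = defaultdict(list)
    for protein, peptides in evidence.items():
        groups[frozenset(peptides)].append(protein)
    return {peptides: sorted(proteins) for peptides, proteins in groups.items()}

# Toy in-silico digest (assumed): two isoforms of one gene plus two duplicated genes.
protein_peptides = {
    "P1_iso1": ["AAK", "BBK", "CCK"],
    "P1_iso2": ["AAK", "BBK"],
    "P2": ["DDK"],
    "P3": ["DDK"],
}
index = build_peptide_index(protein_peptides)
groups = group_proteins(["AAK", "BBK", "DDK"], index)
for peptides, proteins in groups.items():
    print(sorted(peptides), "->", proteins)
```

Because the observed peptides cannot distinguish the two isoforms, or the two duplicated genes, each pair is reported together as one group rather than as an arbitrary single protein.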

Finally, I examine the capabilities and limits of shotgun proteomics. I introduce a tier-based representation of protein abundances and investigate how the abundances vary by protein class and at different sampling depths. I compare proteomics and transcriptomics results, and investigate to what extent proteomics can be used to identify members that distinguish cell states.