Date of Award

Spring 1-1-2017

Document Type


Degree Name

Doctor of Philosophy (PhD)

First Advisor

Aaron Clauset

Second Advisor

Brian Keegan

Third Advisor

Jordan Boyd-Graber

Fourth Advisor

Kenneth Anderson

Fifth Advisor

Rafael Frongillo


Science exhibits many forms of imbalance, ranging from disparities in the representation of certain demographics in the scientific workforce to enormous variation in the quantity and quality of contributions that individuals make to the scientific literature. In this thesis, we investigate the drivers of such imbalances, seeking a greater understanding of both the factors that facilitate success in science and its potential sources of inequality or discrimination. Advances in either direction would inform policy decisions that aim to support scientific discoveries and the scientists who make them. Progress in these directions, however, is typically impeded by the complex nature of the processes that govern who works in science, where they work, and how productive they are. Specifically, interdependencies among these processes complicate any analysis attempting to isolate and quantify particular effects. Here, we use techniques from statistical modeling, machine learning, and causal inference to directly address these sources of complexity and explore the underrepresentation of women in science, the role of productivity in faculty hiring and retention, and how institutional prestige affects researchers' success.

Computer science in many ways represents an ideal case study for investigating sources of imbalance in academia. Throughout the field's history, women have been dramatically underrepresented, despite increasing participation in recent years. Research in computer science is also remarkably diverse, with varying scholastic traditions and rates of publication, and is exceptionally well documented, offering rich sources of data to investigate the drivers that sustain the field's gender imbalance and disparities in research output. Our work therefore focuses on computer science in particular; however, our findings have broad implications for the scientific community as a whole.