Date of Award
Doctor of Philosophy (PhD)
Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 License.
This is a thesis about how to characterize the statistical structure of the tails of degree distributions of real-world networks. The primary contribution is a statistical test of the prevalence of scale-free structure in real-world networks. A central claim in modern network science is that real-world networks are typically "scale free," meaning that the fraction of nodes with degree k follows a power law, decaying like k-a, often with 2 < a< 3. However, empirical evidence for this belief derives from a relatively small number of real-world networks. In the first section, we test the universality of scale-free structure by applying state-of-the-art statistical tools to a large corpus of nearly 1000 network data sets drawn from social, biological, technological, and informational sources. We fit the power-law model to each degree distribution, test its statistical plausibility, and compare it via a likelihood ratio test to alternative, non-scale-free models, e.g., the log-normal. Across domains, we find that scale-free networks are rare, with only 4% exhibiting the strongest-possible evidence of scale-free structure and 52% exhibiting the weakest-possible evidence. Furthermore, evidence of scale-free structure is not uniformly distributed across sources: social networks are at best weakly scale free, while a handful of technological and biological networks can be called strongly scale free. These results undermine the universality of scale-free networks and reveal that real-world networks exhibit a rich structural diversity that will likely require new ideas and mechanisms to explain. A core methodological component of addressing the ubiquity of scale-free structure in real-world networks is an ability to fit a power law to the degree distribution. In the second section, we numerically evaluate and compare, using both synthetic data with known structure and real-world data with unknown structure, two statistically principled methods for estimating the tail parameters for power-law distributions, showing that in practice, a method based on extreme value theory and a sophisticated bootstrap and the more commonly used method based an empirical minimization approach exhibit similar accuracy.
Broido, Anna, "Characterizing the tails of degree distributions in real-world networks" (2019). Applied Mathematics Graduate Theses & Dissertations. 143.