Date of Award

Spring 1-1-2013

Document Type


Degree Name

Doctor of Philosophy (PhD)



First Advisor

Barbara P. Buttenfield

Second Advisor

Stefan Leyk

Third Advisor

Elisabeth D. Root

Fourth Advisor

Seth Spielman

Fifth Advisor

André Skupin


This dissertation designed and implemented approaches for assessing the suitability of widely used unsupervised and supervised grouping methods for data types common in the geographic domain. Four types of data were indexed for organization: a full-text data set covering 30 years of cartographic literature, a raster data set of physiographic characteristics of the U.S., a suite of GIS software commands used in hydrologic analysis, and a catalog of cartographic generalization algorithms. Various clustering and classification methods from the fields of statistics and machine learning were evaluated for organizing these data types. By systematically applying every organization method to each type of indexed data, this research addresses the question of whether the choice of indexing strategy influences the effectiveness of the organization methods. Depending on the data set and the indexing method applied, some clustering and classification methods performed better than others.

The experiments in this dissertation demonstrate that systematic evaluation and validation of clustering and classification results make it possible to formulate recommendations for organizing data, based on the results of cluster and classification indices. Furthermore, systematic application and evaluation of the six clustering and classification methods make it possible to match indexing strategies to organization methods for each of the four data sets used in this dissertation.