#### Date of Award

Spring 1-1-2013

#### Document Type

Thesis

#### Degree Name

Master of Science (MS)

#### Department

Applied Mathematics

#### First Advisor

Jem N. Corcoran

#### Second Advisor

Keith R. Molenaar

#### Third Advisor

Ray L. Littlejohn

#### Abstract

Bayesian networks are widely considered powerful tools for modeling risk assessment, uncertainty, and decision making. They have been extensively employed to develop decision support systems in a variety of domains, including medical diagnosis, risk assessment and management, human cognition, industrial process and procurement, pavement and bridge management, and system reliability. Bayesian networks are convenient graphical expressions for high-dimensional probability distributions, which are used to represent complex relationships among a large number of random variables. A Bayesian network is a directed acyclic graph consisting of nodes, which represent random variables, and arrows, which correspond to probabilistic dependencies between them. The ability to recover Bayesian network structures from data is critical to their application in modeling real-world phenomena, and much research has been devoted to identifying network structure. However, most Bayesian network learning procedures rest on one of two assumptions: (1) that the data are discrete, or (2) that the data are continuous and either follow a Gaussian distribution or are otherwise discretized before recovery. Discretization of data in the continuous non-Gaussian case is often done in an ad hoc manner that destroys the conditional relationships among variables; subsequent network recovery algorithms are then unable to retrieve the correct network. Friedman and Goldszmidt [11] suggest an approach based on the minimum description length principle that chooses a discretization which preserves the information in the original data set; however, it is difficult, if not impossible, to implement for even moderately sized networks. This thesis explores the structure of the minimum description length formulation they developed and then provides an alternative, efficient search strategy that allows one to use the Friedman and Goldszmidt approach in practice.

#### Recommended Citation

Tran, Dai Daniel, "An Efficient Search Strategy for Aggregation and Discretization of Attributes of Bayesian Networks Using Minimum Description Length" (2013). *Applied Mathematics Graduate Theses & Dissertations*. 41.

http://scholar.colorado.edu/appm_gradetds/41