Date of Award

Spring 1-1-2011

Document Type

Thesis

Degree Name

Master of Science (MS)

Department

Computer Science

First Advisor

Roger A. King

Second Advisor

Ruth R. Dameron

Third Advisor

Willem A. Schreuder

Abstract

Many legacy relational databases are hidden behind business layers that contain the semantic information describing the data in the database's tables. Since the creation of the Semantic Web, some databases have been exposed using this technology, but at a cost. The process of exposing a database to the Semantic Web has not taken off because manually mapping the database to an ontology is impractical at large scale, is time intensive, and requires an ontologist and/or domain expert to create the domain ontology. Many applications and approaches have been presented over the years to help expose these legacy databases to the Semantic Web. None of these solutions has become widely accepted because they translate all the data to Resource Description Framework (RDF). This does not work for legacy databases, since other systems are still interacting with that data. In addition, systems that translate the data from a legacy database to RDF triples do not scale to large databases because a statement, or RDF triple, is made for every cell within every table. Thus, the amount of information generated from a legacy system holding terabytes of data grows too large to be stored in a triple store. Other systems generate an ontology that is only a basic representation of the schema and lacks any type of hierarchy or semantic meaning.

This thesis proposes an architecture that semi-automatically extracts a meaningful ontology in a timely manner, scales to large databases, and exposes the database as a virtual RDF graph by mapping the extracted domain ontology to the database. This is accomplished with mapping rules that evaluate the schema along with the data within the database and draw on existing knowledge bases, such as DBpedia, to find ontology classes that match the structure and data of the database. This hybrid approach to ontology extraction and to generating a mapping between the database and the extracted ontology does not require an ontologist, manual mapping, or time-intensive work. In addition, the approach can be applied at a larger scale.
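To illustrate what a virtual RDF graph means in practice, the following is a minimal sketch and not the architecture described in the thesis. It assumes a hand-written mapping from one table to an ontology class and from one column to a property, and it produces triples on demand from a query instead of materializing one triple per cell in a triple store. The table `person`, the `MAPPING` dictionary, the `http://example.org/` base URI, and the function names are all hypothetical; in the proposed approach the mapping would be derived semi-automatically rather than written by hand.

```python
import sqlite3
from typing import Iterator, Tuple

Triple = Tuple[str, str, str]

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
BASE = "http://example.org/"  # hypothetical base URI for row identifiers

# Hypothetical, hand-written mapping: table -> ontology class,
# column -> property. Assumed here only for illustration.
MAPPING = {
    "person": {
        "class": "http://dbpedia.org/ontology/Person",
        "key": "id",
        "columns": {"name": "http://xmlns.com/foaf/0.1/name"},
    }
}


def virtual_triples(conn: sqlite3.Connection, table: str) -> Iterator[Triple]:
    """Yield triples for a mapped table lazily, leaving the data in place."""
    rule = MAPPING[table]
    cols = [rule["key"], *rule["columns"]]
    query = f"SELECT {', '.join(cols)} FROM {table}"
    for row in conn.execute(query):
        subject = f"{BASE}{table}/{row[0]}"
        # One type triple per row, taken from the mapped ontology class.
        yield (subject, RDF_TYPE, rule["class"])
        # One triple per mapped, non-null column value.
        for prop, value in zip(rule["columns"].values(), row[1:]):
            if value is not None:
                yield (subject, prop, str(value))


def demo() -> list:
    """Build a tiny in-memory database and list its virtual triples."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute("INSERT INTO person (name) VALUES ('Ada')")
    return list(virtual_triples(conn, "person"))
```

Because the triples are generated from the relational rows at query time, other systems can keep reading and writing the legacy database, and only the mapped columns contribute statements.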
