Date of Award

Spring 1-1-2011

Document Type

Thesis

Degree Name

Master of Science (MS)

Department

Computer Science

First Advisor

Leysia Palen

Second Advisor

Kenneth Anderson

Third Advisor

Gloria Mark

Abstract

Social media provide a democratized platform for expressing one’s opinion or viewpoint. The everyday discussions that people have with their families, friends, and colleagues have become available through blogging services. The emergence of blogging has made classic ethnographic approaches more difficult to deploy, and these methods become even more problematic when the available data span a wide variety of languages. This thesis proposes topic modeling as a method for quantitatively analyzing crawled blogs that were created by Iraqi citizens and were active during the 8 years since the beginning of the Iraq war in 2003. It presents how the data were collected, how the dominant languages were separated into different datasets, the limits of classification and clustering techniques, the benefits of topic modeling, and an evaluation of this technique.
