Date of Award

Spring 1-1-2013

Document Type

Dissertation

Degree Name

Doctor of Philosophy (PhD)

Department

Electrical, Computer & Energy Engineering

First Advisor

Dirk Grunwald

Second Advisor

Fabio Somenzi

Abstract

Online social networks (OSNs) enable real-time discussion of events. Due to the word-of-mouth effect, popular events spread exponentially within a short period of time. With highly active public engagement, new events are self-reported and discussed live. Compared with traditional news event detection and tracking, the huge volume of data, unstructured content, and variety of information in OSNs pose both opportunities and challenges for event analysis. This thesis makes key contributions in the following three areas.

Event context identification helps answer the question of who is interested in an event. It enables applications such as user participation prediction, relevant event recommendation, and friendship recommendation. We incorporate anchor information into the traditional probabilistic matrix factorization framework to identify the group of users who are interested in a given event. Our evaluation, based on one month of data covering 461 events and 1.1 million users, shows that our approach outperforms existing approaches by at least 20%.

Location inference addresses the lack of location information in event analysis. It helps to understand where an event is being discussed. We predict locations separately from textual and structural information, and then use a learning-to-rank algorithm to fuse the results. Evaluation on three months of data covering 0.82 million users, 16.4 million messages, and 11.5 million friendships shows a 25% reduction in average error and a 66% reduction in median error over existing work.

Event modeling provides a solution for understanding the structure of an event. We first build a hierarchical and incremental model for each event, and then identify the causal relationships within the event structure. Our evaluation on 3.5 million messages over a 5-month period demonstrates the effectiveness and efficiency of our approach.
