Event Credibility Prediction in Twitter

Online social media services such as Twitter are widely used by people to self-report activities and stories happening around them. Monitoring social media streams, e.g., tweets on Twitter, has become an effective way to detect real-time events and monitor emergent situations. However, social media is also increasingly exploited to spread rumors and false information, e.g., fake images during Hurricane Sandy. False rumors on social media can potentially reach millions of people in a short amount of time. Countermeasures are thus needed to keep false information from undermining the integrity and utility of social media.

Existing work relies on offline aggregation analysis, which requires a complete set of tweets related to a social event in order to extract aggregate features such as the depth of the retweet-based propagation tree. However, because collecting a complete set of tweets often causes significant delay, this approach is not suitable when false events must be detected as early as possible.
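
To illustrate why such aggregate features depend on a complete collection, the following minimal sketch computes the depth of a retweet propagation tree. The function name `propagation_depth` and the (child, parent) edge representation are assumptions made for illustration; they are not taken from any particular prior system.

```python
from collections import defaultdict


def propagation_depth(root, retweets):
    """Return the depth of the retweet tree rooted at `root`.

    `retweets` is an iterable of (child_id, parent_id) pairs, a hypothetical
    representation in which each retweet points at the tweet it reshared.
    The root alone has depth 0.
    """
    children = defaultdict(list)
    for child, parent in retweets:
        children[parent].append(child)

    depth = 0
    frontier = [root]
    seen = {root}
    while True:
        # Collect the next level of the tree, skipping already visited nodes.
        next_frontier = []
        for node in frontier:
            for child in children.get(node, ()):
                if child not in seen:
                    seen.add(child)
                    next_frontier.append(child)
        if not next_frontier:
            return depth
        depth += 1
        frontier = next_frontier
```

With only part of the retweets observed, the computed depth can be smaller than the final one, so a feature of this kind is only reliable after the full set of tweets has been collected.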

In this project, we develop a probabilistic generative model for real-time event credibility prediction from streaming tweets, together with an online streaming prediction algorithm. In contrast to offline aggregation analysis, which requires a complete set of tweets related to an event, the proposed algorithm uses only the tweets currently observed in the stream. It updates its prediction without storing or reprocessing past tweets. We conduct experiments on a dataset of tweets collected from Twitter.
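
The idea of updating a prediction without storing past tweets can be sketched as follows. This is not the project's model, which is not specified here. It is a deliberately simple stand-in: a two-class Bayesian update over a single binary per-tweet signal, assuming tweets are conditionally independent given the event's credibility. The class name, the `supports` signal, and the likelihood values are all hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class OnlineCredibilityEstimator:
    """Running estimate of P(event is credible), kept as log-odds.

    Hypothetical sketch: the likelihoods below are made-up values,
    not parameters learned from any dataset.
    """

    p_support_given_credible: float = 0.8  # assumed P(supporting tweet | credible)
    p_support_given_false: float = 0.4     # assumed P(supporting tweet | false)
    log_odds: float = 0.0                  # prior log-odds; 0.0 means P = 0.5

    def update(self, supports: bool) -> float:
        """Fold one streaming tweet into the estimate and return P(credible)."""
        if supports:
            ratio = self.p_support_given_credible / self.p_support_given_false
        else:
            ratio = (1 - self.p_support_given_credible) / (1 - self.p_support_given_false)
        # The only state carried forward is a single number.
        self.log_odds += math.log(ratio)
        return self.probability()

    def probability(self) -> float:
        """Convert the stored log-odds back to a probability."""
        return 1.0 / (1.0 + math.exp(-self.log_odds))
```

Each call to `update` consumes one tweet and discards it, so memory use stays constant regardless of how many tweets have been seen, and a prediction is available at every point in the stream.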