
Visual analytics of social media for situation awareness

by Dennis Thom




Institution: University of Stuttgart
Department: Fakultät Informatik, Elektrotechnik und Informationstechnik
Degree: PhD
Year: 2015
Record ID: 1101792
Full text PDF: http://elib.uni-stuttgart.de/opus/volltexte/2015/10002/


Abstract

With the emergence of social media services and other user-centered web platforms, the nature of the modern internet has changed substantially. While it has long been a vast source of information and news on all kinds of topics, it has recently grown into a continuous stream of knowledge, observations, thoughts, and situation reports, provided in real time by millions of people from all over the world. This change offers completely new possibilities for domains that rely on good situation awareness, such as disaster management, emergency response, disease control, and several forms of command and control environments. Analysts can find eyewitness videos of ongoing critical events on YouTube, observe the movement and communication behavior of Facebook users during evacuation measures, and trace the spread of an epidemic disease just by highlighting symptom-related keyword usage on Twitter. However, the data sizes that need to be processed in order to identify relevant entries, produce comprehensible overviews, and detect anomalous patterns pose one of the most challenging analytics problems of our time. The volume of data generated on a daily basis is larger than any single database from the pre-internet era. Moreover, the data is streamed in real time at substantial velocity; it comes in great variety, including text snippets, images, videos, and network information; and it contains inaccuracies, misleading information, rumors, and fake metadata, leading to uncertain veracity. In contrast to most other computer science challenges, social media analytics thus exhibits all of the characteristics commonly referred to as the "four V's" of big data: volume, velocity, variety, and veracity. The emerging field of visual analytics has been devised to tackle these challenges by tightly integrating approaches from data mining, information retrieval, natural language processing, human-computer interaction, and data visualization.
As a descendant of the more general field of information visualization, visual analytics strives to merge the strengths of highly interactive visual interfaces with the computational power of automatic statistical algorithms. The goal of this combination is to advance problem solving in areas where a human analyst alone would be overwhelmed by the data volumes while, at the same time, sheer processing power alone would not enable analysts to identify underlying patterns and relate information to semantic knowledge. This thesis identifies four visual analytics requirements that have to be addressed to allow comprehensive situation awareness based on social media: access to data, visualization of context, coping with semantic complexity, and scalable processing. Based on core ideas of visual analytics, this work contributes three distinct techniques that address access, context, and complexity, as well as a prototypical implementation that integrates all of them and allows scalable processing of the data. Means of iterative query optimization and…