
Summarization And Sentiment Analysis For Understanding Socially-Generated Content

by Lu Wang




Institution: Cornell University
Department:
Year: 2016
Keywords: Natural Language Processing; Summarization; Sentiment Analysis
Posted: 02/05/2017
Record ID: 2072728
Full text PDF: http://hdl.handle.net/1813/43671


Abstract

Over the past decades, we have witnessed the emergence of significant amounts of socially-generated content, enabled by the widespread use of the Internet and especially of social media websites. Extracting useful information and learning from this content efficiently and effectively has become a challenging task. Progress has been made in natural language processing to help users understand and absorb knowledge from large volumes of text documents. This dissertation proposes broadly applicable natural language processing techniques that extract key information from massive amounts of heterogeneous textual data in response to users' information queries and present it in a comprehensible way. Concretely, novel automatic summarization approaches are proposed to generate concise and informative responses from large amounts of text to address users' requests. We study textual data ranging from eloquent news articles written by professionals in traditional media, to massive user-generated content in popular social media, to spontaneous conversations containing disfluencies and interruptions. Furthermore, sentiment analysis methods are presented for studying social interactions in online discussions. We aim to discover useful knowledge from informal text and thus obtain a deeper understanding of socially-generated content.

Advisors/Committee Members: Turnbull, Bruce William (committeeMember), Gehrke, Johannes E. (committeeMember).