
A corpus study of ethnic slurs and derogatory language across Reddit and YouTube with Sentiment considered

by Hasan K Autman




Institution: San Diego State University
Department:
Year: 2016
Posted: 02/05/2017
Record ID: 2064660
Full text PDF: http://hdl.handle.net/10211.3/172742


Abstract

This study created a specialized corpus to determine the frequency of an ethnic slur and animal referents for African Americans, as well as derogatory language aimed at law enforcement agents, on Reddit and YouTube, specifically when the event inspiring the language is ethnically charged and is mirrored on both platforms. Comments inspired by the deaths of unarmed suspects after interactions with law enforcement agents were chosen, and a corpus of 1,883,703 words was created using Enthought Canopy, Python Reddit API Wrapper, and YouTube Comment Scrapper. AntConc was then used to determine the frequency of the target language. Next, the online platform, the ethnicity of the deceased, the gender of the deceased, and the ethnicity of the law enforcement agent were entered as factors in an SPSS Poisson regression to test for significance. The results indicated that the online platform (YouTube), the ethnicity of the deceased (African American), and the gender of the deceased (female) were predictors of a higher frequency of the ethnic slur and animal referents for African Americans. However, the frequency of derogatory language aimed at law enforcement agents was higher when the deceased was Caucasian and the agent was Hispanic. Finally, Sentiment Analyzer determined that, despite the significant frequency of the target derogatory language, individual YouTube comment sections were classified as neutral in semantic orientation, while individual Reddit comment sections, despite having fewer instances of the target derogatory language, were deemed semantically negative.

Advisors/Committee Members: Malouf, Robert; Csomay, Eniko; Alkebulan, Adisa.