Inferring Big 5 Personality from Online Social Networks

by Geetha Sitaraman




Institution: University of Washington
Year: 2015
Keywords: Big Five Personality; Correlation Feature Selection; Multivariate regression; myPersonality data; Social media; YouTube vlogger; Computer science
Record ID: 2058028
Full text PDF: http://hdl.handle.net/1773/27370


Abstract

Online social networks are very popular, with millions of people creating online profiles and sharing personal information, including their interests, activities, likes and dislikes, and thoughts, with friends and family. This rich user-generated content makes social media an ideal platform for studying human behavior. In our research, we are interested in latent variables such as users' long-term personality traits and short-term emotional states. With proper mining, user-generated content can be used to identify users' personality traits without having them fill out questionnaires. These traits have been shown to strongly influence a person's decisions, behavior, and preferences for language, music, books, etc. We explore different machine learning techniques and feature selection methodologies for inferring users' personality traits from information available in their online profiles. We study five multivariate regression algorithms and contrast them with a single-target approach for predicting the trait scores. Additionally, we explore feature subset selection using correlation-based heuristics and evaluate the quality of the resulting feature space using two machine learning algorithms: Linear Regression and Support Vector Regressors. These techniques are evaluated on two datasets: a myPersonality dataset collected from Facebook and a YouTube personality dataset collected from video posts of vloggers. All five multivariate algorithms, the single-target algorithms, and the correlation-based feature selection methods outperformed the average baseline model for all five personality traits on both datasets. Furthermore, we study the relation between the emotions expressed in approximately 1 million Facebook (FB) status updates and the users' personality, age, gender, and time of posting. We use this analysis to establish associations, for example that users with an open personality express emotions more frequently, while neurotic users are more reserved. With the ability to identify users' personality and emotions, advertisements could be tailored to a user's personality type, since personality- and/or emotion-aware interfaces are more persuasive.
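
As an illustration of what correlation-based feature subset selection can look like, the sketch below scores a feature subset with the merit heuristic commonly associated with Correlation Feature Selection (CFS), merit = k * r_cf / sqrt(k + k(k-1) * r_ff), where r_cf is the mean absolute feature-target correlation and r_ff the mean absolute feature-feature correlation. The abstract does not state which heuristic or search strategy was used, so the choice of Pearson correlation, the greedy forward search, the function names, and the toy data are all assumptions made for illustration only.

```python
from itertools import combinations
from math import sqrt
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def cfs_merit(columns, target, subset):
    """CFS-style merit of a subset: high target correlation, low redundancy."""
    k = len(subset)
    if k == 0:
        return 0.0
    r_cf = mean(abs(pearson(columns[j], target)) for j in subset)
    pairs = list(combinations(subset, 2))
    r_ff = mean(abs(pearson(columns[a], columns[b])) for a, b in pairs) if pairs else 0.0
    return k * r_cf / sqrt(k + k * (k - 1) * r_ff)


def greedy_cfs(columns, target):
    """Forward search (an assumed strategy): add the feature that most
    improves the merit, and stop as soon as no feature improves it."""
    selected, best = [], 0.0
    remaining = set(range(len(columns)))
    while remaining:
        merit, j = max(
            (cfs_merit(columns, target, selected + [j]), j) for j in sorted(remaining)
        )
        if merit <= best:
            break
        selected.append(j)
        remaining.remove(j)
        best = merit
    return selected, best


# Hypothetical toy data: one trait score per user and three profile features.
target = [1, 2, 3, 4, 5, 6]
columns = [
    [1.1, 1.9, 3.2, 3.8, 5.1, 6.0],  # strongly related to the target
    [1.0, 2.5, 2.5, 4.5, 4.5, 6.5],  # related, but largely redundant with the first
    [3, 1, 2, 3, 1, 2],              # weakly related
]

selected, best = greedy_cfs(columns, target)
print("selected feature indices:", selected, "merit:", round(best, 4))
```

The denominator penalizes subsets whose features correlate with each other, so a feature that merely duplicates information already in the subset tends to lower the merit even when it correlates well with the trait score. The selected columns could then be passed to any regressor, such as the Linear Regression or Support Vector Regressors named in the abstract.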