Computational personality recognition in social media

Research paper by Golnoosh Farnadi, Geetha Sitaraman, Shanu Sushmita, Fabio Celli, Michal Kosinski, David Stillwell, Sergio Davalos, Marie-Francine Moens, Martine De Cock

Indexed on: 12 Aug '16
Published on: 05 Feb '16
Published in: User Modeling and User-Adapted Interaction



Abstract

A variety of approaches have been recently proposed to automatically infer users’ personality from their user generated content in social media. Approaches differ in terms of the machine learning algorithms and the feature sets used, type of utilized footprint, and the social media environment used to collect the data. In this paper, we perform a comparative analysis of state-of-the-art computational personality recognition methods on a varied set of social media ground truth data from Facebook, Twitter and YouTube. We answer three questions: (1) Should personality prediction be treated as a multi-label prediction task (i.e., all personality traits of a given user are predicted at once), or should each trait be identified separately? (2) Which predictive features work well across different on-line environments? and (3) What is the decay in accuracy when porting models trained in one social media environment to another?