PhD student, Columbia University
The goal of this project is to automatically detect deception using spoken cues. We are interested in exploring the factors that play a role in deception and deception detection, such as culture, gender, and personality. To that end, we have collected a large corpus of deceptive and non-deceptive speech, comprising conversations between adult native speakers of American English and of Mandarin Chinese. We are applying machine learning techniques to automatically identify deceptive statements, and exploring individual differences between cultures, genders, and personalities in deceptive behavior.
Abstract: Automatic fake news detection is a challenging problem in deception detection, and it has tremendous real-world political and social impacts. However, statistical approaches to combating fake news have been dramatically limited by the lack of labeled benchmark datasets. In this paper, we present LIAR: a new, publicly available dataset for fake news detection. We collected 12.8K manually labeled short statements, spanning a decade and various contexts, from PolitiFact.com, which provides a detailed analysis report and links to source documents for each case. This dataset can be used for fact-checking research as well. Notably, this new dataset is an order of magnitude larger than the previously largest public fake news datasets of similar type. Empirically, we investigate automatic fake news detection based on surface-level linguistic patterns. We have designed a novel, hybrid convolutional neural network to integrate meta-data with text. We show that this hybrid approach can improve a text-only deep learning model.
Pub.: 01 May '17, Pinned: 31 Jul '17
Abstract: Functional magnetic resonance imaging (fMRI) is a technology used to detect brain activity. Patterns of brain activation have been utilized as biomarkers for various neuropsychiatric applications. Detecting deception based on the pattern of brain activation characterized with fMRI is gaining attention, with machine learning algorithms being applied to this field in recent years. The high dimensionality of fMRI data makes it difficult to use the original data directly as input for classification algorithms in detecting deception. In this paper, we investigated the procedures of feature selection to enhance fMRI-based deception detection. We used the t-statistic map derived from the statistical parametric mapping analysis of fMRI signals to construct features that reflect brain activation patterns. We subsequently investigated various feature selection methods, including an ensemble method, to identify discriminative features to detect deception. Using 124 features selected from a set of 65,166 original features as inputs for a support vector machine classifier, our results indicate that feature selection significantly enhanced the classification accuracy of the support vector machine in comparison to models trained using all features and to models based on dimension reduction. Furthermore, the selected features are shown to form anatomic clusters within brain regions, which supports the hypothesis that specific brain regions may play a role during deception processes. Feature selection not only enhances classification accuracy in fMRI-based deception detection but also provides support for the biological hypothesis that brain activities in certain regions of the brain are important for discrimination of deception.
Pub.: 26 Sep '09, Pinned: 31 Jul '17
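The general idea behind the abstract above (rank a very large feature set by how well each feature separates the two classes, then keep only a small subset for the classifier) can be illustrated with a minimal sketch. This is not the authors' pipeline: it assumes a plain Welch two-sample t-statistic as the ranking score, uses synthetic toy data, and the names welch_t and select_top_k are hypothetical helpers introduced only for illustration.

```python
import math
import random
from statistics import mean, variance


def welch_t(a, b):
    """Welch's two-sample t-statistic for two lists of floats."""
    va, vb = variance(a), variance(b)
    denom = math.sqrt(va / len(a) + vb / len(b))
    return 0.0 if denom == 0 else (mean(a) - mean(b)) / denom


def select_top_k(X, y, k):
    """Return the sorted indices of the k features with the largest |t|."""
    n_features = len(X[0])
    scores = []
    for j in range(n_features):
        pos = [row[j] for row, label in zip(X, y) if label == 1]
        neg = [row[j] for row, label in zip(X, y) if label == 0]
        scores.append((abs(welch_t(pos, neg)), j))
    scores.sort(reverse=True)
    return sorted(j for _, j in scores[:k])


# Synthetic toy data (an assumption, not fMRI data): 40 samples, 200 features,
# of which three are shifted upward for class 1 and the rest are pure noise.
rng = random.Random(0)
n_samples, n_features = 40, 200
informative = {3, 57, 120}
y = [i % 2 for i in range(n_samples)]
X = [
    [rng.gauss(4.0 if (j in informative and label == 1) else 0.0, 1.0)
     for j in range(n_features)]
    for label in y
]

selected = select_top_k(X, y, k=3)
print(selected)
```

When a filter like this feeds a classifier, the selection step should be repeated inside each cross-validation fold; selecting features on the full dataset first leaks label information into the held-out data and inflates the reported accuracy.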
Abstract: This study is a successful proof of concept of using automated text analysis to accurately classify transcribed 911 homicide calls according to their veracity. Fifty matched, caller-side transcripts were labeled as truthful or deceptive based on the subsequent adjudication of the cases. We mined the transcripts and analyzed a set of linguistic features supported by deception theories. Our results suggest that truthful callers display more negative emotion and anxiety and provide more details for emergency workers to respond to the call. On the other hand, deceivers attempt to suppress verbal responses by using more negation and assent words. Using these features as input variables, we trained and tested several machine-learning classification algorithms and compared the results with the output from a statistical classification technique, discriminant analysis. The overall performance of the classification techniques was as high as 84% for the cross-validated set. The promising results of this study illustrate the potential of using automated linguistic analyses in crime investigations.
Pub.: 12 Aug '14, Pinned: 31 Jul '17
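The kind of linguistic feature described in the abstract above (relative use of negation and assent words) can be sketched as simple category rates over a transcript. The word lists, the tokenizer, and the sample sentence below are all assumptions made for illustration; they are not the lexicon or data used in the study, which does not list them here.

```python
import re

# Hypothetical mini-lexicons; a real analysis would use a validated dictionary.
NEGATION = {"no", "not", "never", "none", "nothing",
            "didn't", "don't", "can't", "won't"}
ASSENT = {"yes", "yeah", "ok", "okay", "yep", "uh-huh"}


def tokenize(text):
    """Lowercase the text and split it into words, keeping contractions whole."""
    return re.findall(r"[a-z]+(?:['-][a-z]+)*", text.lower())


def category_rates(text):
    """Return the token count and the share of negation and assent tokens."""
    tokens = tokenize(text)
    total = len(tokens) or 1  # avoid division by zero on empty input
    return {
        "n_tokens": len(tokens),
        "negation_rate": sum(t in NEGATION for t in tokens) / total,
        "assent_rate": sum(t in ASSENT for t in tokens) / total,
    }


# Made-up sentence, not taken from any transcript.
sample = "Yes, okay. I don't know, I did not see anything."
features = category_rates(sample)
print(features)
```

Rates like these would form one row of a feature matrix per transcript, which could then be passed to any standard classifier.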
Abstract: The inconvenience of operating EEG P300 or functional magnetic resonance imaging (fMRI) could be overcome if deceptive information can be effectively detected through speech signal analysis. In this paper, the fractional Mel cepstral coefficient (FrCC) is proposed as the speech feature for deception detection. Different fractional orders can reveal various personalities of the speakers. The linear discriminant analysis (LDA) model (which has the ability of global optimal vector mapping) is introduced, and the performance of FrCC and MFCC in deception detection is compared when all the data are mapped to a low-dimensional space. Then, the hidden Markov model (HMM) is introduced as a long-term signal analysis tool. Twenty-five male and 25 female participants are involved in the experiment. The results show that the clustering effect of the optimal fractional order FrCC is better than that of MFCC. The average accuracy for male and female speakers is 59.9% and 56.2%, respectively, using the FrCC under the LDA model. When MFCC is used, the accuracy is reduced by 3.2% and 5.9%, respectively, for male and female speakers. The accuracy can be increased to 71.0% and 70.2% for male and female speakers when HMM is used. Moreover, some individual accuracies are increased by over 20%, or even to more than 85%, when FrCC is introduced. The results show that deceptive information is indeed hidden in the speech signals. Therefore, speech-based psychophysiological computing may be a valuable research field.
Pub.: 28 Aug '15, Pinned: 31 Jul '17