As organizations and government agencies work to detect financial irregularities, malfeasance, fraud and criminal activity in intercepted communications, there is increasing interest in automated tools for deception detection. We build on Pennebaker's empirical model, which suggests that deception in text leaves a linguistic signature characterised by changes in the frequency of four categories of words: first-person pronouns, exclusive words, negative emotion words, and action words. By applying the model to the Enron email dataset and using an unsupervised matrix-decomposition technique, we explore the differential use of these cue words and categories in deception detection. Instead of focusing on the predictive power of individual cue words, we construct a descriptive model that helps us understand the multivariate profile of deception along several linguistic dimensions and highlights the qualitative differences between deceptive and truthful communication. This descriptive model can not only help detect unusual and deceptive communication, but may also rank messages along a scale of relative deceptiveness (for instance, from strategic negotiation and spin to deception and blatant lying). The model is unintrusive, requires minimal human intervention and, by following the defined pre-processing steps, may be applied to new datasets from different domains. / Thesis (Master, Computing) -- Queen's University, 2007-11-28 18:10:30.45
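The general idea in the abstract (count cue words in four categories, then apply an unsupervised matrix decomposition) can be illustrated with a minimal sketch. It is not the thesis's implementation. The word lists in `CUE_WORDS` are tiny hypothetical stand-ins, the choice of singular value decomposition is an assumption (the abstract says only "matrix-decomposition"), and ranking messages by their distance from the origin in the reduced space is likewise an assumed heuristic for "unusualness". The function names are invented for this example.

```python
import re
import numpy as np

# Hypothetical, deliberately tiny cue-word lists for the four categories
# named in the abstract; a real study would use much larger lexicons.
CUE_WORDS = {
    "first_person": {"i", "me", "my", "mine", "myself"},
    "exclusive": {"but", "except", "without", "however"},
    "negative_emotion": {"hate", "afraid", "worried", "sad"},
    "action": {"go", "went", "take", "took", "run"},
}
CATEGORIES = list(CUE_WORDS)


def category_frequencies(message):
    """Return the relative frequency of each cue-word category in a message."""
    tokens = re.findall(r"[a-z']+", message.lower())
    total = max(len(tokens), 1)  # avoid division by zero on empty messages
    return [sum(t in CUE_WORDS[c] for t in tokens) / total for c in CATEGORIES]


def rank_by_unusualness(messages, k=2):
    """Rank messages by distance from the origin in a k-dimensional SVD space.

    Assumption: messages far from the origin have atypical cue-word
    profiles and are candidates for closer inspection.
    """
    X = np.array([category_frequencies(m) for m in messages])
    X = X - X.mean(axis=0)  # centre each category column
    U, s, _ = np.linalg.svd(X, full_matrices=False)
    coords = U[:, :k] * s[:k]  # message coordinates on the top-k components
    scores = np.linalg.norm(coords, axis=1)
    order = np.argsort(-scores)  # most unusual first
    return order, scores
```

Calling `rank_by_unusualness(list_of_email_bodies)` returns message indices ordered from most to least atypical, together with the scores. Truncating to `k` components is what makes the decomposition do any work here: with all components kept, the scores would simply equal the row norms of the centred frequency matrix.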
Identifier | oai:union.ndltd.org:LACETR/oai:collectionscanada.gc.ca:OKQ.1974/922 |
Date | 29 November 2007 |
Creators | Gupta, Smita |
Contributors | Queen's University (Kingston, Ont.). Theses (Queen's University (Kingston, Ont.)) |
Source Sets | Library and Archives Canada ETDs Repository / Centre d'archives des thèses électroniques de Bibliothèque et Archives Canada |
Language | English |
Detected Language | English |
Type | Thesis |
Format | 12133825 bytes, application/pdf |
Rights | This publication is made available by the authority of the copyright owner solely for the purpose of private study and research and may not be copied or reproduced except as permitted by the copyright laws without written authority from the copyright owner. |
Relation | Canadian theses |